Patil S. Elements of Modern Physics 2021


Elements of Modern Physics

S. H. Patil
Department of Physics
Indian Institute of Technology Bombay
Mumbai, India

ISBN 978-3-030-70142-0 ISBN 978-3-030-70143-7 (eBook)


https://doi.org/10.1007/978-3-030-70143-7
Jointly published with ANE Books Pvt. Ltd.
In addition to this printed edition, there is a local printed edition of this work available via Ane Books in
South Asia (India, Pakistan, Sri Lanka, Bangladesh, Nepal and Bhutan) and Africa (all countries in the
African subcontinent).

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publishers, the authors, and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publishers nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publishers remain neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Dedicated
to
My Parents
Preface

This book has been thoroughly revised and updated as per the requirement of
the students. The book provides a perspective of the important concepts and
applications in contemporary physics.
With modern physics developing so rapidly, there is a constant need to
revise and update the presentation. The present book tries to do this. Starting
with a discussion of special theory of relativity and quantum theory, it
describes their applications to atoms, molecules, solids and nuclei. There are
two special chapters on the modern description of elementary particles and
on general theory of relativity and cosmology. The emphasis is on a logical
development of ideas, and historical aspects are referred to mainly as an aid
to this. An effort has been made to maintain rigour in analytical discussions and
precision in descriptions. It is hoped that the book will be useful to an advanced
undergraduate student, and as a review to a graduate student.
I am grateful to my colleagues, Dr. S.M. Bharati, Dr. S.M. Chitre,
Dr. P.P. Divakaran, Dr. Y.K. Gambhir, Dr. G.V. Dass, Dr. Dipan K. Ghosh, Dr.
K.S. Kulkarni, Dr. R.C. Mehrotra, Dr. C.H. Mehta, Dr. G. Mukhopadhyay, Dr.
R.S. Patil, Dr. G. Thyagarajan and Prof. Atul Mody, Dept. of Physics, VES,
College of Arts, Science and Commerce, Mumbai who ungrudgingly gave
me their valuable time in reading parts of the manuscript and made valuable
suggestions. I also thank Mr. Sunil Somalwar for going through a part of the
manuscript.
Mr. S.B. Modak not only provided accurate typing but also executed
the entire organization of the book with the help of Mr. D.S. Nakhawa,
Mr. Kashipathy and Mr. C.A. Sarmalkar. I owe them gratitude. I acknowledge
financial support from the curriculum development programme of IIT, Bombay.
S H Patil

Fundamental Constants

$c = 2.997925 \times 10^{8}$ m/s
$h = 6.6256 \times 10^{-34}$ J s
$m_e = 9.109 \times 10^{-31}$ kg $= 0.511$ MeV/$c^2$
$e = 1.60206 \times 10^{-19}$ C
$k = 1.38044 \times 10^{-23}$ J/K
$m_p = 938.211$ MeV/$c^2$
$m_n = 939.505$ MeV/$c^2$
$\varepsilon_0 = 8.85434 \times 10^{-12}$ F/m or C$^2$/(N m$^2$)
$\mu_0 = 4\pi \times 10^{-7}$ H/m or N/A$^2$
$N = 6.022 \times 10^{26}$ /kmol, the number of atoms in 12 kg of $^{12}$C

Contents

1. Special Theory of Relativity
1.1 Inertial Frames of Reference
1.2 Galilean Transformations
1.3 Velocity of Light
1.4 Postulates of Special Relativity
1.5 Lorentz Transformations
1.6 Simultaneity and Time Dilation
1.7 Length Contraction
1.8 Transformation of Velocities
1.9 Lorentz Four-Vectors
1.10 Energy-Momentum Four-Vector and Relativistic Dynamics
1.11 Electromagnetic Interaction
1.12 Zero-Mass Particles and Doppler Shift
1.13 Examples
Problems

2. Introduction to Quantum Ideas
2.1 Black-Body Radiation
2.2 Photoelectric Effect
2.3 Compton Effect
2.4 Wave Nature of Particles
2.5 Atomic Spectra
2.6 Nuclear Model of the Atom
2.7 Bohr Model
2.8 Examples
Problems

3. Elements of Quantum Theory
3.1 A Thought Experiment
3.2 The Wave Function
3.3 Postulates of Quantum Mechanics
3.4 Some Properties of Observables and Wave Functions
3.5 Free Particle
3.6 Wave Packet and the Uncertainty Principle
3.7 Step Potential
3.8 Particle in a Box
3.9 Simple Harmonic Oscillator
3.10 Small Perturbations
3.11 Angular Momentum
3.12 Examples
Problems

4. The One-Electron Atom
4.1 Solutions of the Schrödinger Equation
4.2 Electron Spin
4.3 Total Angular Momentum
4.4 Fine Structure of One-Electron Atomic Spectra
4.5 Hyperfine Structure
4.6 Examples of One-Electron Atoms
4.7 Schrödinger Equation for Spin 1/2 Particles
4.8 Dirac Equation
4.9 Examples
Problems

5. Atoms and Molecules
5.1 Exchange Symmetry of Wave Functions
5.2 Shells and Subshells in Atoms
5.3 Periodic Table
5.4 Atomic Spectra
5.5 X-ray Spectra
5.6 Molecular Bonding
5.7 Molecular Spectra
5.8 Examples
Problems

6. Interaction with External Fields
6.1 The Hamiltonian
6.2 Atoms in a Magnetic Field
6.3 Interaction with Radiation
6.4 Spontaneous Transitions
6.5 Lasers and Masers
6.6 Applications of Lasers
6.7 Some Experimental Methods
6.8 Examples
Problems

7. Quantum Statistics
7.1 Distinguishable Arrangements
7.2 Statistical Distributions
7.3 Applications of Maxwell-Boltzmann Distribution
7.4 Applications of Bose-Einstein Distribution
7.5 Applications of Fermi-Dirac Distribution
7.6 Superconductivity
7.7 Examples
Problems

8. Solid State Physics
8.1 Binding Forces in Solids
8.2 Crystal Structures
8.3 Band Theory of Solids
8.4 Semiconductors
8.5 Semiconductor Devices
8.6 Magnetic Properties
8.7 Dielectric Properties
8.8 Examples
Problems

9. The Nucleus
9.1 Properties of the Nucleus
9.2 Nuclear Forces
9.3 Models of the Nucleus
9.4 Weizsacker’s Mass Formula
9.5 Nuclear Stability
9.6 Nuclear Reactions
9.7 Fission Reactors
9.8 Thermonuclear Fusion
9.9 Examples
Problems

10. Elementary Particles
10.1 Elementary Particles
10.2 Strong Interaction
10.3 Electromagnetic Interaction
10.4 Weak Interaction
10.5 Unified Approach
10.6 Production and Detection of Particles
10.7 Examples
Problems

11. General Relativity and Cosmology
11.1 Frames of Reference
11.2 Curved Space-Time
11.3 Schwarzschild Metric
11.4 Kinematics of the Universe
11.5 Dynamics of the Universe
11.6 The Early Universe
11.7 Examples
Problems

Reference
Answers to Problems
Index

1
Special Theory of Relativity

Structure of the Chapter


1.1 Inertial frames of reference
1.2 Galilean transformations
1.3 Velocity of light
1.4 Postulates of special relativity
1.5 Lorentz transformations
1.6 Simultaneity and time dilation
1.7 Length contraction
1.8 Transformation of velocities
1.9 Lorentz four-vectors
1.10 Energy-momentum four-vector and relativistic dynamics
1.11 Electromagnetic interaction
1.12 Zero-mass particles and Doppler shift
1.13 Examples
Problems

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_1

We begin our discussion of modern physics with the theory of relativity which
aims at relating the observations made by observers in relative motion with
respect to each other. Here only the restrictive case of the special theory of
relativity is analysed, in which the observers are moving with constant velocity
with respect to each other. This will help in choosing appropriate frames of
reference and in presenting the later topics in a unified manner. After a brief
consideration of the drawbacks of the classical theory, the main results of the
special theory of relativity are obtained, and applied to describe some specific
physical situations.

1.1 INERTIAL FRAMES OF REFERENCE


Most physical observations describe the behaviour of certain objects in space
as a function of time. Since the position of a body can be stated only relative to
some other bodies, the description of these observations requires a frame of
reference which is a technical term for the combination of a set of spatial
coordinate axes and a time variable.
It was realised by Galileo and others, that the form of the laws of nature
depends on the choice of the frame of reference. Among all the possible frames
of reference, there exists a class called the inertial frames of reference, in
which these laws take a simple form. Inertial frames of reference are those in
which a body that is not acted upon by external forces, moves with constant
velocity. It is implicit here that if two reference frames move with constant
velocity with respect to each other, and one of them is inertial, the other also is
an inertial frame. It was found that the laws of mechanics take on the same
form in all inertial frames of reference.

1.2 GALILEAN TRANSFORMATIONS


Consider two inertial frames of reference F and F′, such that their coordinate
axes coincide at t = 0, and F′ moves with velocity v along the x-axis with
respect to F. Then, it may be expected that the coordinates in the two frames
are related by the equations
t′ = t
x′ = x – vt (1.1)
y′ = y
z′ = z
called Galilean transformations. In writing these relations, it is assumed that
(i) it is possible to define a time t which is the same for all inertial frames of
reference, and (ii) the distance between two points is independent of the frames
of reference.
For Galilean transformations, it is easy to show that the velocities and
accelerations in the two frames are related by
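$$ \mathbf{u}' = \mathbf{u} - \mathbf{v} \tag{1.2} $$
where v is along the x-direction, and
$$ \mathbf{a}' = \mathbf{a} \tag{1.3} $$
respectively. Then, if the interaction potential V is a function of only the distances
between particles, Newton’s equations in the two frames are:
$$ m_i \mathbf{a}_i = -\nabla_i V, \qquad m_i \mathbf{a}_i' = -\nabla_i' V \tag{1.4} $$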
where the subscript i is the particle index. These equations are related by the
transformations (1.1) and are of the same form. However, it was observed that
the Galilean transformations are not consistent with the dynamical theory of
electromagnetic fields as formulated by Maxwell (1865).
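
The Galilean relations (1.1) and (1.2) can be illustrated with a minimal Python sketch. The function names and the sample numbers are assumptions chosen for illustration and do not come from the text. The last line applies Eq. (1.2) to light itself, which gives the frame-dependent value c – v that conflicts with Maxwell's theory.

```python
C = 2.997925e8  # speed of light in m/s, from the table of constants


def galilean_transform(t, x, y, z, v):
    """Coordinates in F' of an event (t, x, y, z) in F, Eq. (1.1)."""
    return (t, x - v * t, y, z)


def galilean_velocity(ux, uy, uz, v):
    """Velocity in F' of a particle with velocity (ux, uy, uz) in F, Eq. (1.2)."""
    return (ux - v, uy, uz)


if __name__ == "__main__":
    v = 3.0e4  # illustrative frame speed in m/s (assumed value)
    print(galilean_transform(2.0, 10.0, 0.0, 0.0, v))
    # Galilean prediction for light moving along x: c - v, not c
    print(galilean_velocity(C, 0.0, 0.0, v)[0])
```

Under these transformations the time coordinate is untouched and only the component of velocity along the relative motion changes.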

1.3 VELOCITY OF LIGHT


It follows from Maxwell’s equations for electromagnetic fields that
electromagnetic waves travel in vacuum with a speed equal to the ratio of the
electromagnetic unit to the electrostatic unit of charge. This ratio is essentially
equal to the speed of light so that light itself is taken as a form of electromagnetic
radiation.
Now, how does the velocity of light transform from one inertial frame to
another? According to Galilean transformations, the velocities are different in
different frames and are related by Eq. (1.2). However, Maxwell’s equations
have no reference to the velocity of the inertial frame and hence imply that the
speed of light is independent of the velocity of the inertial frame. Observationally
also, the Michelson-Morley experiment (1887) analysed below suggests that
the speed of light is independent of the velocity of the inertial frame.
Suppose, the earth is moving with velocity v in the x-direction with respect
to the ‘standard’ frame in which the velocity of light is c in all directions. Then
according to Eq. (1.2), the velocity of light with respect to an observer on Earth
is c–v. The time taken for light to travel along the limb AB of the interferometer
(Fig. 1.1), from A to B and back is
$$ t_1 = \frac{l_1}{c - v} + \frac{l_1}{c + v} \tag{1.5} $$
While travelling from A to C and back, the velocity c – v is parallel to AC and
hence perpendicular to v. Therefore
$$ \mathbf{c} \cdot \mathbf{v} = v^2 \tag{1.6} $$
and the magnitude of c – v is $(c^2 - v^2)^{1/2}$, so that the time taken for light to travel
from A to C and back, is given by
$$ t_2 = \frac{2 l_2}{(c^2 - v^2)^{1/2}} \tag{1.7} $$
Thus the difference in the two times is
$$ \Delta = t_1 - t_2 = \frac{2 l_1 c}{c^2 - v^2} - \frac{2 l_2}{(c^2 - v^2)^{1/2}} \tag{1.8} $$

Fig. 1.1 Schematic diagram of the Michelson-Morley experiment.

If the apparatus is turned through 90°, the roles of $l_1$ and $l_2$ are interchanged,
and the difference in the times becomes
$$ \Delta' = t_1' - t_2' = \frac{2 l_1}{(c^2 - v^2)^{1/2}} - \frac{2 l_2 c}{c^2 - v^2} \tag{1.9} $$
The expected shift in the interference fringe at D, is
$$ \delta = \frac{c(\Delta' - \Delta)}{\lambda} = \frac{2(l_1 + l_2)}{\lambda} \left[ \frac{1}{(1 - v^2/c^2)^{1/2}} - \frac{1}{1 - v^2/c^2} \right] \approx -\frac{l_1 + l_2}{\lambda} \, \frac{v^2}{c^2} \quad \text{for } v \ll c \tag{1.10} $$
In the experiment of Michelson and Morley, $l_1 + l_2$ was 22 m, and
$\lambda = 5.9 \times 10^{-7}$ m. The value of v is at least of the order v ≈ 30 km/s corresponding
to the velocity of the earth’s motion around the sun, even if the motion of the
solar system around the galactic centre is ignored. For these values
$$ |\delta| \approx 0.37 \tag{1.11} $$
No such shift was observed in the experiment.
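
The size of the expected shift can be checked numerically. The sketch below evaluates both the exact bracket and the small-v approximation of Eq. (1.10) with the values quoted in the text; the function names are illustrative assumptions.

```python
import math

C = 2.997925e8  # speed of light in m/s, from the table of constants


def fringe_shift_exact(l_total, wavelength, v, c=C):
    """Fringe shift from the exact bracket in Eq. (1.10); l_total = l1 + l2."""
    b2 = (v / c) ** 2
    return (2.0 * l_total / wavelength) * (
        1.0 / math.sqrt(1.0 - b2) - 1.0 / (1.0 - b2)
    )


def fringe_shift_approx(l_total, wavelength, v, c=C):
    """Leading-order form of Eq. (1.10), valid for v << c."""
    return -(l_total / wavelength) * (v / c) ** 2


if __name__ == "__main__":
    # Values quoted in the text: l1 + l2 = 22 m, lambda = 5.9e-7 m, v = 30 km/s
    print(fringe_shift_exact(22.0, 5.9e-7, 3.0e4))
    print(fringe_shift_approx(22.0, 5.9e-7, 3.0e4))
```

Both forms give a shift of magnitude close to 0.37 of a fringe, which shows that the second-order approximation is entirely adequate at the earth's orbital speed.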


The above result is based on Eq. (1.2) for the transformation of the velocity
of light, which was derived from the Galilean transformations (1.1). An attempt
was made to salvage the Galilean transformations by postulating that a hypothetical
medium called ether, responsible for the propagation of light, is dragged along
by the earth as it moves in space. Then the speed of light with respect to an
observer on the earth would remain unaffected by the motion of the earth,
analogous to the speed of sound in which case the air is dragged along. This
would explain the null result of the Michelson-Morley experiment, but this is in
conflict with the observed aberration of starlight received on the earth. It is
found that in order to observe a star, the telescope should be tilted in the direction
of the earth’s velocity (Fig. 1.2). This tilt would not be needed if the
light-propagating ether was dragged along by the earth. Actually, what is observed
is not the absolute tilt but the variation of the tilt as the earth changes its velocity
along its orbit around the sun. The ether-based explanation became even less
tenable after Lodge (1892) showed that the velocity of light is unaffected in the
vicinity of rapidly rotating bodies.

Fig. 1.2 In order that starlight passes along a telescope moving with velocity v,
the telescope should be tilted at an angle of α = v/c.
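
To get a feeling for the size of the tilt α = v/c, the short sketch below converts it to seconds of arc for the earth's orbital speed of about 30 km/s used earlier in this section. The function name is an illustrative assumption.

```python
import math

C = 2.997925e8  # speed of light in m/s, from the table of constants


def aberration_angle_arcsec(v, c=C):
    """Tilt angle alpha = v/c (in radians) converted to seconds of arc."""
    return math.degrees(v / c) * 3600.0


if __name__ == "__main__":
    # Earth's orbital speed, about 30 km/s
    print(aberration_angle_arcsec(3.0e4))
```

The result is of the order of twenty seconds of arc, small but well within the reach of astronomical measurement.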

1.4 POSTULATES OF SPECIAL RELATIVITY


Einstein (1905) proposed a radically different but, in retrospect, a simple approach
to the problem posed by the Michelson-Morley experiment. He started with the
principle of relativity but also postulated that the speed of light is the same in all
inertial frames of reference, thus giving it the status of a physical law. This
immediately explains the Michelson-Morley result but requires that the Galilean
transformations (1.1) be discarded. He found that space and time are related in
an intimate manner and should be treated on an equal basis. Their relation has a
far-reaching influence on the laws of physics. We begin the discussion of
Einstein’s results with a formal statement of the postulates of the special theory
of relativity.
1. The laws of nature are of the same form in all inertial frames of reference.
2. The speed of light is the same in all inertial frames of reference, and is
independent of the motion of the source.
It is implicit in the first postulate that, since the coordinates of the different
inertial frames are related, the laws of nature written in the various inertial
frames can be deduced from one another. It also follows that the Galilean
transformations (1.1) relating the coordinates of the inertial frames, cannot be
right since they would imply that the speed of light is different in different inertial
frames, in contradiction to the second postulate. Hence, a more general relation
between the coordinates must be obtained, which incorporates the information
that the speed of light is the same in all inertial frames.

1.5 LORENTZ TRANSFORMATIONS


In deriving the transformation equations consistent with the postulates of the
special theory of relativity, it is assumed that space and time are homogeneous, i.e., that
all points in space and time are equivalent. This means that the separation between
space-time points should remain invariant under translations which implies that
the relations between the coordinates of different inertial frames should be linear.
Let us consider again the inertial frames F and F ′ mentioned in Sec. 1.2,
whose axes coincide at time t = t′ = 0, and F ′ moves with velocity v along the
x-axis with respect to F. It is assumed that the y-axis is perpendicular to the
x′-axis since otherwise the inclinations of the positive and negative y-axis with
respect to the x-axis would be different, violating the rotational (or alternatively,
left-right) symmetry about the direction of relative velocity. It is also assumed
that the y- and z-axes are orthogonal to each other in either of the frames of
reference. Finally, since the lengths of two rods, which are at rest in frames
F and F′ respectively, and which are perpendicular to the x-axis, can be compared
while they are passing each other,
y′ = y
z′ = z (1.12)
in order that the relations between F and F′ be reciprocal. For the transformation
of the x-coordinate, it is noted that the origin of F ′ travels with velocity v with
respect to frame F, which implies that
x′ = α(x – vt) (1.13)
Taking into account the possibility that time may not be a universal variable,
t′ = γ (t – βx) (1.14)
is assumed for the transformation of the time coordinate.
Let an electromagnetic signal be emitted at t = 0, from the origin of F, which
also coincides with the origin of F′ at that time. Since the speed of light is the
same in all inertial frames, the wavefront is described in the two frames by the
equations
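$$ x^2 + y^2 + z^2 = c^2 t^2 \tag{1.15} $$
and
$$ x'^2 + y'^2 + z'^2 = c^2 t'^2 \tag{1.16} $$
respectively. Substituting Eqs. (1.12)-(1.14) in Eq. (1.16) gives
$$ (\alpha^2 - c^2\gamma^2\beta^2)\, x^2 + y^2 + z^2 = (c^2\gamma^2 - v^2\alpha^2)\, t^2 + 2\,(v\alpha^2 - c^2\beta\gamma^2)\, xt \tag{1.17} $$
This relation is consistent with Eq. (1.15) provided
$$ \begin{aligned} \alpha^2 - c^2\gamma^2\beta^2 &= 1 \\ \gamma^2 - \frac{v^2}{c^2}\,\alpha^2 &= 1 \\ v\alpha^2 - c^2\beta\gamma^2 &= 0 \end{aligned} \tag{1.18} $$
These equations are solved by first eliminating $\alpha^2$ by using the third
equation, and then eliminating γ. The final solutions are:
$$ \beta = \frac{v}{c^2}, \qquad \alpha = \gamma = \frac{1}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.19} $$
which lead to the transformations
$$ \begin{aligned} x' &= \frac{x - vt}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \\ y' &= y \\ z' &= z \\ t' &= \frac{t - \dfrac{v}{c^2}\,x}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \end{aligned} \tag{1.20} $$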
These are the celebrated Lorentz equations. In obtaining them, the
positive root has been chosen, so that v = 0 implies x′ = x, t′ = t. It is also noted
that for small v/c and x << ct, the Galilean transformations x′ = x – vt and
t′ = t are recovered. The transformations in Eq. (1.20) can be inverted to express
x, y, z and t in terms of x′, y′, z′ and t′:
$$ x = \frac{x' + vt'}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}; \quad y = y'; \quad z = z'; \quad t = \frac{t' + \dfrac{v}{c^2}\,x'}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.21} $$
From these equations, it is seen that frame F moves with velocity –v with
respect to F′ so that the relative velocities of the frames are reciprocal.
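
The transformations (1.20) and their inverse (1.21) can be written compactly in code. The following is a minimal sketch, not a definitive implementation; the function names and the argument order (t, x, y, z) are assumptions made here. The inverse is obtained simply by reversing the sign of v, which is the reciprocity just described.

```python
import math

C = 2.997925e8  # speed of light in m/s, from the table of constants


def lorentz(t, x, y, z, v, c=C):
    """Coordinates in F' of an event (t, x, y, z) in F, Eq. (1.20)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (g * (t - v * x / c**2), g * (x - v * t), y, z)


def lorentz_inverse(tp, xp, yp, zp, v, c=C):
    """Coordinates in F of an event given in F', Eq. (1.21)."""
    return lorentz(tp, xp, yp, zp, -v, c)


if __name__ == "__main__":
    # Units with c = 1 are used here purely for readability.
    event = (2.0, 3.0, 1.0, -1.0)
    primed = lorentz(*event, 0.6, c=1.0)
    print(primed)
    print(lorentz_inverse(*primed, 0.6, c=1.0))  # recovers the original event
    # A point on the light front x = ct stays on the light front x' = ct'
    print(lorentz(1.0, 1.0, 0.0, 0.0, 0.6, c=1.0))
```

The last line checks the defining property used in the derivation: an event with x = ct in F also satisfies x′ = ct′ in F′.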
It should be noted that the deviations of the Lorentz transformations from
the Galilean transformations are second order in v/c or x/(ct), and hence the
experiments which can test Lorentz transformations must be accurate enough
to detect these second order terms. The Michelson-Morley experiment did have
such an accuracy and could prove the inadequacy of Galilean transformations.
Lorentz transformations, though they differ only slightly from Galilean
transformations in most physical situations, bring in a profoundly new concept in
the kinematics of the universe. They remove the universal character of time
and treat it on the same footing as space coordinates. They require that physical
space be treated as a 4-dimensional space of space and time coordinates. As
might be expected, this mixing of space and time coordinates leads to some
unfamiliar consequences. A few of them are discussed here.

1.6 SIMULTANEITY AND TIME DILATION


It follows from Lorentz relations (1.20) that events which are simultaneous in
frame F but take place at different positions are not simultaneous in frame F′.
For example, if two events take place in frame F at
$$ t_1 = t_2 = 0 \tag{1.22} $$
but at positions $x_1 = 0$ and $x_2 = l$ respectively, the corresponding events in frame
F′ take place at
$$ t_1' = 0, \qquad t_2' = -\frac{vl}{c^2 \left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.23} $$
i.e., the event at $x_2 = l$ took place earlier in frame F′. In order to understand this
result a little better, let the two events correspond to emission of light signals
which travel towards the point x = l/2, which they will reach at t = l/(2c).
However, as observed from frame F′, by the time these signals reach the
midpoint, the midpoint would have travelled some distance towards the position
at which the signal from x = 0 was emitted and away from the position at which
the signal from x = l was emitted. Therefore, the signal from x = l travels
through a longer distance. Because the two signals reach the point x = l/2 at the
same time, according to an observer in frame F′, the event at x = l must have
taken place at an earlier time and the events at x = 0 and x = l are not simultaneous.
He will also observe that since the local clocks at x = 0 and x = l, stationary in
frame F, show time t = 0 when the events took place, the F clock at x = l is
ahead of the F clock at x = 0.
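
A short numerical sketch of the loss of simultaneity follows. It evaluates the time coordinate of Eq. (1.20) for two events that are simultaneous in F; the function name and the choice of units with c = 1 are assumptions made for illustration.

```python
import math


def time_in_moving_frame(t, x, v, c):
    """Time coordinate t' in F' of an event (t, x) in F, from Eq. (1.20)."""
    return (t - v * x / c**2) / math.sqrt(1.0 - (v / c) ** 2)


if __name__ == "__main__":
    c, v, l = 1.0, 0.6, 1.0  # illustrative values in units with c = 1
    t1_prime = time_in_moving_frame(0.0, 0.0, v, c)  # event at x = 0
    t2_prime = time_in_moving_frame(0.0, l, v, c)    # event at x = l
    print(t1_prime, t2_prime)  # t2' is negative: the event at x = l is earlier in F'
```

The second value reproduces Eq. (1.23): it is negative and grows in magnitude with both the separation l and the relative speed v.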
With simultaneity being no longer a universal concept, it is necessary to set
up a system of synchronised clocks to measure the time coordinates of events
taking place at different positions. Consider now two clocks at rest in frame F,
located at a distance l apart. Let a clock at rest in the moving frame F′ record
time t0′ and t1′ when it passes the clocks of frame F. The corresponding times
recorded by the clocks in frame F may be designated by t0 and t1 . Since these
observations correspond to observations taking place at the same place in frame
F′, one obtains from Eq. (1.21)
$$ t_1 - t_0 = \frac{t_1' - t_0'}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.24} $$
Thus, the moving clock appears to run at a slower rate. However, as
discussed before, the observer in frame F′ will find that the clocks in frame F
are not synchronised and the clock he passes later is ahead of the clock he
passed earlier by an amount δ. After taking this into account, the reciprocal
nature of the two frames then requires that
$$ t_1' - t_0' = \frac{(t_1 - \delta) - t_0}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.25} $$
Also, since the relative velocity of the frames is v, $t_1 - t_0 = l/v$, which leads to
$$ t_1' - t_0' = \frac{l}{v}\left(1 - \frac{v^2}{c^2}\right)^{1/2}, \qquad t_1' - t_0' = \frac{\dfrac{l}{v} - \delta}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.26} $$
Therefore, the lag δ is given by
$$ \delta = \frac{lv}{c^2} \tag{1.27} $$
The time dilation of moving clocks can be made more physical by considering
a clock which consists of a beam of light bouncing back and forth between two
mirrors kept at a distance l apart along the y′-direction (Fig. 1.3). In frame F′,
each round trip takes a time

$$ \Delta t' = \frac{2l}{c} \tag{1.28} $$
Viewed from frame F, the beam travels a longer distance along a line
making an angle θ with the y-axis, given by sin θ = v/c. Therefore the
corresponding time observed for each trip is
$$ \Delta t = \frac{2l}{c \left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.29} $$
Combining Eqs. (1.28) and (1.29), one has
$$ \Delta t = \frac{\Delta t'}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.30} $$
which tells us that ∆t′ ≤ ∆t . Since ∆t′ is the time indicated by the clock at rest
in F′, this implies that the moving clocks run at a slower rate. This phenomenon
is observed for unstable particles which are found to live for a longer time when
they are moving (see Example 2, Sec. 1.13).
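
The light-clock argument can be reproduced numerically. In the sketch below, the period seen from F follows from the right-angled triangle traced by the beam on each leg, $(c\,\Delta t/2)^2 = l^2 + (v\,\Delta t/2)^2$. The function names and the sample values (units with c = 1) are illustrative assumptions.

```python
import math


def light_clock_period_rest(l, c):
    """Round-trip time in the clock's own frame F', Eq. (1.28)."""
    return 2.0 * l / c


def light_clock_period_moving(l, v, c):
    """Round-trip time seen from F, Eq. (1.29).

    Each leg satisfies (c*dt/2)**2 = l**2 + (v*dt/2)**2, solved here for dt.
    """
    return 2.0 * l / math.sqrt(c**2 - v**2)


if __name__ == "__main__":
    l, v, c = 1.0, 0.6, 1.0  # illustrative values in units with c = 1
    dt_rest = light_clock_period_rest(l, c)
    dt_moving = light_clock_period_moving(l, v, c)
    print(dt_rest, dt_moving, dt_moving / dt_rest)  # ratio is the factor in Eq. (1.30)
```

The ratio of the two periods depends only on v/c and not on the mirror separation l, as Eq. (1.30) requires.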

Fig. 1.3 A clock at rest in F′ viewed from F (panels: Frame F′, Frame F).

1.7 LENGTH CONTRACTION


Lorentz transformations lead to unfamiliar results for the measurements of lengths
along the direction of motion.
Consider a rod of length l0, at rest in frame F′ , parallel to the x′ -axis. For
measuring the length of this rod in frame F, the positions of the ends of the rod
are observed at the same time, i.e., t1 = t2. Then from Eq. (1.20),
$$ x_2' - x_1' = \frac{x_2 - x_1}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.31} $$
However, since the rod is at rest in frame F′, $x_2' - x_1' = l_0$ irrespective of
the times of measurements. Therefore the length l of the rod observed from
frame F is given by
$$ l = l_0 \left(1 - \frac{v^2}{c^2}\right)^{1/2} \tag{1.32} $$
The decrease in the observed length of a rod moving in a direction parallel
to itself is called the Fitzgerald-Lorentz contraction.
It should again be emphasized that the above observation is reciprocal and
an observer in frame F′ will find that a rod at rest in frame F, parallel to the
x-axis, is shortened by the same factor. Indeed the observer in frame F′ can
even explain the observation (1.32) from frame F as follows. The observer in
frame F′ will argue that the scale used by the observer in frame F is shorter by
a factor of $(1 - v^2/c^2)^{1/2}$, and therefore the length observed by F, when
expressed in the units of frame F′, should be $l_0 (1 - v^2/c^2)$. He will also
argue from Eq. (1.21), that $t_2 - t_1 = 0$ corresponds to
$$ t_2' - t_1' = -\frac{v}{c^2}\,(x_2' - x_1') = -\frac{v}{c^2}\, l_0 \tag{1.33} $$
Therefore, the leading end was measured at a time $v l_0/c^2$ earlier than the
trailing end, giving rise to an additional contribution to the measurement of the
length of an amount $v^2 l_0/c^2$. Including both of these corrections, the length of the
rod should be
$$ l' = l_0 \left(1 - \frac{v^2}{c^2}\right) + \frac{l_0 v^2}{c^2} = l_0 \tag{1.34} $$
in agreement with the measurement from frame F′!
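
The two steps of this argument can be followed numerically. The sketch below computes the contracted length of Eq. (1.32) and then adds the two contributions that the observer in F′ identifies in Eq. (1.34). The function names and sample values are illustrative assumptions.

```python
import math


def contracted_length(l0, v, c):
    """Length measured in F of a rod of rest length l0 moving at speed v, Eq. (1.32)."""
    return l0 * math.sqrt(1.0 - (v / c) ** 2)


def length_reconciled_in_rod_frame(l0, v, c):
    """Bookkeeping of Eq. (1.34) as carried out by the observer in F'."""
    beta2 = (v / c) ** 2
    scale_term = l0 * (1.0 - beta2)  # F's reading converted with F's shortened scale
    timing_term = l0 * beta2         # F's scale moves v * (v*l0/c**2) between the two readings
    return scale_term + timing_term


if __name__ == "__main__":
    l0, v, c = 2.0, 0.6, 1.0  # illustrative values in units with c = 1
    print(contracted_length(l0, v, c))
    print(length_reconciled_in_rod_frame(l0, v, c))  # recovers the rest length l0
```

The second function returns the rest length for every v, which is the content of Eq. (1.34): the two observers disagree about individual steps but not about the final account.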

1.8 TRANSFORMATION OF VELOCITIES


In the preceding sections, the implications of the Lorentz transformations for
the measurement of position and time intervals in different frames were
considered. Here, the relation between velocities of a particle measured in
different inertial frames is obtained.

Let $\mathbf{u} = d\mathbf{r}/dt$ be the velocity of a particle in frame F and $\mathbf{u}' = d\mathbf{r}'/dt'$ be the
corresponding velocity in frame F′ which moves with velocity v. Lorentz
transformations (1.20) imply that
$$ dx' = \frac{dx - v\,dt}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}, \qquad dy' = dy, \qquad dz' = dz \tag{1.35} $$
and
$$ dt' = \frac{dt - \dfrac{v}{c^2}\,dx}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \tag{1.36} $$
Dividing the position intervals by the time interval,
$$ \begin{aligned} u_x' &= \frac{u_x - v}{1 - \dfrac{v u_x}{c^2}} \\ u_y' &= \frac{u_y \left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}{1 - \dfrac{v u_x}{c^2}} \\ u_z' &= \frac{u_z \left(1 - \dfrac{v^2}{c^2}\right)^{1/2}}{1 - \dfrac{v u_x}{c^2}} \end{aligned} \tag{1.37} $$
where the subscript describes the components of the velocities. These
expressions relate the velocities of a particle measured in different inertial frames.

As an application of the above formulae, consider the relation between u and
u′. It follows from Eq. (1.37) that
$$ u'^2 - c^2 = \frac{(c^2 - v^2)(u^2 - c^2)}{c^2 \left(1 - \dfrac{v u_x}{c^2}\right)^2} \tag{1.38} $$
For v2 < c2, which is required for the Lorentz transformations to be physically
meaningful, the following important results are obtained:
u′ < c if u < c (1.39)
u′ = c if u = c (1.40)
u′ > c if u > c (1.41)
The first result implies that the relativistic addition of velocities with the
speed of each being less than c will again give a velocity with speed less than c.
The statement in Eq. (1.40) is the reappearance of the assumption that the
speed of light is the same in all inertial frames of reference. Eq. (1.41) has been
considered for particles with speeds greater than the speed of light, known as
tachyons. It is interesting to note that for tachyons, the condition $u_x v = c^2$ is
possible in which case u′ tends to infinity. Observation of tachyons would be of
great interest since it would imply that information can be transmitted at a speed
greater than the speed of light. However, so far, tachyons have not been observed
experimentally.
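
The velocity transformation (1.37) and the results (1.39) and (1.40) can be verified with a few lines of code. This is a minimal sketch; the function names and the sample velocities (in units with c = 1) are assumptions made for illustration.

```python
import math


def transform_velocity(u, v, c):
    """Velocity components in F' of a particle with velocity u in F, Eq. (1.37)."""
    ux, uy, uz = u
    denom = 1.0 - v * ux / c**2
    root = math.sqrt(1.0 - (v / c) ** 2)
    return ((ux - v) / denom, uy * root / denom, uz * root / denom)


def speed(u):
    """Magnitude of a velocity vector."""
    return math.sqrt(sum(comp * comp for comp in u))


if __name__ == "__main__":
    c = 1.0
    # Two speeds of 0.5c combine to less than c (frame moving at -0.5c)
    print(transform_velocity((0.5, 0.0, 0.0), -0.5, c))
    # Light moving along y in F still has speed c in F'
    print(speed(transform_velocity((0.0, 1.0, 0.0), 0.6, c)))
```

In the first case the Galilean rule would give a speed equal to c, whereas Eq. (1.37) gives a smaller value, in line with Eq. (1.39).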

1.9 LORENTZ FOUR-VECTORS


The postulates of the special theory of relativity require that the physical laws
retain the same form under Lorentz transformations. This requirement is satisfied
if the laws are stated as equalities between terms which have similar
transformation properties. For ordinary three-dimensional transformations, these
terms are three-dimensional vectors or their generalizations. However, the
Lorentz transformations mix the space and time coordinates so that we need to
generalize our ideas to four-dimensional vectors.
A relativistic four-vector may be defined to be a set of four quantities
$A_\mu$ (µ = 1, 2, 3, 0) ≡ $(A_x, A_y, A_z, A_0)$ which transform like the space-time
coordinates $x_\mu = (x, y, z, ct)$ under Lorentz transformations (the factor c is
included to give the same dimension to all the components), i.e.,
$$ \begin{aligned} A_x' &= \frac{1}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \left( A_x - \frac{v}{c}\, A_0 \right) \\ A_y' &= A_y \\ A_z' &= A_z \\ A_0' &= \frac{1}{\left(1 - \dfrac{v^2}{c^2}\right)^{1/2}} \left( A_0 - \frac{v}{c}\, A_x \right) \end{aligned} \tag{1.42} $$
The scalar product between two Lorentz vectors, $A_\mu = (\mathbf{A}, A_0)$ and
$B_\mu = (\mathbf{B}, B_0)$, can be defined as
$$ A \cdot B \equiv A_0 B_0 - \mathbf{A} \cdot \mathbf{B} \tag{1.43} $$
which can easily be shown to be equal to A′·B′, and hence is called a Lorentz
scalar. In particular, the ‘length’ of a vector is defined as $(A \cdot A)^{1/2}$, which has the
same value in all inertial frames:
$$ (A \cdot A)^{1/2} = (A_0^2 - \mathbf{A} \cdot \mathbf{A})^{1/2} \tag{1.44} $$
Here, unlike the three-dimensional case, the length of the vector may be
real or imaginary depending on whether $(A_0^2 - \mathbf{A} \cdot \mathbf{A}) \ge 0$ or $(A_0^2 - \mathbf{A} \cdot \mathbf{A}) < 0$,
respectively.
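
The invariance of the scalar product (1.43) can be checked numerically by applying the transformation (1.42) to two arbitrary four-vectors. In the sketch below the component order (x, y, z, 0) follows the text; the function names and the sample components are illustrative assumptions.

```python
import math


def boost(a, v, c):
    """Apply Eq. (1.42) to a four-vector a = (ax, ay, az, a0)."""
    ax, ay, az, a0 = a
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (g * (ax - (v / c) * a0), ay, az, g * (a0 - (v / c) * ax))


def scalar_product(a, b):
    """Lorentz scalar product A0*B0 - A.B of Eq. (1.43)."""
    return a[3] * b[3] - (a[0] * b[0] + a[1] * b[1] + a[2] * b[2])


if __name__ == "__main__":
    a = (1.0, 2.0, 3.0, 4.0)   # arbitrary sample components
    b = (0.5, -1.0, 2.0, 3.0)
    print(scalar_product(a, b))
    print(scalar_product(boost(a, 0.6, 1.0), boost(b, 0.6, 1.0)))  # same value
```

The individual components of both vectors change under the boost, while the combination in Eq. (1.43) does not.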
Two examples of four-vectors which are of particular importance are now given. Consider two events characterized by the vectors $x_\mu = (\mathbf{r}, ct)$ and $x_\mu^* = (\mathbf{r}^*, ct^*)$. The separation between these two events,
$$x_\mu^* - x_\mu = (\mathbf{r}^* - \mathbf{r},\; c(t^* - t)) \tag{1.45}$$
is a 4-vector, and the interval between the two events is defined to be $\Delta\tau$ where
$$(\Delta\tau)^2 = (t^* - t)^2 - \frac{1}{c^2}(\mathbf{r}^* - \mathbf{r}) \cdot (\mathbf{r}^* - \mathbf{r}) \tag{1.46}$$
This interval $\Delta\tau$ is a scalar invariant and is called the proper time interval between the two events. The interval is said to be timelike if $(\Delta\tau)^2 > 0$, spacelike if $(\Delta\tau)^2 < 0$ and lightlike if $(\Delta\tau)^2 = 0$. Since $(\mathbf{r}^* - \mathbf{r}, c(t^* - t))$ transforms as a 4-vector, it may be observed that for a timelike interval, there is an inertial frame in which the events occur at the same place. This is the frame which moves with velocity $\mathbf{v} = (\mathbf{r}^* - \mathbf{r})/(t^* - t)$ with respect to the given frame ($|\mathbf{v}| < c$ since $|\mathbf{r}^* - \mathbf{r}| < c\,|t^* - t|$). On the other hand, for a spacelike interval, the events occur at the same time in the frame which moves with velocity $\mathbf{v} = c^2 (t^* - t)(\mathbf{r}^* - \mathbf{r})/|\mathbf{r}^* - \mathbf{r}|^2$ with respect to the given frame ($|\mathbf{v}| < c$ since $|\mathbf{r}^* - \mathbf{r}| > c\,|t^* - t|$). Finally, for a lightlike interval, a light pulse starting at $(\mathbf{r}, ct)$ would just reach $(\mathbf{r}^*, ct^*)$.
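The classification of intervals can be checked numerically. The following is a minimal Python sketch, restricted for simplicity to separations along the x-axis; the function names and the sample separations are illustrative assumptions and do not come from the text.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def interval_squared(dt, dx):
    """(delta tau)^2 of Eq. (1.46) for a separation dx along the x-axis."""
    return dt**2 - dx**2 / C**2


def classify(dt, dx):
    """Return 'timelike', 'spacelike' or 'lightlike'."""
    s2 = interval_squared(dt, dx)
    if s2 > 0:
        return "timelike"
    if s2 < 0:
        return "spacelike"
    return "lightlike"


def boost(dt, dx, v):
    """Separation (dt', dx') seen from a frame moving with velocity v along x."""
    g = 1.0 / math.sqrt(1.0 - v**2 / C**2)
    return g * (dt - v * dx / C**2), g * (dx - v * dt)


def special_frame_velocity(dt, dx):
    """Velocity of the frame in which the events share a place (timelike)
    or share a time (spacelike)."""
    kind = classify(dt, dx)
    if kind == "timelike":
        return dx / dt
    if kind == "spacelike":
        return C**2 * dt / dx
    raise ValueError("no such frame exists for a lightlike interval")
```

For instance, `boost(dt, dx, special_frame_velocity(dt, dx))` should return a vanishing space separation for a timelike pair and a vanishing time separation for a spacelike pair, while `interval_squared` is unchanged by the boost.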
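As a second example of a Lorentz four-vector, consider the set of operators $\left(-\frac{\partial}{\partial x}, -\frac{\partial}{\partial y}, -\frac{\partial}{\partial z}, \frac{1}{c}\frac{\partial}{\partial t}\right)$. The transformations for these operators can be obtained from Eq. (1.21) by using the chain rule and are

$$\left(-\frac{\partial}{\partial x'}\right) = \frac{1}{(1 - v^2/c^2)^{1/2}}\left[\left(-\frac{\partial}{\partial x}\right) - \frac{v}{c}\left(\frac{1}{c}\frac{\partial}{\partial t}\right)\right]$$
$$\left(-\frac{\partial}{\partial y'}\right) = \left(-\frac{\partial}{\partial y}\right), \qquad \left(-\frac{\partial}{\partial z'}\right) = \left(-\frac{\partial}{\partial z}\right) \tag{1.47}$$
$$\left(\frac{1}{c}\frac{\partial}{\partial t'}\right) = \frac{1}{(1 - v^2/c^2)^{1/2}}\left[\left(\frac{1}{c}\frac{\partial}{\partial t}\right) - \frac{v}{c}\left(-\frac{\partial}{\partial x}\right)\right]$$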
Comparing these with Eq. (1.42), it is seen that
$$\left(-\frac{\partial}{\partial x}, -\frac{\partial}{\partial y}, -\frac{\partial}{\partial z}, \frac{1}{c}\frac{\partial}{\partial t}\right) \tag{1.48}$$
is a 4-vector operator, i.e. it is an operator which transforms like a four-vector. The relations (1.47) further imply that the negative of the scalar product of the operator with itself,
$$\Box \equiv \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} \tag{1.49}$$
is a scalar operator. This operator, which is invariant under Lorentz transformations, is called the d′Alembertian operator.

1.10 ENERGY-MOMENTUM FOUR-VECTOR AND RELATIVISTIC DYNAMICS

It may be appreciated that the transformations (1.37) for velocities are involved
because the derivatives of position are taken with respect to time t which is not
an invariant scalar but the fourth component of a four vector. On the other hand,
the derivative of (r, ct) with respect to the proper time τ which is a Lorentz
scalar, will again give us a 4-vector. It should also be noted that the proper time
for the motion of a particle has the physical significance of being the time
indicated by a clock moving along with the particle. This can be deduced from
Eq. (1.46) by using the fact that ∆τ is a scalar invariant, and that the space
displacement ∆r for a particle is zero in the frame moving with the particle, so
that ∆t in this frame is equal to ∆τ.
Consider a particle which is at position r at time t and undergoes a change in position of ∆r in time ∆t. We then define
$$(\mathbf{p}, p_0) = m_0 \lim_{\Delta\tau \to 0}\left(\frac{\Delta\mathbf{r}}{\Delta\tau}, \frac{c\,\Delta t}{\Delta\tau}\right) \tag{1.50}$$
where $m_0$ is the mass of the particle at rest and ∆τ is obtained from Eq. (1.46) as
$$\Delta\tau = \Delta t\left[1 - \frac{1}{c^2}\left(\frac{\Delta\mathbf{r}}{\Delta t}\right)^2\right]^{1/2} \tag{1.51}$$

In terms of the velocity u of the particle,
$$\mathbf{p} = \frac{m_0\mathbf{u}}{(1 - u^2/c^2)^{1/2}}, \qquad p_0 = \frac{m_0 c}{(1 - u^2/c^2)^{1/2}} \tag{1.52}$$
Since ∆τ is an invariant scalar, it follows that $p_\mu = (\mathbf{p}, p_0)$ is a 4-vector and its transformation relations are
$$p_x' = \frac{p_x - \frac{v}{c}p_0}{(1 - v^2/c^2)^{1/2}} \quad \text{(a)}, \qquad p_y' = p_y;\; p_z' = p_z \quad \text{(b)}, \qquad p_0' = \frac{p_0 - \frac{v}{c}p_x}{(1 - v^2/c^2)^{1/2}} \quad \text{(c)} \tag{1.53}$$
The vector p is the relativistic generalization of the Newtonian momentum vector $m_0\mathbf{u}$. For the interpretation of $p_0$, it is noted that for $u \ll c$,
$$cp_0 = m_0c^2 + \frac{1}{2}m_0u^2 + \frac{3}{8}m_0\frac{u^4}{c^2} + \ldots \tag{1.54}$$
where the second term is the Newtonian kinetic energy. Therefore $cp_0$ may be defined as the energy of the particle, $m_0c^2$ being the rest energy and the remaining terms being the relativistic generalization of the Newtonian kinetic energy. If T denotes the kinetic energy, then
$$T = cp_0 - m_0c^2 \tag{1.55}$$
The rest energy does not play a significant role if there is no change of mass in a process, but becomes important if there is a change of mass. Finally, it is noted that the scalar product p·p is
$$p_0^2 - \mathbf{p} \cdot \mathbf{p} = m_0^2c^2 \tag{1.56}$$
an invariant scalar as expected.
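The relations (1.52), (1.53), (1.55) and (1.56) can be verified numerically. The sketch below is only an illustration in Python; the function names are assumptions introduced here, and the tuple ordering (px, py, pz, p0) simply follows the convention $(\mathbf{p}, p_0)$ used above.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def lorentz_factor(speed):
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)


def four_momentum(m0, u):
    """(px, py, pz, p0) of Eq. (1.52) for rest mass m0 and velocity u = (ux, uy, uz)."""
    speed = math.sqrt(sum(comp * comp for comp in u))
    g = lorentz_factor(speed)
    return (m0 * g * u[0], m0 * g * u[1], m0 * g * u[2], m0 * g * C)


def boost_x(p, v):
    """Eq. (1.53): components seen from a frame moving with velocity v along x."""
    px, py, pz, p0 = p
    g = lorentz_factor(v)
    return (g * (px - v / C * p0), py, pz, g * (p0 - v / C * px))


def invariant(p):
    """Scalar product p.p = p0^2 - |p|^2 of Eq. (1.56)."""
    px, py, pz, p0 = p
    return p0**2 - (px**2 + py**2 + pz**2)


def kinetic_energy(p, m0):
    """Eq. (1.55): T = c p0 - m0 c^2."""
    return C * p[3] - m0 * C**2
```

Boosting the four-momentum of a particle into its own rest frame should give zero momentum and $p_0 = m_0 c$, while `invariant` returns $m_0^2c^2$ in every frame.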
The equation of motion for a particle may be written as
$$\frac{d\mathbf{p}}{dt} = \mathbf{f} \tag{1.57}$$
where f is the force on the particle. This equation is similar to Newton’s equation of motion, with the important difference that the momentum is now given by the relativistic expression (1.52). An equation for $dp_0/dt$ can be deduced using Eq. (1.56) to give
$$\frac{dp_0}{dt} = \frac{\mathbf{p}}{p_0} \cdot \mathbf{f} = \frac{\mathbf{u}}{c} \cdot \mathbf{f} \tag{1.58}$$
On multiplying by c, this equation just relates the rate of change of energy to the rate of work done. It should be appreciated that since t is not a scalar, the transformation properties of f are rather involved. It is however straightforward to obtain the transformation relations for f from those of $d\mathbf{p}/dt$, giving
$$f_x' = \frac{1}{(1 - v^2/c^2)^{1/2}}\,\frac{p_0}{p_0'}\left(f_x - \frac{v}{c}\,\frac{\mathbf{p}}{p_0} \cdot \mathbf{f}\right), \qquad f_y' = \frac{p_0}{p_0'}f_y, \qquad f_z' = \frac{p_0}{p_0'}f_z \tag{1.59}$$
It also follows directly from Eqs. (1.57) and (1.58), that if f = 0, both energy and momentum of the particle are constants of motion.

1.11 ELECTROMAGNETIC INTERACTION


Though the theory of electromagnetic fields was formulated before the advent
of the special theory of relativity, it is form-invariant under Lorentz
transformations. In particular, it is consistent with the speed of propagation of
electromagnetic radiation being the same in all inertial frames. The transformations
of the electromagnetic fields and their interaction with matter are briefly described
here.
Electromagnetic fields are described by Maxwell’s equations. These equations in rationalised mks units, outside material media, are:
$$\begin{aligned}
\nabla \cdot \mathbf{E} &= \frac{\rho}{\varepsilon_0} && \text{(a)} \\
\nabla \cdot \mathbf{B} &= 0 && \text{(b)} \\
\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} &= 0 && \text{(c)} \\
\nabla \times \mathbf{B} - \frac{1}{c^2}\frac{\partial \mathbf{E}}{\partial t} &= \mu_0\mathbf{J} && \text{(d)}
\end{aligned} \tag{1.60}$$
where E is the electric field, B is the magnetic field, ρ is the charge density, J is the current density, $\varepsilon_0$ is the permittivity of vacuum and $\mu_0$ is the permeability of vacuum. The motion of a charged particle in the presence of electromagnetic fields is given by
$$\frac{d\mathbf{p}}{dt} = q(\mathbf{E} + \mathbf{u} \times \mathbf{B}) \tag{1.61}$$
where q is the charge of the particle, and the expression on the right hand side is called the Lorentz force.
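It is most convenient to describe the electromagnetic fields in terms of the electromagnetic potentials A and φ. From Eq. (1.60b), it is seen that B is of the form
$$\mathbf{B} = \nabla \times \mathbf{A} \tag{1.62}$$
Substitution of this relation in Eq. (1.60c) then implies that E can be written in the form
$$\mathbf{E} = -\frac{\partial \mathbf{A}}{\partial t} - \nabla\phi \tag{1.63}$$
The remaining two Maxwell’s equations lead to
$$\nabla^2\phi + \frac{\partial}{\partial t}(\nabla \cdot \mathbf{A}) = -\frac{\rho}{\varepsilon_0}$$
$$\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} - \nabla\left(\nabla \cdot \mathbf{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t}\right) = -\mu_0\mathbf{J}$$
It may be observed that Eqs. (1.62) and (1.63) do not determine A and φ uniquely. A transformation
$$\mathbf{A} \to \mathbf{A} + \nabla\Lambda \tag{1.64}$$
$$\phi \to \phi - \frac{\partial\Lambda}{\partial t} \tag{1.65}$$
where Λ is a scalar function, does not alter B and E. Therefore some subsidiary conditions can be imposed on A. This is done by requiring that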
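$$\nabla \cdot \mathbf{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t} = 0 \tag{1.66}$$
a condition known as the Lorentz condition. The equations for φ and A obtained from Maxwell’s equations then simplify to:
$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\phi = -\frac{\rho}{\varepsilon_0}, \qquad \left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\mathbf{A} = -\mu_0\mathbf{J} \tag{1.67}$$
It is in this form that the Lorentz transformations of Maxwell’s equations are most transparent.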
It is noted that (J, cρ) transforms as a four-vector. To see this, consider a charge at rest in frame F, and let its charge density in this frame be $\rho_0$. The charge of the particle is taken to be a universal, Lorentz invariant quantity, an assumption which leads to a correct description of experimental observations. Observed from a frame F′ in which the particle moves with velocity u, the particle dimensions appear contracted by the factor $(1 - u^2/c^2)^{1/2}$ in the direction of u. Since the total charge is an invariant scalar, the charge and current densities in frame F′ are
$$\rho' = \frac{\rho_0}{(1 - u^2/c^2)^{1/2}}, \qquad \mathbf{J}' = \frac{\rho_0\mathbf{u}}{(1 - u^2/c^2)^{1/2}} \tag{1.68}$$
Thus, (J, cρ) is proportional to the energy-momentum 4-vector of the particle,
$$J_\mu = (\mathbf{J}, c\rho) = \frac{\rho_0}{m_0}(\mathbf{p}, p_0) \tag{1.69}$$
and hence transforms as a 4-vector. Furthermore, since $\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)$ was shown to be a scalar operator (see Sec. 1.9), Eq. (1.67) gives the result that
$$A_\mu = \left(\mathbf{A}, \frac{1}{c}\phi\right) \tag{1.70}$$
also transforms as a Lorentz 4-vector. Hence, Maxwell’s equations are seen to be consistent with the special theory of relativity.
To show the form invariance of Eq. (1.61) for the motion of a charged particle, the expression for the derivative with respect to the proper time τ is written using Eq. (1.51). Substituting expressions (1.62) and (1.63) for B and E in the expression for the electromagnetic force, and simplifying, gives
$$\frac{d\mathbf{p}}{d\tau} = -q\left[\frac{1}{m_0}\nabla\left(p_0\frac{\phi}{c} - \mathbf{p} \cdot \mathbf{A}\right) + \frac{d\mathbf{A}}{d\tau}\right] \tag{1.71}$$
The quantity inside the parentheses is a scalar product between the 4-vectors $p_\mu$ and $A_\mu$, while ∇ is, apart from a sign, the space part of the 4-vector operator (1.48). Therefore, both the right hand side and the left hand side of Eq. (1.71)
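transform as space components of 4-vectors. The corresponding equation for $p_0$, making use of Eqs. (1.51), (1.58) and (1.61), is
$$\frac{dp_0}{d\tau} = -q\left[-\frac{1}{m_0}\,\frac{1}{c}\frac{\partial}{\partial t}\left(p_0\frac{\phi}{c} - \mathbf{p} \cdot \mathbf{A}\right) + \frac{d}{d\tau}\left(\frac{\phi}{c}\right)\right] \tag{1.72}$$
where the partial derivative applies only to φ and A, and not to $p_0$ or p. The relative sign of the first term with respect to Eq. (1.71) reflects the fact that the 4-vector operator (1.48) has −∇ as its space part and $(1/c)\,\partial/\partial t$ as its fourth component. It is now clear that the left hand sides of Eqs. (1.71) and (1.72), as also the right hand sides, transform as 4-vectors and hence the equation of motion of a charged particle in the presence of electromagnetic fields is form-invariant under Lorentz transformations.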

1.12 ZERO-MASS PARTICLES AND DOPPLER SHIFT


It follows from Eq. (1.52) that a particle whose mass is zero must move with velocity c if its energy and momentum are to be non-zero. It is believed that the following particles have zero mass, and velocity c: the photon designated by γ, which is the quantum of radiation, the neutrinos designated by ν, and the graviton which is the proposed quantum of gravitational waves. The energy and momentum of these zero-mass particles, and their properties under Lorentz transformations will be considered here.
It was argued by Einstein (see Sec. 2.2 for details) that radiation of frequency ν consists of photons, each carrying a quantum of energy
$$E = p_0c = h\nu \tag{1.73}$$
where Planck’s constant is $h = 6.63 \times 10^{-34}$ J s.
Since photons are supposed to have zero mass, one has from Eq. (1.56)
$$p = \frac{h\nu}{c} \tag{1.74}$$
Substituting these expressions in Eq. (1.53c) for the transformation of $p_0$, an expression for the frequency of radiation observed from a moving frame is obtained as:
$$\nu' = \nu\,\frac{1 - \frac{v}{c}\cos\alpha}{(1 - v^2/c^2)^{1/2}} \tag{1.75}$$
where $p_x = p\cos\alpha = \frac{h\nu}{c}\cos\alpha$, α being the angle between p and the x-axis. This formula is the exact expression for the Doppler effect. Similarly, using Eqs. (1.53a) and (1.53c), the ratio $p_x'/p_0'$ is obtained as
$$\cos\alpha' = \frac{\cos\alpha - \frac{v}{c}}{1 - \frac{v}{c}\cos\alpha} \tag{1.76}$$
This equation relates the directions of propagation in the two frames. In particular, it gives us the relativistic aberration of starlight reaching us. Unlike the classical Doppler effect, it is observed that the relativistic Doppler shift for radiation depends only on the relative velocity between the source and the observer. For observing the relativistic correction, one may consider the transverse Doppler shift for which cos α′ = 0, i.e. the observer is moving in a direction orthogonal to the direction of propagation. In this case (cos α = v/c), the transverse Doppler shift is
$$\nu' \approx \nu\left(1 - \frac{1}{2}\frac{v^2}{c^2}\right) \quad \text{for } v \ll c \tag{1.77}$$
in contrast to the classical result of ν′ = ν. The small second-order change in the frequency ($|\Delta\nu|/\nu \approx 5.6 \times 10^{-16}$ for v ≈ 10 m/s) has been observed (1960) by using the Mössbauer effect.
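The Doppler and aberration formulae lend themselves to a quick numerical check. The Python sketch below is illustrative only; the function names are assumptions made here. For the transverse case the shift is evaluated in the algebraically equivalent form $-\beta^2/(1 + (1 - \beta^2)^{1/2})$ with β = v/c, since a direct subtraction would lose all precision at shifts of order $10^{-16}$.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def doppler_frequency(nu, v, cos_alpha):
    """Eq. (1.75): frequency seen from a frame moving with velocity v along x."""
    beta = v / C
    return nu * (1.0 - beta * cos_alpha) / math.sqrt(1.0 - beta**2)


def aberration(v, cos_alpha):
    """Eq. (1.76): cosine of the propagation angle in the moving frame."""
    beta = v / C
    return (cos_alpha - beta) / (1.0 - beta * cos_alpha)


def transverse_fractional_shift(v):
    """(nu' - nu)/nu for cos(alpha') = 0, i.e. cos(alpha) = v/c.

    Eq. (1.75) then gives nu'/nu = sqrt(1 - beta^2); the difference from 1
    is rewritten to avoid cancellation for small beta.
    """
    beta = v / C
    return -beta**2 / (1.0 + math.sqrt(1.0 - beta**2))
```

Evaluating `transverse_fractional_shift(10.0)` reproduces the magnitude of the fractional shift quoted above for v ≈ 10 m/s.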
In the Mössbauer effect, there is recoil-free emission and absorption of photons by atoms embedded in crystals at low temperatures (low temperatures are required so that energy is not carried away by the lattice vibrations). The frequencies of the radiation are fairly well-defined except for the uncertainty due to the natural lifetime τ of the excited atom. The radiation has a frequency distribution
$$\rho(\nu) \sim \frac{1}{1 + \dfrac{4h^2(\nu - \nu_0)^2}{\Gamma^2}} \tag{1.78}$$
where $\nu_0$ is the central value, and Γ is the uncertainty in the energy related to the lifetime of the excited state by
$$\tau\Gamma = \hbar \tag{1.79}$$
with $\hbar = h/2\pi$, $h = 6.63 \times 10^{-34}$ J s being Planck’s constant. In an experiment performed by Hay et al. (1960), photons are emitted by excited $^{57}$Fe atoms embedded in the crystal, which have energy centred around $h\nu_0 = 14.4$ keV and a linewidth $\Gamma = 4.7 \times 10^{-9}$ eV. The emitter is placed at the centre of a centrifuge. The photons are absorbed by $^{57}$Fe atoms in the ground state, embedded in a crystal and kept at the edge of the centrifuge. When the centrifuge is not rotating, the photons from the emitter are absorbed by the absorber, since the photons have just the right energy for exciting the $^{57}$Fe atoms. However, once the centrifuge starts rotating, the absorber sees the photons with shifted frequency given by Eq. (1.77) (the speed is v = ωr, ω being the angular speed and r being the radial distance of the absorber from the centre) and the rate of absorption goes down. The experimental observations for the shifts agree with the shifts given by Eq. (1.77) within experimental errors, thus confirming the predictions of the special theory of relativity for the transverse Doppler shift.

1.13 EXAMPLES
A few examples are now discussed to illustrate some applications, and elaborate
the ideas that have been analysed.

Example 1
This example shows that Galilean transformations are not consistent with
Maxwell’s equations.
Consider an infinitely long, stationary line-charge and a positively charged particle P with charge q, moving away from the line charge with velocity u. The only force acting on P is the repulsive force due to the electric field, and it acts in a direction perpendicular to the line charge. Now, an observer in a frame moving parallel to the line charge with velocity v sees both electric and magnetic fields. The electric field gives rise to a force, again perpendicular to the line charge. However, the magnetic field B which is perpendicular to u and v gives rise to a force q (u − v) × B which has a component parallel to the line charge. This contradicts the result of Galilean transformations that force is invariant.

Example 2
The time-dilation of a moving clock has a dramatic manifestation in terms of the
increased lifetime of a moving particle.
In nature, unstable particles are found, whose decay is described in quantum mechanics as a transition from the initial state to the final state. The rate of decay is determined by the transition probability per unit time λ, defined by
$$dN(t) = -\lambda N(t)\,dt \tag{1.80}$$
where N(t) is the number of particles at time t, and |dN(t)| is the number of particles which decay in time dt. The number of particles remaining at time t is obtained from Eq. (1.80) to be
$$N(t) = N(0)\,e^{-\lambda t} \tag{1.81}$$
The mean lifetime of the particle is then given by
$$\tau_0 = \frac{\int t\,|dN|}{N(0)} = \frac{1}{\lambda} \tag{1.82}$$
Now, if the unstable particles are moving, their dilated lifetime is given by
$$\tau = \frac{\tau_0}{(1 - v^2/c^2)^{1/2}} \tag{1.83}$$
where v is the speed of the particles, i.e., the particles live for a longer time.
The dilation of lifetime of moving particles has important implications in the design of experiments. Consider for example, the production of $K^+$-mesons by fast-moving protons colliding with a target. For a beam of $K^+$-mesons of momentum 3 GeV/c corresponding to v ≈ 0.98645 c, the bubble chamber where the $K^+$-particles will interact with protons, is kept at a distance of 100 m. Since $\tau_0$ for $K^+$-mesons is $1.23 \times 10^{-8}$ s, the value of τ is $7.5 \times 10^{-8}$ s so that the fraction of $K^+$-mesons reaching the chamber at t = d/v, is
$$\frac{N(t)}{N(0)} = 1.1 \times 10^{-2} \tag{1.84}$$
Without the time-dilation, the fraction would have been about $1.12 \times 10^{-12}$, so that with a typical pulse carrying about $10^3$ $K^+$-mesons the experiment would not have been feasible. Therefore, it is time-dilation which makes the experiment feasible.
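The numbers in this example follow directly from Eqs. (1.81) and (1.83). A minimal Python sketch of the arithmetic is given below; the function name and the rounded value of c are assumptions made for illustration.

```python
import math

C = 2.998e8  # speed of light in m/s (rounded)


def surviving_fraction(distance, beta, tau0, dilated=True):
    """Fraction N(t)/N(0) of particles surviving a flight of the given length.

    distance : flight path in metres
    beta     : v/c of the beam
    tau0     : mean lifetime at rest, in seconds
    dilated  : if False, ignore time dilation (for comparison only)
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    tau = gamma * tau0 if dilated else tau0   # Eq. (1.83)
    t = distance / (beta * C)                 # time of flight t = d/v
    return math.exp(-t / tau)                 # Eq. (1.81) with lambda = 1/tau


with_dilation = surviving_fraction(100.0, 0.98645, 1.23e-8)
without_dilation = surviving_fraction(100.0, 0.98645, 1.23e-8, dilated=False)
print(f"with dilation:    {with_dilation:.2e}")
print(f"without dilation: {without_dilation:.2e}")
```

The two printed fractions differ by about ten orders of magnitude, which is the point of the example.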

Example 3
It should be emphasized that it is only the speed of light in vacuum that is invariant.
In particular, the speed of light in water is not invariant with respect to different
inertial observers, as was shown by Fizeau (1851).
Consider the passage of light through a tube of length l, containing water at rest. The speed of light in water is c/n where n is the refractive index of water. If the water now flows with velocity v parallel to the direction of propagation of light, the observed speed of light can be obtained from Eq. (1.37):
$$u = \frac{\dfrac{c}{n} + v}{1 + \dfrac{v}{cn}} \tag{1.85}$$
which is different from c/n. This causes a change in the time of passage,
$$\Delta t = \frac{ln}{c} - \frac{ln\left(1 + \dfrac{v}{cn}\right)}{c\left(1 + \dfrac{vn}{c}\right)} \approx \frac{lv}{c^2}(n^2 - 1) \tag{1.86}$$
and the corresponding phase shift is
$$\Delta\phi \approx 2\pi l v \nu_0 (n^2 - 1)/c^2 \tag{1.87}$$
where $\nu_0$ is the frequency of the beam of light.
In the experiment of Fizeau, two parts of a beam traverse the water tube in opposite directions (the set-up is similar to the Michelson-Morley experiment), and interfere to produce a fringe shift
$$\Delta N \approx 2 l v \nu_0 (n^2 - 1)/c^2 \tag{1.88}$$
This result agrees with experimental observations. It is worth noting that for the classical case, the denominator in Eq. (1.85) is 1 and the corresponding fringe shift is given by Eq. (1.88) but with $n^2 - 1$ replaced by $n^2$.
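The approximation made in Eq. (1.86) can be checked against the exact expression. The Python sketch below does this for illustrative values of the tube length, flow speed and refractive index; these values and the function names are assumptions chosen here, not data from Fizeau's experiment.

```python
C = 2.998e8  # speed of light in vacuum, m/s (rounded)


def speed_in_moving_water(n, v):
    """Eq. (1.85): relativistic addition of c/n and the flow velocity v."""
    return (C / n + v) / (1.0 + v / (C * n))


def time_change_exact(l, n, v):
    """Change in the time of passage: l/(c/n) - l/u, with u from Eq. (1.85)."""
    return l * n / C - l / speed_in_moving_water(n, v)


def time_change_approx(l, n, v):
    """Leading-order form of Eq. (1.86): l v (n^2 - 1) / c^2."""
    return l * v * (n**2 - 1.0) / C**2


def time_change_classical(l, n, v):
    """Galilean result, using u = c/n + v in place of Eq. (1.85)."""
    return l * n / C - l / (C / n + v)


def fringe_shift(l, n, v, nu0):
    """Eq. (1.88) for two beams traversing the tube in opposite directions."""
    return 2.0 * l * v * nu0 * (n**2 - 1.0) / C**2
```

With, say, l = 1.5 m, n = 1.33 and v = 7 m/s, the exact and approximate relativistic values agree closely, while the classical value is larger by roughly the factor $n^2/(n^2 - 1)$.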

Example 4
Whenever energy is extracted from a reaction, chemical or nuclear, it is at the expense of the rest-mass energy. For chemical reactions, the change in the rest mass is so small that it is not possible to observe it directly. On the other hand, for nuclear fusion or fission reactions, the change in the mass is quite substantial and can be measured directly.
In the fusion of hydrogen nuclei into helium nuclei, which is the basic reaction in stars, four protons combine either through the proton-proton cycle or the carbon cycle (see Sec. 9.8) to yield a helium nucleus and two positrons.
$$4p \to \mathrm{He}^4 + 2e^+ \tag{1.89}$$
The change in the mass for this process can be obtained from m(p) = 1.007277 u, m($\mathrm{He}^4$) = 4.001506 u, m($e^+$) = 0.000549 u (1 u = $1.6604 \times 10^{-27}$ kg), so that the energy available is
$$(\Delta m)\,c^2 = 24.7 \text{ MeV} \tag{1.90}$$
The sun today is at a fairly stable stage of its evolution, called the main sequence period. Assuming that the sun will remain on the main sequence till about 10% of the hydrogen is burnt, since its mass is about $2 \times 10^{30}$ kg and it contains about 75% of hydrogen by mass, the total energy to be released while it is on the main sequence is about $8.9 \times 10^{43}$ J. The energy emitted by the sun is about $3.9 \times 10^{26}$ J/s, so that its lifetime on the main sequence is approximately $7 \times 10^9$ years. This should be compared with the more detailed estimations of $1.1 \times 10^{10}$ years for the main-sequence lifetime of the sun, of which it has already spent about $4.5 \times 10^9$ years.
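The arithmetic of this estimate is easily reproduced. The Python sketch below uses the masses and solar data quoted in the example; the conversion factors (joules per MeV, seconds per year) and the variable names are assumptions added here.

```python
U_KG = 1.6604e-27      # atomic mass unit in kg, as quoted in the text
C = 2.998e8            # speed of light, m/s (rounded)
J_PER_MEV = 1.602e-13  # joules per MeV (assumed conversion factor)
S_PER_YEAR = 3.156e7   # seconds per year (assumed conversion factor)

M_P, M_HE4, M_E = 1.007277, 4.001506, 0.000549  # masses in u

# Mass converted to energy in 4p -> He4 + 2e+, Eq. (1.89)
delta_m = 4 * M_P - M_HE4 - 2 * M_E
energy_per_fusion_mev = delta_m * U_KG * C**2 / J_PER_MEV

# Fraction of the hydrogen rest mass released as energy
mass_fraction = delta_m / (4 * M_P)

# Hydrogen burnt on the main sequence: 10% of the 75% of 2e30 kg
hydrogen_burnt_kg = 0.10 * 0.75 * 2.0e30
total_energy_j = hydrogen_burnt_kg * mass_fraction * C**2

# Lifetime at the present luminosity of 3.9e26 J/s
lifetime_years = total_energy_j / 3.9e26 / S_PER_YEAR

print(f"energy per fusion: {energy_per_fusion_mev:.1f} MeV")
print(f"total energy:      {total_energy_j:.2e} J")
print(f"lifetime:          {lifetime_years:.1e} years")
```

The estimate is sensitive mainly to the assumed fraction of hydrogen burnt, which is why it differs from the more detailed value by a factor of order unity.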

Example 5
The transformation properties of the electric field due to a moving charge can in principle be obtained from Eqs. (1.62) and (1.63) where $\left(\mathbf{A}, \frac{1}{c}\phi\right)$ transforms as a four-vector. However, it is simpler to deduce them from the expression for the Lorentz force in Eq. (1.61) and the transformation of force given in Eq. (1.59).
Consider the field due to a charge q at the origin, moving with velocity u in the x-direction. Then, the force on a unit charge at rest at P is
$$\mathbf{f} = \mathbf{E} \tag{1.91}$$
However, in a frame F′ which moves with velocity u, the charge is at rest and therefore, the force on the unit charge is
$$\mathbf{f}' = \mathbf{E}' = \frac{q}{4\pi\varepsilon_0}\left(\frac{\mathbf{r}'}{r'^3}\right) \tag{1.92}$$
where r′ is the distance between P and the point charge. The two forces are related by Eq. (1.59), so that
$$E_x' = E_x, \qquad E_y' = \left(1 - \frac{u^2}{c^2}\right)^{1/2}E_y, \qquad E_z' = \left(1 - \frac{u^2}{c^2}\right)^{1/2}E_z \tag{1.93}$$
Using the expression for E′ given in Eq. (1.92),
$$E_x = \frac{q}{4\pi\varepsilon_0}\frac{x'}{r'^3}, \qquad E_y = \frac{q}{4\pi\varepsilon_0}\frac{y'}{r'^3\,(1 - u^2/c^2)^{1/2}}, \qquad E_z = \frac{q}{4\pi\varepsilon_0}\frac{z'}{r'^3\,(1 - u^2/c^2)^{1/2}} \tag{1.94}$$
Finally, since $x' = x/(1 - u^2/c^2)^{1/2}$, y′ = y, z′ = z, some simplification gives
$$\mathbf{E} = \frac{q\,\mathbf{r}\left(1 - \dfrac{u^2}{c^2}\right)}{4\pi\varepsilon_0\,r^3\left(1 - \dfrac{u^2\sin^2\theta}{c^2}\right)^{3/2}} \tag{1.95}$$
where θ is the angle which the line joining the point P and the charge makes with the direction of the velocity of the charge. The field is seen to be weaker for small angles and angles near π, and has the largest value for θ = π/2.
PROBLEMS
1. Show that two successive parallel Lorentz transformations are equivalent
to a single Lorentz transformation.
2. A rod of length l0 in its rest frame, moves with velocity v parallel to itself.
Obtain the Lorentz contraction of the rod by calculating the time taken by
the rod to pass a point (use time dilation) and then multiplying this
time by v.
3. Obtain an expression for time dilation considering a clock with light bouncing
back and forth along the direction of relative velocity, and using the concept
of Lorentz contraction.
4. What is the visually observed rate of a clock which moves with velocity v
along the line of vision?
5. The incoming primary cosmic rays (mostly protons) create µ-mesons in
the upper atmosphere. The lifetime of µ-mesons at rest is $2.15 \times 10^{-6}$ s. If
the mean speed of the meson is 0.998 c, what fraction of the µ-mesons
created at a height of 20 km reach the sea level? What is the mean distance
travelled by the mesons before they decay?
6. A rod AB parallel to the x-axis, moves along the y-axis with velocity u.
Show that in a frame F′ which moves with velocity v along the x-direction,
this rod is inclined to the x′-axis at an angle
$$\tan^{-1}\left[\frac{uv}{c^2\,(1 - v^2/c^2)^{1/2}}\right].$$
7. A ρ-meson of mass 760 MeV/c2 decays at rest into two π-mesons of
mass 140 MeV/c2 each. What is the relative velocity of the π-mesons
with respect to each other?
8. Show that when force f is not parallel to velocity u, the acceleration is in
general not parallel to either the force or the velocity.
9. A particle of mass M decays at rest into a particle of mass m and a
photon. What is the energy of the photon emitted? Apply this to
(a) Σ+ (1189.4 MeV) → p (938.3 MeV) + γ, (b) H (2p) → H(1s) + γ, the
binding energies being 3.4 eV and 13.6 eV respectively.
10. A charged particle emits radiation when subjected to an external field.
This is known as bremsstrahlung. Show that energy-momentum
conservation does not allow a particle in isolation (no external forces) to
radiate. The argument is very simple in the centre of mass frame.
11. A proton gains an energy of 1 electron volt (eV) or $1.6 \times 10^{-19}$ J when it
traverses a potential difference of 1 V. If the proton has a mass of
$1.67 \times 10^{-27}$ kg, what is the velocity of the proton which starts from rest
and traverses a potential difference of $10^9$ V? What is the velocity of a
proton which comes out of the CERN super proton synchrotron with an
energy of 270 GeV (1 GeV is equal to $10^9$ eV)?
12. A star is observed at the zenith taken to be the z-direction. If the star is
moving in the x-direction, and its radiation shows a redshift of 0.003 Å for
the Hα Balmer line (λ = 6563Å), what is its velocity with respect to us?
If the star were moving towards us with the same speed, what would be
the observed wavelength for the Hα line?
13. A charged particle can move in a material medium with a velocity greater
than the velocity of light in that medium. It polarizes the nearby atoms
which then emit radiation known as Cerenkov radiation. The envelope of
the spherical waves is a cone with the vertex at the charged particle and
whose surface makes an angle θ with the direction of motion of the particle.
Show that sin θ = c/nv, where v is the speed of the particle and n is the
refractive index of the medium (v ≥ c/n).
2
Introduction to Quantum Ideas

Structures of the Chapter


2.1 Black-body radiation
2.2 Photoelectric effect
2.3 Compton effect
2.4 Wave nature of particles
2.5 Atomic spectra
2.6 Nuclear model of the atom
2.7 Bohr model
2.8 Examples
Problems

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 31


S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_2
In this chapter, the background which necessitated the introduction of quantum
theory of matter and radiation is discussed. We describe Planck’s theory of
black-body radiation, photoelectric effect, Compton scattering, matter waves
and Davisson-Germer experiment, Rutherford scattering, and the Bohr theory
of an atom. The choice of these topics is dictated by their historical importance
and the directness with which they lead to the basic rules of quantum mechanics.
Though most of these primitive quantum ideas have been replaced by the more
universal description in terms of wave mechanics, they still serve a useful purpose
in providing a simple picture of quantum phenomena.
Quantum mechanical effects become important in the domain of small
distances. To be more precise, the effects are important in measurements which
require the knowledge of, say, the momentum px and position x of a system to an
accuracy such that
(∆ px) (∆ x) ~ h (2.1)
where ∆ px and ∆ x are the errors in the measurement of the x-component of
momentum and position of the system, and h is a small number whose value in
mks units is
$$h = 6.63 \times 10^{-34} \text{ J s} \tag{2.2}$$
It is worth pointing out that once again radiation plays an important role in
the development of quantum mechanics, though for a different reason: the
ideas of wave functions and wave equations already existed for radiation; they
only required a reinterpretation in terms of the photon which is the quantum of
radiation.

2.1 BLACK-BODY RADIATION


Historically, the first indication of the inadequacy of classical ideas to explain
the properties of matter, occurred in what is termed as black-body radiation.
A black-body is a body which absorbs all the radiation incident on it, and hence
is the perfect absorber. Consideration of equilibrium of different bodies at the
same temperature implies that it is also the best emitter of radiation energy. A
black-body may be idealized by a small hole drilled in a cavity.
If the radiation from a black-body is analysed by a spectrometer
(i.e. a prism or a grating), it is found (Lummer and Pringsheim, 1900) that the
intensity distribution as a function of wavelength, has a well-defined shape
(Fig. 2.1). What is most significant is that, for a given temperature, it is a universal
curve independent of the properties of the walls of the cavity. In particular, it
has a maximum at some wavelength λm. As the temperature of the black-body
is raised, the intensity of radiation increases at each wavelength, and λm shifts to
a smaller value such that
$$\lambda_m T = \text{constant} \tag{2.3}$$
This result had been obtained by Wien (1893) from a theoretical analysis.

Fig. 2.1 Intensity of radiation from a black-body, I (λ, T), in watts per square centimetre per micron.

The initial approach to understand the nature of black-body radiation was in terms of the standing waves set up in the enclosure and the thermodynamic energy associated with these waves. Consider a plane wave in a cubic box of length l. It may be written in the form
$$\psi = A\sin 2\pi(\nu t \pm k_x x \pm k_y y \pm k_z z + \alpha) \tag{2.4}$$
where k = 1/λ is the wave number, ν is the frequency of the radiation and the ± signs define the direction in which the wave is travelling. Standing waves are obtained by taking linear combinations of the different terms (with α = 0) in Eq. (2.4). Requiring that the waves vanish at the walls passing through the origin, we get
$$\psi_s = A\cos(2\pi\nu t)\sin(2\pi k_x x)\sin(2\pi k_y y)\sin(2\pi k_z z) \tag{2.5}$$
where, in order that the wave should have nodes at the other walls,
$$k_x = \frac{n_x}{2l}, \qquad k_y = \frac{n_y}{2l}, \qquad k_z = \frac{n_z}{2l} \tag{2.6}$$
with $n_x$, $n_y$ and $n_z$ being positive integers (negative integers give the same mode). The wave number and therefore the frequency ν = c|k| is obtained from these relations as
$$\nu = \frac{c}{2l}(n_x^2 + n_y^2 + n_z^2)^{1/2} \tag{2.7}$$
Every possible set of positive integers $(n_x, n_y, n_z)$ gives a possible standing wave, which may be depicted by a point in the 3-dimensional plot of $(n_x, n_y, n_z)$. Since there is one such point per unit volume, the number of states is essentially equal to the volume in this space (provided the volume is large). Therefore, the number of stationary modes with frequency between 0 and ν (which corresponds to the volume in the first octant with n ≤ 2lν/c) is
$$N(\nu) = 2\left(\frac{1}{8}\right)\left(\frac{4\pi}{3}\right)\left(\frac{2l\nu}{c}\right)^3 = \frac{8\pi l^3\nu^3}{3c^3} \tag{2.8}$$
where a factor of 2 has been introduced to take into account the fact that for each frequency ν, there are two transverse modes of electromagnetic oscillations.
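The statement that the number of modes is essentially the volume of the octant can be tested by direct counting. The Python sketch below counts the positive-integer triples explicitly and compares the result with Eq. (2.8), written in terms of the dimensionless radius $n_{max} = 2l\nu/c$; the function names are assumptions introduced for illustration.

```python
import math


def count_modes(n_max):
    """Count triples of positive integers with nx^2 + ny^2 + nz^2 <= n_max^2,
    multiplied by 2 for the two transverse polarizations."""
    limit = n_max * n_max
    count = 0
    for nx in range(1, n_max + 1):
        for ny in range(1, n_max + 1):
            remaining = limit - nx * nx - ny * ny
            if remaining < 1:
                break  # larger ny only makes 'remaining' smaller
            # number of nz >= 1 with nz^2 <= remaining
            count += math.isqrt(remaining)
    return 2 * count


def modes_formula(n_max):
    """Eq. (2.8) with 2 l nu / c replaced by n_max."""
    return 2.0 * (1.0 / 8.0) * (4.0 * math.pi / 3.0) * n_max**3
```

The ratio `count_modes(n) / modes_formula(n)` lies slightly below 1 and approaches 1 as n grows, the deficit being a surface correction that is negligible when the volume is large.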
In the theory of statistical mechanics, the principle of equipartition states that a mean energy of $\frac{1}{2}kT$ (k is the Boltzmann constant which has a value of $1.38 \times 10^{-23}$ J K$^{-1}$) is associated with each degree of freedom. For example, for an ideal gas with molecules treated as geometric points, the mean energy of each molecule is $\frac{3}{2}kT$ corresponding to its translational motion in three independent directions. However, for a molecule in oscillatory motion, corresponding to each mode of translational motion there is a potential energy term which also contributes a mean energy of $\frac{1}{2}kT$. Now, if a mean energy of kT is assigned to each mode of electromagnetic oscillation, then according to the principle of equipartition, the energy per unit volume, between frequencies ν and ν + dν, is given by
$$u(\nu)\,d\nu = \frac{dN(\nu)}{l^3}\,(kT) \tag{2.9}$$
so that energy density per unit volume, per unit frequency is
$$u(\nu) = \frac{8\pi\nu^2 kT}{c^3} \tag{2.10}$$
This is known as the Rayleigh-Jeans law and provides a good fit to the experimental results in the region of long wavelengths. However, it is unacceptable since it implies that the energy density increases indefinitely as ν increases and therefore the total energy per unit volume is infinite, contrary to physical observations. This is known as the ultraviolet catastrophe.
Max Planck re-examined the basic assumptions in the theoretical approach to black-body radiation. He introduced (1900) a new and revolutionary assumption in the basic framework which allowed him to describe the experimental observations with great accuracy. He associated each mode of electromagnetic oscillation with atomic oscillators embedded in the walls of the cavity. However, these oscillators were allowed to have only discrete energies which are integral multiples of hν, and the energy distribution of these oscillators is according to the Maxwell-Boltzmann distribution. Here h is a small number whose value is given in Eq. (2.2), and is called Planck’s constant. Since the Maxwell-Boltzmann distribution is given by exp (−E/kT), the average energy for each mode of oscillation is
$$\varepsilon = \frac{\sum_{n=0}^{\infty} nh\nu\,\exp(-nh\nu/kT)}{\sum_{n=0}^{\infty}\exp(-nh\nu/kT)} = \frac{h\nu}{\exp(h\nu/kT) - 1} \tag{2.11}$$

Using this expression in Eq. (2.10) in place of kT, the following expression is obtained
$$u(\nu) = \left(\frac{8\pi h\nu^3}{c^3}\right)\frac{1}{\exp(h\nu/kT) - 1} \tag{2.12}$$
It is easy to see that in the region of long wavelengths, this equation reduces to the Rayleigh-Jeans relation in Eq. (2.10). It gives an exponential damping in the short-wavelength limit. Overall, it is in excellent agreement with experimental observations over a wide range of temperatures. It also leads to many of the special relations for black-body radiation. For example, the energy radiated by a unit area of a black-body, per unit time is given by the Stefan-Boltzmann law:
$$U = \frac{c}{4}\int_0^\infty u(\nu)\,d\nu \tag{2.13}$$
(the factor of c/4 is discussed in Example 1) which, with the substitution x = hν/kT and the use of Eq. (2.12), leads to
$$U = \sigma T^4 = \left(\frac{2\pi k^4}{h^3c^2}\right)T^4\int_0^\infty \frac{x^3\,dx}{e^x - 1} \tag{2.14}$$

The value of the integral is known to be $\pi^4/15$, so that Stefan’s constant is given by
$$\sigma = \frac{2\pi^5k^4}{15h^3c^2} \tag{2.15}$$
whose numerical value is in good agreement with the experimental value of
$$\sigma = 5.67 \times 10^{-8} \text{ J s}^{-1}\,\text{m}^{-2}\,\text{K}^{-4} \tag{2.16}$$
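Planck's law, its Rayleigh-Jeans limit and the value of Stefan's constant can all be checked numerically. The Python sketch below is a minimal illustration; the rounded values of h, k and c, the function names and the simple midpoint integration are assumptions made here rather than anything prescribed by the text.

```python
import math

H = 6.626e-34  # Planck's constant, J s (rounded)
K = 1.381e-23  # Boltzmann constant, J/K (rounded)
C = 2.998e8    # speed of light, m/s (rounded)


def planck_u(nu, temperature):
    """Energy density per unit frequency, Eq. (2.12)."""
    x = H * nu / (K * temperature)
    return 8.0 * math.pi * H * nu**3 / C**3 / math.expm1(x)


def rayleigh_jeans_u(nu, temperature):
    """Energy density per unit frequency, Eq. (2.10)."""
    return 8.0 * math.pi * nu**2 * K * temperature / C**3


def bose_integral(x_max=50.0, steps=20000):
    """Midpoint-rule estimate of the integral of x^3/(e^x - 1) from 0 to infinity.
    The integrand is negligible beyond x_max = 50."""
    h = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x**3 / math.expm1(x)
    return total * h


def stefan_constant():
    """Eq. (2.15)."""
    return 2.0 * math.pi**5 * K**4 / (15.0 * H**3 * C**2)
```

For hν much smaller than kT the two energy densities coincide, `bose_integral()` reproduces $\pi^4/15$, and `stefan_constant()` comes out close to the experimental value in Eq. (2.16).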
Planck’s law in Eq. (2.12) can also be used to deduce the properties of λm
at which the radiation density is maximum. Nothing that u (λ) = u (v) dv/dλ.

 8π hc  1
u− (λ ) =  5 
 λ  exp (hc / λ kT ) − 1

 8πk 5T 5  x 5
= 4 4  x (2.17)
 h c  e −1
with x = hc/λkT. This function has a maximum at a wavelength given by the
condition d u (λ)/dλ = 0, and a numerical solution (see Example 2) gives
xm = hc/λmkT = 4.965 (2.18)
or λmT = 2.90 × 10-3 mK (2.19)
This relation, called Wien’s displacement law, is in very good agreement
with the experimental observations. It should be emphasized that Eq. (2.17)
implies

$$\frac{\bar{u}(\lambda)}{T^5} = f(\lambda T) \tag{2.20}$$

which had been deduced earlier by Wien (1893) from thermodynamic
considerations. It implies that a function of a single variable, namely of λT, gives
the complete description of ū(λ) as a function of the variables λ and T.
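One possible way of obtaining the number in Eq. (2.18) is sketched below. Setting the derivative of x⁵/(eˣ − 1) to zero gives x = 5(1 − e⁻ˣ), which is solved here by fixed-point iteration; this particular method, the starting value and the constants are assumptions made for illustration and need not coincide with the treatment in Example 2.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s

# Fixed-point iteration for x = 5 (1 - exp(-x)); it converges quickly
# because the slope of the right-hand side near the root is small.
x_m = 5.0
for _ in range(50):
    x_m = 5.0 * (1.0 - math.exp(-x_m))

# lambda_m * T from Eq. (2.18), in m K
wien_constant = H * C / (x_m * K)

print(f"x_m = {x_m:.4f}")
print(f"lambda_m T = {wien_constant:.4e} m K")
```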
Planck’s hypothesis that the energy of the oscillators is quantized is rather
an ad hoc assumption, though it leads to Planck’s law of black-body radiation
which is in excellent agreement with the experimental observations. The law of

black-body radiation has found a more satisfactory description in terms of later


developments of quantum statistical mechanics. However, its historical
importance lies in the fact that it was for the first time here that the idea of
quantized energy was introduced and used for the description of physical
observations.

2.2 PHOTOELECTRIC EFFECT


The quantum hypothesis of Planck postulates that the energies of the atomic
oscillators are quantized and that they can change only by an integral multiple
of hν. This does not necessarily imply that the energy of the radiation in the
cavity is quantized. It suggests, however, that since the changes in the radiation
energy are due to absorption or emission of radiation by the atomic oscillators,
the energy of the radiation itself is quantized into multiples of hν. It is obvious
from the previous discussion that this quantized energy would also lead to Planck’s
law for the black-body radiation. This possibility received strong support from
Einstein’s explanation of the photoelectric effect.
Hertz (1887) found that when a beam of ultraviolet radiation, for example,
from a mercury lamp, impinges on the surface of an alkali metal such as Cs, Rb,
K or Na, (which have a small work function), electrons are emitted. The number
of electrons emitted per second, and their energies can be studied by subjecting
the electrons emitted to an electric field as shown in Fig. 2.2 (a). The number of
electrons that escape from the cathode per second, and are collected by the
anode is given by i/e where i is the current, while the maximum kinetic energy
of the electrons emitted is given by eV0 where V0 is the stopping potential for
which the current reduces to zero [Fig. 2.2 (b)].

Fig. 2.2 (a) Schematic diagram of the equipment used for studying photoelectric
effect. (b) Typical photoelectric current against collector voltage.

The following important points should be noted about the observations:


1. The electrons are emitted without any noticeable time delay (the time lag is
less than 10⁻⁸ s).
2. No electrons are emitted if the frequency of the incident radiation is less
than a critical value ν₀.
3. The maximum kinetic energy of the electrons is related to the stopping potential
by

$$\frac{1}{2} m v_m^2 = eV_0 \tag{2.21}$$

It is independent of the intensity of the incident radiation but is proportional to
ν − ν₀, i.e.,

$$\frac{1}{2} m v_m^2 \propto (\nu - \nu_0) \tag{2.22}$$
It is very difficult to reconcile the classical wave theory of light with these
observations. For example, with an incident radiation of intensity 10⁻¹⁰ J/m² s, it
would require about 5 × 10¹¹ s to absorb an energy of 3 eV by a cross-sectional
area of about 10⁻²⁰ m² presented by an atom. Actually, a more detailed analysis
shows that an atomic oscillator presents an effective area of about λ² to light of
wavelength λ corresponding to its resonant frequency.
For radiation of λ = 10⁻⁷ m, this means an area of about 10⁻¹⁴ m², which still
implies an accumulation time of about 5 × 10⁵ s, in contradiction with the
observation that there is no noticeable time delay in the emission of electrons.
Nor can the wave theory of radiation explain the existence of the sharp
threshold ν₀ for the emission of electrons, if the energy is absorbed continuously.
The fact that the maximum kinetic energy of the electrons emitted does not
depend on the intensity of incident radiation and that it is proportional to ν − ν₀,
is equally puzzling.
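The accumulation times quoted above follow from dividing the required energy by the power falling on the absorbing area. A minimal sketch of this arithmetic, with the intensity, areas and energy taken from the text and a hypothetical helper name, is:

```python
EV = 1.602176634e-19   # joules per electron volt

def accumulation_time(intensity, area, energy_ev):
    """Time in seconds for a classical absorber of cross-section `area` (m^2)
    to collect `energy_ev` (eV) from a beam of `intensity` (J m^-2 s^-1)."""
    return energy_ev * EV / (intensity * area)

t_atom = accumulation_time(1e-10, 1e-20, 3.0)       # geometric area of an atom
t_resonant = accumulation_time(1e-10, 1e-14, 3.0)   # effective area of about lambda^2

print(f"atomic cross-section:  {t_atom:.1e} s")
print(f"resonant cross-section: {t_resonant:.1e} s")
```

Both times are many orders of magnitude longer than the observed upper bound on the delay.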
A simple explanation of the various observations of the photoelectric effect
was provided by Einstein (1905). Inspired by Planck’s work, Einstein proposed
that electromagnetic radiation itself is quantized into quanta of energy hν where h
is Planck’s constant. It is these quanta, called photons, that are absorbed as
single units by the electrons. If the energy hν of the photons is high enough, the
electrons are knocked out. From the law of conservation of energy, the maximum
kinetic energy of the electron is

$$\frac{1}{2} m v_m^2 = h\nu - e\phi, \quad \text{for } h\nu > e\phi \tag{2.23}$$

where eφ is the minimum energy with which the electron is bound within the
metal and is called the work function. This is Einstein’s famous relation for

the photoelectric effect. It explains all the observed features of the photoelectric
effect. It should be noted that
1. Since photons are absorbed as single units, there is a localization of energy
and hence there is no significant time delay in the emission of electrons.
2. Einstein’s relation in Eq. (2.23) implies the existence of a critical frequency
for the emission of electrons, given by

$$\nu_0 = \frac{e\phi}{h} \tag{2.24}$$

However, since the current increases gradually as V increases from −V₀,
the effective binding of the electrons inside the metal varies, as does the
velocity of the emitted electrons. Therefore, the critical frequency in
Eq. (2.24) refers to the emission of electrons with minimum binding energy.
3. The maximum kinetic energy of the emitted electrons (having minimum
binding energy) is given in terms of the critical frequency, by the relation

$$\frac{1}{2} m v_m^2 = h(\nu - \nu_0) \tag{2.25}$$

In terms of the stopping potential V₀, one has

$$eV_0 = h(\nu - \nu_0) \tag{2.26}$$
Thus, Einstein’s relation not only explains all the experimental observations but also gives the
ratio h/e from the slope of the linear plot of V₀ against ν. Using the known
value for the charge of the electron, an independent determination of Planck’s
constant, in good agreement with the value obtained from other considerations
such as the black-body radiation, can be obtained.
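The determination of h from the slope of V₀ against ν can be sketched as follows. The data here are synthetic: they are generated from Eq. (2.26) with an assumed work function of 2.3 eV and an assumed set of frequencies, so the fit only illustrates the procedure and is not a measurement. All names are hypothetical.

```python
H = 6.62607015e-34          # Planck constant, J s (used only to make synthetic data)
E_CHARGE = 1.602176634e-19  # electron charge, C
PHI_EV = 2.3                # assumed work function in eV (illustrative value)

def stopping_potential(nu, work_function_ev):
    """Stopping potential V0 in volts from Eq. (2.26); zero below threshold."""
    return max(0.0, H * nu / E_CHARGE - work_function_ev)

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Assumed frequencies, all above the threshold for the chosen work function
freqs = [6.0e14, 7.0e14, 8.0e14, 9.0e14, 1.0e15]
volts = [stopping_potential(nu, PHI_EV) for nu in freqs]

slope, intercept = fit_line(freqs, volts)
h_est = slope * E_CHARGE        # slope is h/e
nu0_est = -intercept / slope    # intercept with the frequency axis

print(f"h estimated from slope: {h_est:.4e} J s")
print(f"threshold frequency:    {nu0_est:.3e} Hz")
```

With real measurements the points scatter about the line, and the uncertainty in the slope sets the uncertainty in h/e.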
Some additional observations related to the photoelectric effect are:
1. Only a small fraction (about 5%) of the incident photons, succeeds in
ejecting photoelectrons while most of them are absorbed by the system as
a whole and generate thermal energy.
2. Photoelectric effect is also possible for isolated atoms in the form of a gas,
e.g. Na, K vapour, and the process is known as photoionization. It is
observed by passing a beam of ultraviolet radiation through a chamber
containing Na or K vapour, and collecting the electrons ejected by subjecting
them to an electric field. It is interesting to note that in photoionization, since
the atoms are isolated, there is no collective absorption of photons and
every photon absorbed succeeds in ejecting an electron. This can be verified
by comparing the number of photons absorbed as deduced from the
decrease in the intensity of the beam, with the number of electrons collected
by the electric field.

3. The energy required for ejecting the electrons may also be provided by
heating the metal, which results in the thermionic emission of the
electrons. This allows us to calculate, from quantum statistical mechanics,
the work function eφ. The value obtained agrees with the one obtained
from the photoelectric effect.
4. So far, it has been assumed that an electron receives energy only from a
single photon, the process being called a single-photon process. The
development of lasers has provided light beams of very high intensity
which allow us to observe multi-photon processes, in particular the multi-
photon photoelectric effect. In this process, an electron ejected from a
metal receives energy from N photons. Its kinetic energy is given by

$$\frac{1}{2} m v_m^2 = Nh\nu - e\phi \tag{2.27}$$

and the critical frequency is eφ/Nh, which is smaller than the corresponding
frequency for single-photon processes by a factor of N.
In the analysis of the photoelectric effect, a photon was regarded as a
wave packet of energy, with no statements made about the quantization of
momentum. In fact, since a significant amount of momentum was carried away
by the metal, conservation of momentum could not be usefully applied to the
photon-electron system. For the photon to acquire the bona fides of a particle, it
should have both a quantum of energy as well as a quantum of momentum. This
was demonstrated by the discovery of the Compton effect (1922) in the scattering
of x-rays by electrons.

2.3 COMPTON EFFECT


Compton effect is essentially a demonstration of the scattering of a photon by
an electron as a particle-particle scattering.
Compton was concerned with the scattering of x-rays of wavelength λ0 by
a thin film of metal. He measured the wavelength distribution of the scattered
rays at different scattering angles and found that it had two major components
in wavelength. One had essentially the same wavelength as the incident radiation,
i.e. λ₀, while the other had a slightly longer wavelength [Fig. 2.3 (a)], with the
separation between the two wavelengths given by λ − λ₀ = λc (1 − cos θ). Here
λc is a constant which is independent of λ0 or the scattering angle and is called
the Compton wavelength of the electron. This increase in wavelength or the
decrease in frequency is very difficult to understand in terms of the wave
description of radiation. However, it can be explained quite accurately by
considering the process as a scattering of photons regarded as particles with
well-defined energy and momentum by the electrons in the metal.

Fig. 2.3 (a) The intensity I of the scattered x-rays as a function of (λ − λ₀), in units of 10⁻² Å, for
scattering angle θ = 90°. (b) Compton scattering as particle-particle scattering.

If the initial and final photons have energies hν₀ and hν, and momenta (hν₀/c) n₀
and (hν/c) n, respectively, where n₀ and n are unit vectors in the directions of
propagation, and the momentum of the scattered electron is p, then from momentum
and energy conservation [Fig. 2.3 (b)],

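$$\frac{h\nu_0}{c} = \frac{h\nu}{c}\cos\theta + p\cos\phi \tag{2.28}$$

$$0 = \frac{h\nu}{c}\sin\theta - p\sin\phi \tag{2.29}$$

$$h\nu_0 + mc^2 = h\nu + (p^2c^2 + m^2c^4)^{1/2} \tag{2.30}$$

Eliminating φ from the first two equations,

$$p^2 = \frac{h^2}{c^2}\left(\nu_0^2 + \nu^2 - 2\nu\nu_0\cos\theta\right) \tag{2.31}$$

Also from Eq. (2.30)

$$p^2 = \frac{h^2}{c^2}\left(\nu_0 - \nu + mc^2/h\right)^2 - m^2c^2 \tag{2.32}$$

Equating the two expressions for p² leads to

$$\nu_0 - \nu = \frac{h\nu\nu_0}{mc^2}(1 - \cos\theta) \tag{2.33}$$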

or

$$\lambda - \lambda_0 = \frac{h}{mc}(1 - \cos\theta) \tag{2.34}$$
This is Compton’s expression for the shift in the wavelength of the scattered
x-rays. It identifies the Compton wavelength as

$$\lambda_c = \frac{h}{mc} \tag{2.35}$$

which depends only on the mass of the scattering particle and has a value of
2.43 × 10⁻² Å for the electron. It has an interesting interpretation, that a
photon with wavelength λc has an energy hν = mc², i.e., the rest energy of
the particle.
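As an illustration of Eqs. (2.34) and (2.35), the sketch below evaluates the Compton wavelength of the electron and the shift at a few scattering angles. The constants are standard values supplied here as assumptions, and the function name is hypothetical.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
C = 2.99792458e8         # speed of light, m/s

# Compton wavelength of the electron, Eq. (2.35)
lambda_c = H / (M_E * C)

def compton_shift(theta):
    """Wavelength shift (m) for scattering angle theta (rad), Eq. (2.34)."""
    return lambda_c * (1.0 - math.cos(theta))

print(f"lambda_c = {lambda_c * 1e10:.4f} angstrom")
for deg in (45, 90, 180):
    shift = compton_shift(math.radians(deg))
    print(f"theta = {deg:3d} deg: shift = {shift * 1e10:.4f} angstrom")
```

The shift at 90° equals λc itself, which is where the shifted peak sits in Fig. 2.3 (a).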
In the discussion presented here, it is assumed that the target electron is
stationary and free. It is also valid if the electron is weakly bound to the
atom with the binding energy of a few eV which is quite small compared
with the energies of the x-ray photons, which are about 10 keV or greater.
However, it may so happen that the electron remains bound in the same
state to the atom, even after the collision with the photon (this is more likely
to happen if the electrons are strongly bound). In this case, the transfer of
energy and momentum is to the atom as a whole, so that the mass of the
atom must be used in place of the mass of the electron. Therefore, the
corresponding Compton wavelength is much smaller (at least by a factor of
1800) and the resulting Compton shift in the wavelength is negligible. This
explains the unshifted component in the spectrum of the scattered x-rays,
which is called the Thomson component.
Some interesting additional features of Compton effect are:
1. The fact that the shift in the wavelength of the radiation is indeed due to
the scattering of the radiation by the electron was confirmed by observing
the scattered electron (Bothe and Geiger, 1925).
2. The main reason for the spread in the wavelength of the Compton-shifted
x-rays is that the initial electron is in general not stationary but has a
momentum spread even inside the atom. The correction due to the binding
of the electron can be taken into account in terms of its momentum
distribution (see Example 4).
3. Though the shift λ − λ₀ is independent of λ₀, the intensity of scattering
depends on λ0. It actually increases as λ0 → 0 and hence the effect is
more easily observable for x-rays than for lower frequency radiation (indeed
this is the reason why the sky is blue).
4. The scattering angle φ is given by

$$\tan\phi = \frac{\sin\theta}{\nu_0/\nu - \cos\theta} = \frac{\cot(\theta/2)}{1 + h\nu_0/mc^2} \tag{2.36}$$

Since in most cases hν₀ is of the order of 10 keV, and therefore hν₀ << mc²,
one has the simple relation φ ≈ ½(π − θ), i.e., the direction in which the electron is
scattered bisects the angle which is supplementary to the angle made by the
final photon momentum with the initial photon momentum.
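The two forms of Eq. (2.36) and the low-energy approximation can be compared numerically. In the sketch below the ratio ν₀/ν is taken from Eq. (2.33); the rest energy of 511 keV and the sample photon energy of 10 keV are the only inputs, and the function name is an illustrative assumption.

```python
import math

MC2_KEV = 510.999   # electron rest energy in keV

def recoil_angle(theta, photon_kev):
    """Electron recoil angle phi (rad) from both forms of Eq. (2.36)."""
    alpha = photon_kev / MC2_KEV                     # h*nu0 / (m c^2)
    ratio = 1.0 + alpha * (1.0 - math.cos(theta))    # nu0/nu from Eq. (2.33)
    phi_first = math.atan2(math.sin(theta), ratio - math.cos(theta))
    phi_second = math.atan(1.0 / (math.tan(theta / 2.0) * (1.0 + alpha)))
    return phi_first, phi_second

theta = math.radians(90.0)
phi_a, phi_b = recoil_angle(theta, 10.0)
approx = 0.5 * (math.pi - theta)

print(f"phi from first form:  {math.degrees(phi_a):.3f} deg")
print(f"phi from second form: {math.degrees(phi_b):.3f} deg")
print(f"(pi - theta)/2:       {math.degrees(approx):.3f} deg")
```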

2.4 WAVE NATURE OF PARTICLES


The observations of photoelectric effect and Compton scattering firmly establish
the particle-like properties of radiation. One could also rephrase this
wave-particle duality in the following form: photons are particles with zero mass
(since E² − p²c² = 0 for photons) which have wave-like properties (e.g.
interference and diffraction). It was suggested by de Broglie (1923) that particles
with nonzero mass also possess wave-like properties. This was indeed a daring
proposal which went beyond the classical concepts of particles with nonzero
mass. Specifically, he proposed that material particles of momentum p are
associated with a wavelength

$$\lambda = \frac{h}{p} \tag{2.37}$$

called the de Broglie wavelength. This idea was an important step in the
development of wave mechanics.
The wave properties become easily noticeable only when the obstructing
bodies have dimensions comparable with the wavelength. For macroscopic bodies,
the de Broglie wavelength is negligibly small. For atomic systems, this wavelength
becomes more significant: for an electron with an energy of 100 eV the
de Broglie wavelength is about 1 Å, comparable with the wavelength of x-rays
as also with the size of an atom. Their wave properties may therefore be
observed in their scattering by crystals. This was confirmed experimentally by
Davisson and Germer (1927) who studied the scattering of electrons by a
monocrystal of nickel.

Fig. 2.4 Bragg condition for constructive interference for
reflection from different planes.

Consider the scattering of a beam of electrons by a nickel crystal.


The intensity of the scattered electrons is measured at various angles, and for
different velocities of the incoming electrons. It was found that the scattering is
intense when the energy of the incoming electrons was 54 eV and at an angle
which was equal to the angle at which a strong reflection by the atomic planes
was observed for x-rays of wavelength 1.65 Å. If the x-rays are incident on a
set of atomic planes at distance d from each other, at an angle θ (Fig. 2.4), the
amplitudes of x-rays scattered by the atoms in a given plane will be coherent if
the angle of reflection is equal to the angle of incidence. Furthermore, the
amplitudes for scattering from different planes will be coherent if the path
difference for scattering from two successive planes is an integral multiple of λ,
i.e., the Bragg condition is satisfied:
2d sin θ = nλ (2.38)
where n is a positive integer. For x-rays diffracted by the nickel crystal,
maximum intensity occurs at λ = 1.65 Å (for n = 1). On the other hand, the
de Broglie wavelength for 54 eV electrons is 1.67 Å, and they too have an
intense scattering at the same angle. The agreement between the two
wavelengths is striking, indicating that the Bragg equation is satisfied by
electrons also, provided de Broglie wavelength is used for the wavelength of
the electrons. This should be regarded as a confirmation of de Broglie’s idea.
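The electron wavelengths quoted in this section follow from Eq. (2.37) with the non-relativistic momentum p = (2mE)^(1/2). A minimal sketch, with standard constants supplied as assumptions and a hypothetical function name:

```python
import math

H = 6.62607015e-34       # Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electron volt

def de_broglie_wavelength(kinetic_ev, mass=M_E):
    """Non-relativistic de Broglie wavelength (m) for a kinetic energy in eV."""
    momentum = math.sqrt(2.0 * mass * kinetic_ev * EV)
    return H / momentum

for energy in (54.0, 100.0):
    wavelength = de_broglie_wavelength(energy)
    print(f"{energy:6.1f} eV electron: {wavelength * 1e10:.3f} angstrom")
```

The non-relativistic expression is adequate at these energies, which are tiny compared with the electron rest energy.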
The essential point of many of the experiments demonstrating de Broglie’s
relation is that any experiment of x-ray diffraction, in principle can be simulated
by an experiment with electrons of corresponding de Broglie wavelength. In
particular, Thomson (1927) as also Tartakovsky, obtained a diffraction pattern
when an electron beam (of tens of keV) was passed through a thin foil of
polycrystalline material. The electrons pick out those crystals whose planes
are oriented so as to satisfy Eq. (2.38) and produce a diffraction pattern which
is similar to the diffraction pattern produced by x-rays of wavelength equal to
the de Broglie wavelength for the electrons.

The electron diffraction experiments establish the fact that the electrons
possess wave properties exactly as de Broglie had suggested. To clarify that
the wave property is not because of the simultaneous participation of a large
number of electrons, but is associated with each electron, experiments have
been done with very low intensity electron beams so that the electrons pass
through the instruments essentially one at a time. With sufficiently long exposure,
a diffraction pattern was obtained which differed in no way from the pattern
obtained with normal intensity beams, thus suggesting that the wave-like properties
are to be associated with individual electrons. The de Broglie wavelength has
also been verified for neutral molecules (Estermann and Stern, 1930) and for
neutrons; however, in order that they have the same de Broglie wavelength as
the x-ray wavelength, their energies (E = p²/2m) have to be of the order of
0.02 eV, i.e., one uses thermal molecules and neutrons.
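The order of magnitude quoted for neutrons can be checked from E = p²/2m with p = h/λ. The sketch below uses the x-ray wavelength of 1.65 Å mentioned earlier and compares the result with kT at an assumed room temperature of 300 K; the constants and names are illustrative assumptions.

```python
H = 6.62607015e-34         # Planck constant, J s
M_N = 1.67492749804e-27    # neutron mass, kg
M_E = 9.1093837015e-31     # electron mass, kg
K = 1.380649e-23           # Boltzmann constant, J/K
EV = 1.602176634e-19       # joules per electron volt

def kinetic_energy_ev(wavelength, mass):
    """Kinetic energy (eV) of a non-relativistic particle with the given
    de Broglie wavelength (m): E = h^2 / (2 m lambda^2)."""
    return H**2 / (2.0 * mass * wavelength**2) / EV

e_neutron = kinetic_energy_ev(1.65e-10, M_N)
e_electron = kinetic_energy_ev(1.65e-10, M_E)
kt_room = K * 300.0 / EV   # thermal energy at an assumed 300 K

print(f"neutron:  {e_neutron:.4f} eV")
print(f"electron: {e_electron:.1f} eV")
print(f"kT at 300 K: {kt_room:.4f} eV")
```

For the same wavelength the energy scales inversely with the mass, which is why neutrons need thermal energies while electrons need tens of eV.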
Electron and neutron diffraction, along with x-ray diffraction, have become
indispensable tools for the study of the structures of solids. Their diffraction
patterns though qualitatively similar, have important differences which need to
be mentioned.
Since the electrons are sensitive to electrostatic forces, electron beams
have little penetration. With the development of slow electron beams
(10 to 1000 eV) which undergo negligible penetration and for which diffraction
occurs essentially at the first atomic layer, electron diffraction has become an
important tool for the study of surfaces. Electron beams of fairly high energies,
i.e., about 50 keV, have quite small de Broglie wavelengths, and therefore are
used in electron microscopes for high resolution studies of small specimens.
On the other hand neutrons, like x-rays, are fairly insensitive to electrostatic
forces, and therefore penetrate easily. Neutron diffraction has several advantages:
1. Since neutron scattering is essentially by the nucleus and depends quite
distinctively on the structure of the nucleus, neutron diffraction can give
more information about crystals formed from different atoms which have
nearly equal atomic number and which cannot easily be distinguished by
x-rays.
2. The scattering of neutrons by light nuclei such as hydrogen is large, and
hence neutron diffraction is an important technique in the study of structures
of organic compounds. The scattering of x-rays by light atoms is weak
(the coherent cross-section is approximately proportional to Z² where Z is
the number of electrons).
3. The magnetic ordering of atoms in a crystal, which have nonzero magnetic
moment, can be studied by neutron diffraction which therefore is a valuable
method of investigating magnetic materials. The main disadvantages of
using neutron diffraction are the low intensity of neutron beams available

and the difficulty of detecting neutrons which are electrically neutral. They
are usually detected by the α particle emitted through their reaction with
boron nuclei.

2.5 ATOMIC SPECTRA


De Broglie’s idea allows us to discuss both radiation and particles such as
electrons, in a unified manner. They behave like particles in the sense that they
have discrete energy and momentum, and yet are diffracted as waves. In this
respect, there is no qualitative difference between photons which have vanishing
mass and particles such as electrons, neutrons and protons which have nonvanishing
mass (there are subtle differences, however). Now, for radiation inside a box,
one has stationary waves with discrete frequencies, and the corresponding photons
have discrete energies. It might then be expected that particles such as electrons
also have discrete energies if they are confined to a finite volume. Such a situation
is simulated by an electron which is bound inside an atom, and the atomic spectra
do indicate that the electron indeed has discrete energies.

Fig. 2.5 Hydrogen spectrum in the visible and near ultraviolet region.

When a vapour glows, e.g. in a flame or due to current discharges, the


radiation emitted has well-defined discrete frequencies (unlike thermal emission
in a solid, which has a continuous spectrum). Observed from a spectrograph
(an instrument for analysing the frequency distribution, using a prism or a grating),
sharp lines with a narrow width are observed which are characteristic of the
vapour, and which can be used for identifying the components of the vapour.
It had been observed that these lines have some order. This is revealed for
example in the part of hydrogen spectrum in the visible and near ultraviolet
region (Fig. 2.5). Specifically, it was found by J. Balmer (1885) that the hydrogen
lines in the visible and near ultraviolet region could be expressed quite accurately
by the relation
$$\lambda_n = \lambda_\infty \frac{n^2}{n^2 - 4}, \quad n = 3, 4, \ldots \tag{2.39}$$

where λ∞ ~ 3646 Å is the limiting value shown in Fig. 2.5. A simple but important
step in the analysis of atomic spectra was taken by Rydberg (1890) who pointed
out that the sequence in Eq. (2.39) could be represented in a more suggestive
form in terms of the reciprocal of the wavelength called the wave number,
related to the frequency:

$$\frac{1}{\lambda_n} = R\left(\frac{1}{2^2} - \frac{1}{n^2}\right), \quad n = 3, 4, \ldots \tag{2.40}$$

$$R = 1.0972 \times 10^{7}\ \mathrm{m^{-1}}$$

where R is called the Rydberg constant. The spectral lines represented by
Eq. (2.40) form what is known as the Balmer series. Further investigations
showed that the hydrogen spectrum has other series, in the ultraviolet and infrared
regions. They are represented by formulae similar to Eq. (2.40), e.g. Lyman
series (1906) in the ultraviolet region by

$$\frac{1}{\lambda_n} = R\left(\frac{1}{1^2} - \frac{1}{n^2}\right), \quad n = 2, 3, \ldots \tag{2.41}$$

Paschen series (1908) in the infrared region by

$$\frac{1}{\lambda_n} = R\left(\frac{1}{3^2} - \frac{1}{n^2}\right), \quad n = 4, 5, \ldots \tag{2.42}$$

Brackett series (1922) in the infrared region by

$$\frac{1}{\lambda_n} = R\left(\frac{1}{4^2} - \frac{1}{n^2}\right), \quad n = 5, 6, \ldots \tag{2.43}$$

The frequencies of the lines in the hydrogen spectrum can be obtained from
a single formula

$$\frac{1}{\lambda_{m,n}} = R\left(\frac{1}{m^2} - \frac{1}{n^2}\right), \quad m < n \tag{2.44}$$

m = 1 giving the Lyman series, m = 2 giving the Balmer series, etc.
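The single formula of Eq. (2.44) is easy to tabulate. The following sketch uses the value of R quoted in the text; the function names are hypothetical and the choice of lines printed is arbitrary.

```python
R = 1.0972e7   # Rydberg constant as quoted in the text, m^-1

def wavelength(m, n):
    """Wavelength (m) of the hydrogen line with integers m < n, Eq. (2.44)."""
    if not 0 < m < n:
        raise ValueError("require 0 < m < n")
    return 1.0 / (R * (1.0 / m**2 - 1.0 / n**2))

def series_limit(m):
    """Limiting wavelength (m) of the series as n tends to infinity."""
    return m**2 / R

series_names = {1: "Lyman", 2: "Balmer", 3: "Paschen", 4: "Brackett"}
for m, name in series_names.items():
    first = wavelength(m, m + 1) * 1e10
    limit = series_limit(m) * 1e10
    print(f"{name:9s} first line {first:9.1f} angstrom, limit {limit:9.1f} angstrom")
```

For m = 2 the limit reproduces λ∞ of Eq. (2.39), and each Balmer wavelength equals λ∞ n²/(n² − 4).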
The spectra of other atoms also show some order, and their frequencies
can be represented by

$$\frac{1}{\lambda_{m,n}} = T_1(m) - T_2(n), \quad m < n \tag{2.45}$$

However, the form of T(n) is generally more complicated than for the
hydrogen atom, one of the most useful being R(n − δ)⁻² where δ is a constant
known as the quantum defect, δ << n.

The discreteness of the spectral lines is a property of the absorption spectrum


as well. When radiation passes through a vapour, the vapour absorbs radiation
of discrete frequencies which correspond to the frequencies of the emission
spectrum, given by Eq. (2.45). This results in the appearance of dark lines
corresponding to the frequencies of the radiation absorbed, which coincide with
the positions of the spectral lines in the emission spectrum. In fact, helium was
first discovered by its absorption lines in the solar spectrum before it could be
identified terrestrially.
A photon associated with radiation of frequency ν carries an energy hν.
Therefore, since energy is conserved, it is plausible to conclude that the discrete
frequencies of the emission and absorption spectra, given by Eq. (2.45),
correspond to transitions of the atom between states characterized by the integers
m, n, which have discrete energies
$$E_n = -hcT(n) \tag{2.46}$$

In particular, the energy levels of the hydrogen atom are given by

$$E_n(\mathrm{H}) = -\frac{hcR}{n^2} \tag{2.47}$$

The numerical value of hcR is 13.6 eV, and is called the ionization potential
of the hydrogen atom.
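The value of hcR and the connection between level differences and spectral lines can be verified with a few lines. The sketch uses the Rydberg constant quoted in the text together with standard values of h, c and e, which are supplied here as assumptions.

```python
H = 6.62607015e-34     # Planck constant, J s
C = 2.99792458e8       # speed of light, m/s
EV = 1.602176634e-19   # joules per electron volt
R = 1.0972e7           # Rydberg constant as quoted in the text, m^-1

def energy_level_ev(n):
    """Energy (eV) of the hydrogen level n from Eq. (2.47)."""
    return -H * C * R / (n**2 * EV)

def transition_wavelength(n_upper, n_lower):
    """Wavelength (m) of the photon emitted in the transition n_upper -> n_lower."""
    delta_e = (energy_level_ev(n_upper) - energy_level_ev(n_lower)) * EV
    return H * C / delta_e

print(f"hcR = {-energy_level_ev(1):.3f} eV")
print(f"3 -> 2 line: {transition_wavelength(3, 2) * 1e10:.1f} angstrom")
```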
One of the first attempts to have a model of an atom, which can explain the
discrete spectrum of the atom, was due to J.J. Thomson (1903). It had been
recognized that the negatively charged electron is one of the fundamental
constituents of the atom. Since the atom as a whole is neutral, it should also
contain a positively-charged part, called positive ion to balance the negative
charge of the electron. It was also known that a great majority of the mass is
associated with the positive ion. This led Thomson to propose a model of the
atom in the form of a sphere of uniform positive charge, in which the small,
negatively charged electrons are embedded. The electrons perform simple
harmonic motion about their positions of equilibrium, which results in the emission
of radiation of characteristic frequencies. Quantitatively, the potential energy of
the electron inside the atom is

$$V(r) = \frac{Ze^2}{8\pi\varepsilon_0 R_0^3}\left(r^2 - 3R_0^2\right) \tag{2.48}$$

where R₀ is the radius of the atom and Z is the atomic number. This potential will
cause the electron to oscillate with a frequency

$$\nu = \frac{1}{2\pi}\left(\frac{Ze^2}{4\pi\varepsilon_0 m R_0^3}\right)^{1/2} \tag{2.49}$$

For Z = 1 and a frequency ν ≈ 5 × 10¹⁴ s⁻¹ corresponding to a wavelength of
λ = 6000 Å, one gets R₀ ≈ 3 Å which is comparable to the size of the atom and
therefore can be taken as support for the Thomson model. A suitable
arrangement of the electrons in a series of rings can also explain the existence
of homologous series such as Na, K, Rb, etc. which have similar properties.
However, the model is totally inconsistent with the experimental observations of
the scattering of α particles by thin metal foils, and had to be discarded. Now, it
is only of historical interest.
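The estimate R₀ ≈ 3 Å follows from inverting Eq. (2.49) for the radius. A minimal sketch of this inversion, with standard constants supplied as assumptions and a hypothetical function name:

```python
import math

E_CHARGE = 1.602176634e-19   # electron charge, C
EPS0 = 8.8541878128e-12      # permittivity of free space, F/m
M_E = 9.1093837015e-31       # electron mass, kg

def thomson_radius(nu, z=1):
    """Radius R0 (m) of a Thomson atom whose electron oscillates at frequency nu (Hz),
    obtained by solving Eq. (2.49) for R0."""
    omega = 2.0 * math.pi * nu
    return (z * E_CHARGE**2 / (4.0 * math.pi * EPS0 * M_E * omega**2)) ** (1.0 / 3.0)

r0 = thomson_radius(5.0e14)
print(f"R0 = {r0 * 1e10:.2f} angstrom")
```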

2.6 NUCLEAR MODEL OF THE ATOM


The distribution of charge and mass in an atom can be investigated by the
scattering of charged particles by the atom. Such an investigation was carried
out by Rutherford and his collaborators.
It was observed by Geiger and Marsden (1909) that in the scattering of
α particles (doubly ionised ⁴He) by the atoms in a thin metal foil, though most of
the α particles emerged without much deviation, some of them were scattered
through large angles (some close to 180 degrees). The number of such events
was at least 10⁴ times larger than the number expected from the multiple scatterings
by the Thomson atoms with uniform positive charge distribution. Nor can these
large deflections be caused by the electrons in the atoms, since their mass is
very small. From an analysis of the data, Rutherford came to the conclusion that
the large deflections are caused by the strong electric field associated with a
large mass concentrated in a small volume. On the basis of this conclusion,
Rutherford proposed (1911) a nuclear model of the atom, which is the first
authentic model of the modern understanding of the atom.
Rutherford’s model of the atom consists of a nucleus of positive charge
Ze (Z is the atomic number), which carries most of the mass of the atom and is
concentrated in a very small region of radius less than 10⁻¹⁴ m. Moving around
the nucleus are Z electrons at a distance from the nucleus roughly equal to the
size of the atom, i.e., 10⁻¹⁰ m. The explanation of the α-particle scattering is
that most of the α particles go through the atom with only a slight deflection
except when they encounter the strong field due to the massive nucleus in
which case they undergo a large deflection. The detailed predictions of the
calculations based on this model are in very good agreement with the experimental
observations.
Consider the scattering of an α particle by a thin metal foil. Let the cross
sectional area of the beam be A. Then the total number of atoms which scatter
the α particles is
n = ρ At (2.50)
where ρ is the number of atoms per unit volume, and t is the thickness of the
foil. For each atom, let the incoming α particles in the impact parameter range

(impact parameter is the distance of the nucleus from the initial direction of
motion of the α particle) of b to b + db be scattered into angles between θ and
θ + dθ (Fig. 2.6). Then, the fraction of α particles scattered into angles between
θ and θ + dθ is

$$\frac{dN}{N} = n\,\frac{2\pi b\, db}{A} = 2\pi \rho t\, b \left|\frac{db}{d\theta}\right| d\theta \tag{2.51}$$

Fig. 2.6 Scattering of α particles by a nucleus of charge Ze.


For evaluating this expression, a relation between b and θ is needed, which
depends on the force experienced by the α particles. Specifically, the force due
to a charge Ze of the nucleus is considered. It is shown in Example 6, that for a
Coulomb potential 2Ze²/4πε₀r, the relation between b and θ is

$$b = \frac{Ze^2}{2\pi\varepsilon_0 m v^2} \cot\frac{\theta}{2} \tag{2.52}$$

where m is the mass of the α particle and v is its speed. Using this relation in
Eq. (2.51), the fraction of particles scattered into solid angle dΩ = 2π sin θ dθ
is

$$\frac{dN}{N} = \rho t \left(\frac{Ze^2}{4\pi\varepsilon_0 m v^2}\right)^2 \frac{d\Omega}{\sin^4(\theta/2)} \tag{2.53}$$

This is the Rutherford formula for the scattering of α particles by nuclei of
charge Ze.
The important features to note in this formula are that the fraction is
(i) proportional to the thickness t of the metal foil, (ii) proportional to Z²,

(iii) inversely proportional to T², where T = ½mv² is the kinetic energy of the
incoming α particles, and (iv) inversely proportional to sin⁴(θ/2), where θ is the
angle of scattering. These properties were tested by Geiger and Marsden by
varying the thickness and the composition of the foil, the energy of the incident
α particles and the angle of scattering, and were found to be in excellent
agreement with the experimental observations. For example, they found in
an experiment with silver foil that dN was proportional to 111, 680 and 8800
for θ = 150°, 75° and 37.5°, respectively, other variables remaining the same.
For these values, the product (dN) sin⁴(θ/2) is proportional to 96.6, 93.4, 93.9,
respectively. The near-constancy of the product, though dN itself varies by a
large factor, indicates the essential correctness of the θ-dependence of
scattering rates.
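The products quoted for the silver-foil data can be reproduced directly from the three pairs of numbers given in the text. The variable and function names below are illustrative.

```python
import math

# (scattering angle in degrees, relative count dN) as quoted in the text
observations = [(150.0, 111.0), (75.0, 680.0), (37.5, 8800.0)]

def scaled_count(theta_deg, count):
    """Return dN * sin^4(theta/2), which Eq. (2.53) predicts to be constant."""
    half_angle = math.radians(theta_deg) / 2.0
    return count * math.sin(half_angle) ** 4

products = [scaled_count(theta, count) for theta, count in observations]
for (theta, count), product in zip(observations, products):
    print(f"theta = {theta:5.1f} deg, dN = {count:6.0f}, dN sin^4(theta/2) = {product:.1f}")
```

While dN changes by a factor of about 80 over this range of angles, the product varies by only a few per cent.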
It may be noted that the Rutherford formula is the same for attractive and
repulsive Coulomb potentials. It is not valid for values of the impact parameter
b larger than interatomic distances for which an α particle cannot be regarded
as being scattered by a single atom in the metal foil. It is also not valid if the
α particle approaches the nucleus to a distance (for a head-on collision
r_min = Ze²/πε₀mv²) less than the size of the nucleus, i.e., about 10⁻¹⁴ m, at which
the nuclear forces become important.
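To get a feeling for the distances involved, the sketch below evaluates the head-on distance of closest approach and the impact parameter of Eq. (2.52), written in terms of the kinetic energy T = ½mv². The silver target (Z = 47) is the one mentioned in the text, but the α-particle energy of 5 MeV is an assumed, illustrative value, and the function names are hypothetical.

```python
import math

E2_OVER_4PI_EPS0 = 1.439964e-15   # e^2/(4 pi eps0) in MeV m (about 1.44 MeV fm)

def closest_approach(z, kinetic_mev):
    """Head-on distance of closest approach (m):
    r_min = Z e^2 / (pi eps0 m v^2) = 2 Z [e^2/(4 pi eps0)] / T."""
    return 2.0 * z * E2_OVER_4PI_EPS0 / kinetic_mev

def impact_parameter(z, kinetic_mev, theta):
    """Impact parameter (m) for scattering angle theta (rad), Eq. (2.52),
    rewritten as b = (r_min / 2) cot(theta / 2)."""
    return 0.5 * closest_approach(z, kinetic_mev) / math.tan(theta / 2.0)

Z_SILVER = 47
T_ALPHA_MEV = 5.0   # assumed alpha-particle kinetic energy

r_min = closest_approach(Z_SILVER, T_ALPHA_MEV)
b_90 = impact_parameter(Z_SILVER, T_ALPHA_MEV, math.radians(90.0))
print(f"r_min = {r_min:.2e} m")
print(f"b for 90 degree scattering = {b_90:.2e} m")
```

Under these assumptions r_min remains larger than 10⁻¹⁴ m, so the Coulomb description would still apply, while the impact parameters for large-angle scattering are far smaller than interatomic distances.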
While the Rutherford model of the atom provides a fairly comprehensive
description of the scattering of low-energy α particles by the atoms, there are
some implications of the simple model which are in conflict with the classical
interpretations of experimental observations. The stability of the atom demands
that the electron must revolve around the nucleus. But such an electron, since it
is accelerating, must radiate energy continuously according to the classical theory
of electromagnetism, and ultimately coalesce with the nucleus. Experimentally,
an atom is a highly stable object. Furthermore, it can absorb radiation only of
some well-defined frequencies and then emit radiation again of well-defined
frequencies. The Rutherford model of the atom is unable to explain these
experimental observations. It is with the intention of reconciling the Rutherford
model with the observed stability and spectrum of the atom, that Bohr began his
search for a model of the atom and came up with what is known as the Bohr
model of the atom.

2.7 BOHR MODEL


Bohr (1913) started by assuming that Rutherford’s model of the atom is essentially
correct but that the classical laws of dynamics need to be modified so as to be
applicable to the motion of an electron around the nucleus. The modifications
are introduced in the form of the following postulates:
1. Electrons that are bound to the nucleus can move around in only certain
discrete orbits. While in these orbits the electrons do not emit
electromagnetic radiation, though the motion is accelerated.
2. For circular motion, the allowed orbits are determined by the quantum
condition that the angular momentum is $n\hbar$, where $\hbar = h/2\pi$, h being
Planck's constant, and n can take positive integral values, n = 1, 2, ... .
3. Emission or absorption of radiation occurs only when an electron undergoes
a transition from one allowed orbit to another. The frequency of the radiation
emitted or absorbed is given by the relations $h\nu = E_i - E_f$ for emission and
$h\nu = E_f - E_i$ for absorption, $E_f$ and $E_i$ being the energies associated with the
final and initial orbits.
These postulates are quite ad hoc but are justified by the impressive
agreement of the predictions with the experimental observations.
Consider an electron with mass $m_e$ moving around a nucleus of mass $m_n$ and
charge Ze. Let the positions of the electron and the nucleus, measured from the centre of
mass, be $r_e$ and $r_n$ respectively, and let ω be the angular speed of circular motion. If
r is the inter-particle distance, $r = r_e - r_n$, then in the centre of mass frame
$$m_e r_e + m_n r_n = 0 \tag{2.54}$$
so that
$$r_e = \frac{m_n}{m_n + m_e}\, r, \qquad r_n = \frac{-m_e}{m_n + m_e}\, r \tag{2.55}$$
It is seen that in the centre of mass frame, the electron and the nucleus are
on opposite sides of the centre of mass.
The total energy is
$$E = \frac{1}{2}\left(m_e r_e^2 + m_n r_n^2\right)\omega^2 - \frac{Ze^2}{4\pi\varepsilon_0 r} = \frac{1}{2}\, m_r r^2\omega^2 - \frac{Ze^2}{4\pi\varepsilon_0 r} \tag{2.56}$$
where
$$m_r = \frac{m_e m_n}{m_e + m_n} \tag{2.57}$$
is the reduced mass (it is only slightly smaller than the electron mass), and the
angular momentum is
$$L = \left(m_e r_e^2 + m_n r_n^2\right)\omega = m_r r^2\omega \tag{2.58}$$
Now, the dynamical condition for circular orbits is
$$m_e r_e\omega^2 = m_r r\omega^2 = \frac{Ze^2}{4\pi\varepsilon_0 r^2} \tag{2.59}$$
while the quantum condition for the angular momentum is
$$L = m_r r^2\omega = n\hbar \tag{2.60}$$
Solving for r, Eqs. (2.59) and (2.60) give
$$r = \frac{4\pi\varepsilon_0\hbar^2 n^2}{m_r Ze^2} \tag{2.61}$$
This is the radius of the nth Bohr orbit. Furthermore, the equilibrium condition
in Eq. (2.59) allows us to write the total energy as
$$E = -\frac{Ze^2}{8\pi\varepsilon_0 r} \tag{2.62}$$
Using the value of r given in Eq. (2.61), the allowed energies of the atom
are
$$E_n = -\frac{hcR}{n^2}, \qquad n = 1, 2, \ldots \tag{2.63}$$
where
$$R = \frac{m_r}{4\pi c\hbar^3}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)^2 \tag{2.64}$$
Substituting the values of the constants, for the hydrogen atom
$$R_H = 1.09678\times 10^{7}\ \mathrm{m}^{-1} \tag{2.65}$$
which is very close to Rydberg’s original value in Eq. (2.40).
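As a numerical illustration, the chain of Eqs. (2.57), (2.61) and (2.64) can be evaluated directly. This is a minimal sketch: the SI values of the constants below are standard CODATA-style figures supplied here as assumptions (the text does not list them), and the function names are illustrative.

```python
import math

# SI values of the constants (assumed here; not quoted in the text).
M_E = 9.1093837015e-31       # electron mass, kg
M_P = 1.67262192369e-27      # proton mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
H = 6.62607015e-34           # Planck constant, J s
C = 299792458.0              # speed of light, m/s
HBAR = H / (2.0 * math.pi)


def reduced_mass(m_e, m_n):
    """Reduced mass of the electron-nucleus pair, Eq. (2.57)."""
    return m_e * m_n / (m_e + m_n)


def bohr_radius(n, m_r, z=1):
    """Radius of the nth Bohr orbit, Eq. (2.61)."""
    return 4.0 * math.pi * EPS0 * HBAR**2 * n**2 / (m_r * z * E_CHARGE**2)


def rydberg(m_r, z=1):
    """Constant R of Eq. (2.64), in m^-1."""
    coulomb = z * E_CHARGE**2 / (4.0 * math.pi * EPS0)
    return m_r * coulomb**2 / (4.0 * math.pi * C * HBAR**3)


m_r_hydrogen = reduced_mass(M_E, M_P)
R_H = rydberg(m_r_hydrogen)
r_1 = bohr_radius(1, m_r_hydrogen)
E_1_eV = -H * C * R_H / E_CHARGE   # ground-state energy from Eq. (2.63), in eV

print(f"R_H = {R_H:.5e} m^-1")
print(f"r_1 = {r_1:.4e} m")
print(f"E_1 = {E_1_eV:.3f} eV")
```

With these inputs the computed constant should agree with Eq. (2.65) to the quoted number of digits; using the bare electron mass instead of the reduced mass would change the fourth significant figure.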
The atom can undergo transitions only between the orbits with the discrete
energies En given in Eq. (2.63). The frequency of the radiation emitted or absorbed
when the atom undergoes a transition from a state with energy En to a state
with energy $E_m$, is given by
$$\nu = cR\left(\frac{1}{m^2} - \frac{1}{n^2}\right) \quad \text{for emission,} \tag{2.66}$$
$$\nu = cR\left(\frac{1}{n^2} - \frac{1}{m^2}\right) \quad \text{for absorption.}$$
It is also implied that if the atom is in the ground state, i.e., the state with the
lowest energy, n = 1, it continues to remain in that state unless an external
radiation of a suitable frequency is incident on it. If the final state has n = 1, 2, 3, 4,
one gets Lyman, Balmer, Paschen, Brackett series, respectively (Fig. 2.7), of
the emission spectra and the corresponding absorption spectra for n = 1, 2, 3, 4
in the initial state. Thus, the Bohr model provides a quantitatively satisfactory
explanation of the spectrum of the hydrogen atom.
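The series named above follow from Eq. (2.66) once a lower level is fixed. The sketch below tabulates the first line and the series limit of each series, assuming the value of R_H from Eq. (2.65) and the vacuum relation ν = c/λ; the helper names are illustrative.

```python
R_H = 1.09678e7  # m^-1, value quoted in Eq. (2.65)

# Lower level m of the emission series named in the text.
SERIES = {1: "Lyman", 2: "Balmer", 3: "Paschen", 4: "Brackett"}


def emission_wavelength(n_upper, m_lower, rydberg=R_H):
    """Vacuum wavelength in metres for the transition n_upper -> m_lower.

    Uses Eq. (2.66) with nu = c / lambda, so 1/lambda = R (1/m^2 - 1/n^2).
    """
    if n_upper <= m_lower:
        raise ValueError("emission requires n_upper > m_lower")
    return 1.0 / (rydberg * (1.0 / m_lower**2 - 1.0 / n_upper**2))


def series_limit(m_lower, rydberg=R_H):
    """Shortest wavelength of a series (n_upper -> infinity)."""
    return m_lower**2 / rydberg


for m, name in SERIES.items():
    first = emission_wavelength(m + 1, m) * 1e9   # nm
    limit = series_limit(m) * 1e9                 # nm
    print(f"{name:9s} first line {first:8.2f} nm   limit {limit:8.2f} nm")
```

Only the Balmer series has lines in the visible range, which is consistent with its being the first to be identified.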

Fig. 2.7 Energy levels of the hydrogen atom, in the Bohr model.
The result in Eq. (2.63) can also be applied to other one-electron atoms
such as deuterium, the singly ionized helium atom $\mathrm{He}^+$ or the doubly ionized lithium
atom $\mathrm{Li}^{++}$, etc. For deuterium, only the reduced mass is slightly larger than
for the hydrogen atom and the energies (being negative) are slightly lower. The
existence of the corresponding lines can be used for the detection of the presence
of deuterium. For $\mathrm{He}^+$, $\mathrm{Li}^{++}$, etc. the energy levels differ by an additional factor
of $Z^2$. However, since Z is an integer, some of their energy levels will be close to
those of the hydrogen atom, the small differences being due to the different
reduced masses. For example, the energy levels of $\mathrm{He}^+$ for even n, n = 2p, are
related to the hydrogen energy levels by
$$E_{2p}(\mathrm{He}^+) = \frac{m_r(\mathrm{He}^+)}{m_r(\mathrm{H})}\, E_p(\mathrm{H}) \tag{2.67}$$
The ratio of the reduced masses is approximately
$$\frac{m_r(\mathrm{He}^+)}{m_r(\mathrm{H})} \approx 1 + m_e\left(\frac{1}{m(\mathrm{H})} - \frac{1}{m(\mathrm{He}^+)}\right) \tag{2.68}$$
i.e., about 1.000408. Therefore, the transitions between these $\mathrm{He}^+$ levels
correspond to frequencies which are slightly higher than those of the hydrogen
atom. Indeed, measurements of these small differences give a fairly accurate
determination of the ratio $m_e/m(\mathrm{H})$.
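The size of the correction in Eq. (2.68) can be verified numerically. The sketch below compares the exact ratio of reduced masses with the first-order approximation; the nuclear masses, expressed in units of the electron mass, are standard values assumed here and are not quoted in the text.

```python
# Nuclear masses in units of the electron mass (assumed standard values).
M_PROTON = 1836.15267   # proton
M_ALPHA = 7294.29954    # helium-4 nucleus


def reduced_mass(m_nucleus, m_e=1.0):
    """Reduced mass, Eq. (2.57), in units of the electron mass."""
    return m_e * m_nucleus / (m_e + m_nucleus)


# Exact ratio m_r(He+) / m_r(H).
exact_ratio = reduced_mass(M_ALPHA) / reduced_mass(M_PROTON)

# First-order approximation of Eq. (2.68), with m_e = 1.
approx_ratio = 1.0 + (1.0 / M_PROTON - 1.0 / M_ALPHA)

print(f"exact  ratio = {exact_ratio:.7f}")
print(f"approx ratio = {approx_ratio:.7f}")
```

The two numbers differ only in the seventh decimal place, so the approximation in Eq. (2.68) is adequate at the accuracy quoted in the text.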
Bohr’s ideas can be extended to noncircular orbits also. This leads to the
conclusion (see Example 7) that the angular momentum does not uniquely
determine the energy of the atom, and that there are several angular momentum
states which correspond to the same energy. This is an example of what is
known as the degeneracy of an energy level. However, inclusion of the relativistic
corrections shows that these different angular momentum states have slightly
different energies. This results in the multiplicity of the corresponding spectral
lines. Such a fine structure of the lines (the structure is narrower for larger n
values), is indeed observed experimentally, but the quantitative predictions of
the simple model are not in agreement with the experimental observations.
The Bohr theory of the atom is essentially a theory of single-electron atoms.
It does not allow a simple generalization to many-electron atoms, not even to
helium, and being ad hoc, it is not logically consistent. But its picture of an atom
with quantized orbits for the electrons, has retained its utility till today, especially
for qualitative arguments.
The existence of discrete atomic energy levels can be observed
experimentally from an analysis of collisions between atoms and electrons with
known energy. In these collisions, since the mass of the atom is much larger
than that of an electron, very little energy is carried away as kinetic energy of
the atom. However, if the energy of the electron is sufficient to raise a bound
electron to a higher energy orbit, the electron may transfer most of its energy to
the atom. This phenomenon was demonstrated by Franck and Hertz (1914).
Electrons from a filament are gradually accelerated through a vapour in a tube
[Fig. 2.8 (a)], towards a grid G and are subjected to a small retarding potential
V0 between the grid and the plate P. When the accelerating potential is sufficiently
large to excite an atom, the electron may undergo a collision near G and transfer
most of its energy to the atom.

Fig. 2.8 (a) A schematic diagram of the Franck-Hertz experiment. (b) Variation of current I (A) with voltage V in volts, for mercury vapour.

It will then be unable to reach the plate P. Thus, as the accelerating potential
V is raised from zero, the current arriving at P will increase. When it is just
greater than the excitation potential for the atoms, there is a sharp decrease in
the current [Fig. 2.8 (b)]. It begins to increase again till the next energy level
can be excited, and falls again. That the atoms are indeed excited is confirmed
by the appearance of the corresponding spectral lines in the radiation emitted as
the electrons fall back to the lower energy levels.

2.8 EXAMPLES
A few examples that provide some details and extensions of the ideas discussed
are now given.

Example 1
For obtaining the expression for the energy radiated by a unit area per unit time,
given in Eq. (2.13), it is noted that the energy crossing a unit area, per unit time,
per unit frequency is
$$\frac{dU}{d\nu} = \int v_z\, dE \tag{2.69}$$
where the direction perpendicular to the area is taken as the z direction. Now
$dE = \tfrac{1}{2}\, u(\nu)\, d\cos\theta$, where $u(\nu)$ is the energy density, $v_z$ is $c\cos\theta$, and the
range of integration is from θ = 0 to $\tfrac{1}{2}\pi$, so that
$$\frac{dU}{d\nu} = \frac{1}{2}\, c\, u(\nu)\int_0^1 \cos\theta\; d\cos\theta = \frac{1}{4}\, c\, u(\nu) \tag{2.70}$$

Example 2
The position of the maximum in Eq. (2.17) is determined numerically by iteration.
Equating the derivative of u(λ) to zero, one gets the condition
$$x = 5\,(1 - e^{-x}) \tag{2.71}$$
where x = hc/λkT. Inspection suggests that the solution to this is close to x ≈ 5.
The first iteration gives
$$x \approx 5\,(1 - e^{-5}) = 4.9663 \tag{2.72}$$
while the second iteration gives
$$x \approx 5\,(1 - e^{-4.9663}) = 4.96516 \tag{2.73}$$
which agrees with Eq. (2.18).
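The iteration described above is a plain fixed-point scheme and can be sketched in a few lines. The starting guess x = 5 is the one suggested in the text; the number of steps and the function name are arbitrary choices made here for illustration.

```python
import math


def wien_iterates(x0=5.0, steps=10):
    """Iterate x -> 5 (1 - exp(-x)), Eq. (2.71), starting from x0.

    Returns the list [x0, x1, ..., x_steps].
    """
    xs = [x0]
    for _ in range(steps):
        xs.append(5.0 * (1.0 - math.exp(-xs[-1])))
    return xs


iterates = wien_iterates()
for k, x in enumerate(iterates[:4]):
    print(f"iteration {k}: x = {x:.6f}")
print(f"converged:   x = {iterates[-1]:.6f}")
```

The scheme converges quickly because the derivative of the right-hand side, $5e^{-x}$, is only about 0.035 near the solution, so each step reduces the error by roughly that factor.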

Example 3
An experiment on the photoelectric effect of a metal gives stopping potentials
of 4.62 V for λ = 1850 Å, and 0.18 V for λ = 5460 Å. These results can be used
to calculate Planck's constant and the work function of the metal. From
Einstein's relation in Eq. (2.23),
$$\frac{hc}{\lambda} = e\phi + eV_0 \tag{2.74}$$
which, on using the experimental values, leads to two linear equations in h and φ.
Solving them, we get $h = 6.64\times 10^{-34}$ J s and φ = 2.1 eV.
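The two linear equations can be solved explicitly by subtracting one from the other. The following is a minimal sketch; the values of e and c are standard constants assumed here, and the function name is illustrative.

```python
E_CHARGE = 1.602176634e-19   # elementary charge, C (assumed value)
C = 299792458.0              # speed of light, m/s (assumed value)
ANGSTROM = 1.0e-10           # m


def planck_and_work_function(lam1, v1, lam2, v2):
    """Solve hc/lambda = e*phi + e*V0, Eq. (2.74), for two (lambda, V0) pairs.

    Returns (h in J s, work function in eV).
    """
    # Subtracting the two equations eliminates the work function.
    h = E_CHARGE * (v1 - v2) / (C * (1.0 / lam1 - 1.0 / lam2))
    # Substitute back into the first equation; dividing by e gives eV.
    phi_eV = h * C / (E_CHARGE * lam1) - v1
    return h, phi_eV


h_est, phi_est = planck_and_work_function(
    1850 * ANGSTROM, 4.62, 5460 * ANGSTROM, 0.18
)
print(f"h   = {h_est:.3e} J s")
print(f"phi = {phi_est:.2f} eV")
```

Because h is obtained from a difference of two measurements, its accuracy is limited mainly by the precision of the stopping potentials.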

Example 4
The wavelength of x-rays scattered by bound electrons (Sec. 2.3) has a spread,
mainly due to the fact that the bound electron has a momentum distribution. For
estimating the correction due to the non-zero initial momentum, it is noted that
the binding energy of the electrons is usually quite small, about 10 eV, compared
to the x-ray energies of about 10 keV.
The momentum and energy conservation relations give

$$(\mathbf{p}_f - \mathbf{p}_i)^2 = \frac{h^2}{c^2}\left(\nu_0^2 + \nu^2 - 2\nu_0\nu\cos\theta\right) \tag{2.75}$$
$$h\nu_0 + mc^2 - \Delta E = h\nu + \left(p_f^2c^2 + m^2c^4\right)^{1/2} \tag{2.76}$$


Here pi and pf are the initial and final momenta of the electron, and ∆E is
the binding energy. Proceeding as in Sec. 2.3
$$\lambda - \lambda_0 = \frac{h}{mc}\,(1 - \cos\theta) + \frac{\lambda_0\lambda}{hmc}\left(\mathbf{p}_f\cdot\mathbf{p}_i - \frac{1}{2}\,p_i^2 + m\,\Delta E\right) \tag{2.77}$$

where ∆E has been neglected compared to $mc^2$. For x-ray energies of a few
tens of keV and binding energies of the order of a few eV, the second term on
the right hand side is smaller than the first term. Because of the variation of pi,
this term gives rise to a spread in the frequency of the scattered beam
[see Fig. 2.3 (a)].

Example 5
In the Davisson-Germer experiment, the electron beam was incident normally on
the surface AB (see Fig. 2.9). The condition for coherent, maximum reflection
is
d′ sin θ = mλ (2.78)
where m is a positive integer. It can be shown that the Bragg condition reduces
to this condition.
For Bragg reflection, the incident and reflected beams make equal angles
with the reflection plane AC, the angles being (π – θ)/2. The Bragg condition
for coherent reflection is

$$2d\,\sin\tfrac{1}{2}(\pi - \theta) = n\lambda \tag{2.79}$$

Fig. 2.9 Relation between the Davisson-Germer condition and the Bragg condition.

But d = d′ sin (θ/2), so that the Bragg condition becomes
2d′ cos (θ/2) sin (θ/2) = nλ (2.80)
which is the same as Eq. (2.78) if n = m.

Example 6
The orbit for the motion of a particle in a Coulomb potential can be derived by
considering the change in the momentum of the particle.
For a particle with impact parameter b (see Fig. 2.6) and scattered at an
angle θ, the change in the momentum is

$$\Delta p = 2mv\cos\tfrac{1}{2}(\pi - \theta) = \int_{-\infty}^{\infty} F_p\, dt = \int_{-\infty}^{\infty} F\cos\phi\; dt \tag{2.81}$$
where $F_p$ is the component of the force parallel to ∆p and φ is the angle between
the position vector r and ∆p (see Fig. 2.6). But
$$F = \frac{Ze\,(2e)}{4\pi\varepsilon_0 r^2} \tag{2.82}$$
and
$$mvb = mr^2\dot{\phi} \tag{2.83}$$
which follows from the conservation of angular momentum.
Therefore
$$2mv\sin(\theta/2) = \frac{Ze^2}{\pi\varepsilon_0 vb}\cos(\theta/2) \tag{2.84}$$
which leads to
$$b = \frac{Ze^2}{2\pi\varepsilon_0 mv^2}\cot(\theta/2) \tag{2.85}$$
used in Eq. (2.52).

Example 7
The Bohr model was generalized by Sommerfeld (1916) to include non-circular
elliptic orbits. The generalized quantum conditions are
$$\oint p_\perp\, r\, d\phi = nh, \quad n \text{ is an integer} \tag{2.86}$$
$$\oint p_r\, dr = kh, \quad k \text{ is an integer} \tag{2.87}$$
where $p_\perp$ and $p_r$ are components of momentum perpendicular and parallel to the
radius, respectively, and the integrals are over the complete orbit. The first
condition reduces to the Bohr relation,
$$L = n\hbar \tag{2.88}$$
The total energy is
$$E = \frac{1}{2m_r}\,p_r^2 + \frac{L^2}{2m_r r^2} - \frac{Ze^2}{4\pi\varepsilon_0 r} \tag{2.89}$$
Using Eqs. (2.89) and (2.88), the quantization condition in Eq. (2.87) becomes
$$2\int_{r_{\min}}^{r_{\max}} dr\left(2m_r E - \frac{n^2\hbar^2}{r^2} + \frac{m_r Ze^2}{2\pi\varepsilon_0 r}\right)^{1/2} = kh \tag{2.90}$$
The integral can be evaluated using the theory of complex variables and
leads to
$$\frac{Ze^2}{2\varepsilon_0}\left(-\frac{m_r}{2E}\right)^{1/2} - nh = kh \tag{2.91}$$
so that
$$E = -\frac{m_r}{2\hbar^2}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)^2\frac{1}{(k+n)^2}, \qquad (k+n) \ge 1 \tag{2.92}$$
Thus, for a given value of the principal quantum number (k + n), states with
the same energy exist for k = 0, ..., k + n – 1. We therefore have what is called
a degeneracy of order k + n (see Sec. 3.4).

Example 8
Suppose that, in addition to the Coulomb attraction, there is a potential energy term
$g/r^2$. Then the expression for the total energy E is
$$E = \frac{1}{2m_r}\,p_r^2 + \frac{L^2}{2m_r r^2} + \frac{g}{r^2} - \frac{Ze^2}{4\pi\varepsilon_0 r} \tag{2.93}$$
Since $L^2 = n^2\hbar^2$, this is equivalent to replacing $n\hbar$ by $(n^2\hbar^2 + 2m_r g)^{1/2}$.
Therefore, the expression for the quantized energy is
$$E = -\frac{m_r}{2\hbar^2}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)^2\frac{1}{\left(k + (n^2 + 2m_r g/\hbar^2)^{1/2}\right)^2} \tag{2.94}$$
The energy levels are now different for a given k + n but different k or n
values. Thus, the degeneracy due to different angular momentum states is
removed by the addition of the potential $g/r^2$. Indeed, the Coulomb degeneracy
is removed by any additional interaction.

Example 9
A quick estimation of the binding energies of the hydrogen atom is obtained by
the following simple argument.
A stationary state may be thought of as one for which an integral number of
de Broglie wavelengths can be fitted over the orbit, e.g. for circular orbits
2πr = n (h/p), n = 1, 2, ... (2.95)
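Using this relation, the total energy is
$$E = \frac{1}{2m_r}\,p^2 - \frac{Ze^2}{4\pi\varepsilon_0 r} = \frac{n^2\hbar^2}{2m_r r^2} - \frac{Ze^2}{4\pi\varepsilon_0 r} \tag{2.96}$$
If the state is stable, it corresponds to a minimum of this energy:
$$\frac{dE}{dr} = -\frac{n^2\hbar^2}{m_r r^3} + \frac{Ze^2}{4\pi\varepsilon_0 r^2} = 0 \tag{2.97}$$
so that
$$r_{\min} = \frac{n^2\hbar^2}{m_r}\left(\frac{4\pi\varepsilon_0}{Ze^2}\right) \tag{2.98}$$
This leads to the energy levels
$$E_n = -\frac{m_r}{2\hbar^2 n^2}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)^2 \tag{2.99}$$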
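The minimization in Eqs. (2.96) to (2.99) can also be carried out numerically. The sketch below works in atomic units, an assumption made here for convenience ($\hbar = m_r = e^2/4\pi\varepsilon_0 = 1$), so that the expected results are $r_{\min} = n^2/Z$ and $E_n = -Z^2/2n^2$. The golden-section search is simply one convenient minimizer for a function with a single minimum; it is not a method used in the text.

```python
def energy(r, n, z=1.0):
    """E(r) of Eq. (2.96) in atomic units (hbar = m_r = e^2/(4 pi eps0) = 1)."""
    return n * n / (2.0 * r * r) - z / r


def golden_minimum(f, lo, hi, tol=1e-10):
    """Locate the minimum of a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (5.0 ** 0.5 - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)   # interior point nearer to a
        d = a + inv_phi * (b - a)   # interior point nearer to b
        if f(c) < f(d):
            b = d                   # minimum lies in [a, d]
        else:
            a = c                   # minimum lies in [c, b]
    return 0.5 * (a + b)


results = {}
for n in (1, 2, 3):
    r_min = golden_minimum(lambda r: energy(r, n), 0.05, 50.0)
    results[n] = (r_min, energy(r_min, n))
    print(f"n = {n}:  r_min = {r_min:.5f}   E_n = {results[n][1]:.6f}")
```

The numerical minima should reproduce the $n^2$ scaling of the radius and the $1/n^2$ scaling of the energy obtained analytically above.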

PROBLEMS
1. If the continuum spectrum of the sun approximates that of a black body,
peaking at λm ≈ 5000 Å, what is the surface temperature of the sun? What
can you infer about the temperature of the material surrounding the sun
from the observation of Balmer absorption lines in the spectrum?
2. A spherical object of radius 10 cm and whose surface approximates that
of a black body is maintained at a temperature of 1000 K. What is the
energy radiated by the body per second? Estimate the energy radiated in
the visible range.
3. Show that the number of photons per unit volume of a black body cavity,
per unit frequency, is
$$N(\nu) = \frac{8\pi\nu^2}{c^3\left(e^{h\nu/kT} - 1\right)}.$$
4. A small 10 W source of ultraviolet light of wavelength 1000 Å is held at a
distance of 0.1 m from a metal surface. If the radius of an atom is
approximately 0.5 Å, how many photons strike an atom per minute? If the
efficiency, i.e., the fraction of the photons that succeed in knocking out
electrons, is 1%, how many electrons are ejected from a unit area per second?
5. In a photoionization experiment, let ε be the binding energy and let a
photon with frequency ν be incident. Assuming that the recoil energy of
the atom is small, show that the electron comes out with a momentum
approximately equal to $[2m(h\nu - \varepsilon)]^{1/2}$. Obtain an expression for the recoil
energy of the atom when the electron comes out at an angle φ with
respect to the momentum of the photon.
6. Show that the recoil electron in Compton scattering always comes out in
the forward hemisphere, i.e., the recoil angle is less than or equal to 90°.
When is the angle between the scattered photon and the recoil electron a
maximum? What is the corresponding recoil energy of the electron? Assume
$h\nu_0 < mc^2$ for the second part.
7. An x-ray of wavelength λ = 0.1 Å is scattered by an electron. What are
the maximum and minimum wavelengths of the scattered photon? At what
angle will the scattered photon have a wavelength of 0.11 Å?
8. An electron has a de Broglie wavelength equal to that of a photon. Compare
its kinetic energy with the energy of the photon. What is the limiting
value of this ratio for low energy photons? If the photon has an energy of
100 keV, what is the kinetic energy of the electron?
9. What is the de Broglie wavelength of neutrons at room temperature? Can
they be used to study crystal structure?
10. A beam of 0.25 eV neutrons incident on a nickel surface is found to produce
a maximum at an angle of 70° with the initial beam direction, but no other
maxima are produced for angles greater than 70°. Determine the order of
the maximum and the interplanar distance for the crystal.
11. When an electron enters a crystal, it is accelerated towards the interior
because of an inner potential due to the positive charge of the ions in the
crystal. For an electron of speed u, and an inner potential $V_i$ volts, show
that the bending of the electron beam on entering the crystal is described
by a refractive index $\mu = (1 + 2eV_i/mu^2)^{1/2}$. Show also that, in this case, the
Bragg relation is modified to read $2d\,(\mu^2 - \cos^2\theta)^{1/2} = n\lambda$.
12. Assume that a particle can be confined to a spherical volume only if its
circular orbit can be fitted with an integral multiple of de Broglie
wavelengths. Estimate the minimum kinetic energy of a proton confined to a
nucleus of diameter $10^{-14}$ m. What would be the kinetic energy of an
electron similarly confined?
13. Use the arguments of Example 9 to obtain a rough estimation of the energy
levels of a 3-dimensional harmonic oscillator.
14. Calculate the distance of closest approach of a 5 MeV α particle in a
head-on collision with a gold nucleus. What is the upper bound of the
α-particle energy for which the Rutherford formula is expected to be
valid? Take the radius of the gold nucleus to be about $7\times 10^{-15}$ m and that
of the α-particle to be about $2\times 10^{-15}$ m.
15. Show that the fraction of the incident α particles scattered through an
angle between $\theta_1$ and $\theta_2$ is given by
$$\pi\rho t\left(\frac{2Ze^2}{4\pi\varepsilon_0 mv^2}\right)^2\left[\operatorname{cosec}^2(\theta_1/2) - \operatorname{cosec}^2(\theta_2/2)\right]$$
where the terms are defined in Sec. 2.6.
16. The value of R in Eq. (2.63) is found to be $1.0967758\times 10^{7}\ \mathrm{m}^{-1}$ for
hydrogen and $1.0972227\times 10^{7}\ \mathrm{m}^{-1}$ for $^4\mathrm{He}^+$. From the value of $m_{\mathrm{H}}/m_{\mathrm{He}}$
≈ 0.2517, estimate the value of $m_e/m_{\mathrm{H}}$.
17. One of the spectral lines of hydrogen with wavelength 4861.320 Å is
accompanied by another line of wavelength 4859.975 Å. Assuming that
this is due to the presence of the heavier isotope deuterium, obtain the
ratio of the deuterium mass to the proton mass.
18. Apply the Bohr quantization condition to obtain the energy levels of
(i) a 3-dimensional harmonic oscillator, (ii) a particle in a potential
$V(r) = -g/r^s$, s > 0.
19. A charged particle, moving in the presence of a magnetic field in the
z-direction, has circular orbits in the xy plane. Apply Bohr’s quantization
condition to obtain the energy levels of the particle.
20. Use the Sommerfeld quantization condition $\oint p_x\, dx = nh$ to obtain the energy
levels of a one-dimensional harmonic oscillator.
21. A free atom undergoes a transition from an energy level $E_1$ to $E_2$ by
absorbing a photon. Show that the frequency of the photon absorbed is
$$\nu = \frac{E_2 - E_1}{h}\left(1 + \frac{E_2 - E_1}{2m_1c^2}\right)$$
where $m_1$ is the mass of the initial atom. The recoil correction given by the
second term is of the order of $10^{-8}$, or smaller, for the hydrogen atom. In the
Mössbauer effect, the atom is embedded in a crystal so that its effective
mass is that of the crystal. As a result, the second term is negligible and
one has recoil-less absorption of photons.
22. In a Franck-Hertz experiment, hydrogen atoms are bombarded with
electrons. What are the wavelengths of the emission lines observed when
the electrons are accelerated through a potential difference of 12.5V?
23. Assuming that the earth is a black body in equilibrium at a temperature of
300 K, estimate the temperature of the sun.
24. Suppose that man’s power production reaches 20% of the power received
from sunlight (this would happen in about 250 years if the present exponential
growth continues). What is the expected approximate increase in the surface
temperature of the earth?
3
Elements of Quantum Theory

Structure of the Chapter
3.1 A thought experiment
3.2 The wave function
3.3 Postulates of quantum mechanics
3.4 Some properties of observables and wave functions
3.5 Free particle
3.6 Wave packet and the uncertainty principle
3.7 A step potential
3.8 Particle in a box
3.9 Simple harmonic oscillator
3.10 Small perturbations
3.11 Angular momentum
3.12 Examples
Problems

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_3

In Chapter 2, we discussed some experiments which indicate that radiation
should be regarded as being made up of zero-mass particles called photons,
each of which carries a quantum of energy hν and momentum $(h\nu/c)\,\mathbf{n}$. It was
also observed that particles with nonzero mass, such as the electron, have wave
properties associated with them, and produce diffraction patterns similar to those
produced by radiation. Thus, a situation emerges in which radiation and matter
exhibit both particle properties, i.e., they carry quanta of energy and
momentum, and wave properties, i.e., they produce diffraction patterns.
Here, the basic laws governing the wave-particle behaviour of matter will
be analysed and applied to some simple cases. In deducing the laws, we will
borrow heavily from the wave properties of radiation and particle properties of
matter particles, with which one is familiar.

3.1 A THOUGHT EXPERIMENT


Consider a thought experiment in which a beam of particles is incident on two
closely-spaced narrow slits $S_1$ and $S_2$, as shown in Fig. 3.1. These slits act as
two coherent sources and produce interference fringes on a screen S placed at
a distance so far away that the separation between the slits is negligible compared
to their distance from the screen. This experiment is Young's double slit experiment
for radiation, but here it is reanalysed in terms of the particles which constitute
the beam.
The intensity variation on the screen is given by the familiar interference
fringes. The interpretation of the interference fringes in terms of the particles is
that the intensity is proportional to the number of particles arriving at various
points, with no particles coming to regions with zero intensity. It is to be noted
that the interference fringes are not due to any interaction between the particles
coming from different slits. Indeed, the fringe pattern is independent of the
intensity of the initial beam, so that if the particles come only one at a time, they
will still avoid the dark spots and the frequency of their arrival is proportional to
the intensity (diffraction experiments with electrons coming essentially one at a
time have demonstrated this). If on the other hand, one of the slits is closed, the
fringes disappear, and one has more or less a uniform intensity. The question
that comes up is, how do the particles that are coming one at a time, know the
existence of both the slits, which persuades them to come more frequently at
the bright spots and avoid the dark spots on the screen.
The wave theory explanation of the interference pattern is that the amplitude
of the wave at any point on the screen is a superposition of the amplitude of two
coherent waves coming from $S_1$ and $S_2$, and the intensity at a point on the
screen is given by the modulus squared of the resultant amplitude:
$$I = |\psi(S_1) + \psi(S_2)|^2 \tag{3.1}$$
$$\phantom{I} = \left|A e^{2\pi i(\nu t - l_1/\lambda)} + A e^{2\pi i(\nu t - l_2/\lambda)}\right|^2 \tag{3.2}$$
where ν is the frequency, λ is the wavelength, and $l_1$ and $l_2$ are the distances of
the point on the screen from the two slits. Therefore, the intensity of the
interference pattern at any point is given by
$$I = 4|A|^2\cos^2\left(\frac{\pi a x}{\lambda d}\right) \tag{3.3}$$
where a, x and d are as shown in Fig. 3.1 and the path difference is approximately
(ax/d). If either of the slits is closed, the interference fringes disappear and a
uniform intensity distribution of $I = |A|^2$ results.
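The step from Eq. (3.2) to Eq. (3.3) can be checked numerically by adding the two complex amplitudes with exact path lengths and comparing with the cosine-squared formula. This is a minimal sketch; the wavelength, slit separation and screen distance are illustrative values chosen here, not figures from the text.

```python
import cmath
import math

# Illustrative geometry (assumed values, not taken from the text).
WAVELENGTH = 5.0e-7   # lambda, m
SLIT_SEP = 1.0e-4     # a, m
SCREEN_DIST = 1.0     # d, m
AMPLITUDE = 1.0       # A


def intensity_superposition(x):
    """|psi(S1) + psi(S2)|^2 of Eq. (3.2), using exact path lengths l1 and l2.

    The common factor exp(2 pi i nu t) has unit modulus and is omitted.
    """
    l1 = math.hypot(SCREEN_DIST, x - SLIT_SEP / 2.0)
    l2 = math.hypot(SCREEN_DIST, x + SLIT_SEP / 2.0)
    psi = (AMPLITUDE * cmath.exp(-2j * math.pi * l1 / WAVELENGTH)
           + AMPLITUDE * cmath.exp(-2j * math.pi * l2 / WAVELENGTH))
    return abs(psi) ** 2


def intensity_formula(x):
    """Eq. (3.3), which assumes a path difference of a*x/d."""
    phase = math.pi * SLIT_SEP * x / (WAVELENGTH * SCREEN_DIST)
    return 4.0 * AMPLITUDE ** 2 * math.cos(phase) ** 2


# Sample points across two fringe spacings (lambda*d/a = 5 mm here).
positions = [i * 2.5e-4 for i in range(-20, 21)]
max_diff = max(abs(intensity_superposition(x) - intensity_formula(x))
               for x in positions)
print(f"largest difference over the sampled points: {max_diff:.2e}")
```

For this geometry the two expressions agree closely, the small residual coming from the approximation of the path difference by ax/d.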

Fig. 3.1 Two slit interference experiment for particles.

3.2 THE WAVE FUNCTION


The observation of the interference fringes, even when particles are coming
one at a time, forces us to associate a wave function ψ with the particle. A
superposition of allowed wave functions is also a possible wave function of the
particle, e.g.
ψ (S1 + S2) = ψ (S1) + ψ (S2) (3.4)
where ψ (S1), ψ (S2) and ψ (S1 + S2) are the wave functions when slits S1, S2
and S1 + S2 are open, respectively. The frequency of arrival at the screen is then
explained if it is postulated that the probability for a particle to be found in a
volume dV is proportional to $|\psi|^2\, dV$. Implicit in this association of a wave
function with the particle is the implication that the position of the particle cannot
be predicted precisely in a single observation. Only a probabilistic prediction
can be made that if the experiment is repeated a large number of times, the
frequency of a particle being found in different regions is proportional to the
modulus squared of the wave function associated with each particle. The inability, in general,
to predict the results of individual events is inherent in quantum mechanics and
is called the principle of indeterminacy.
The main problem in quantum mechanics is the determination of the wave
function. For obtaining the rules which govern the determination of the wave
function, we again invoke our experience with wave functions which describe
radiation and therefore, photons. It was shown in Chapter 1 that the scalar and
vector fields of electromagnetic radiation in a region where there are no sources,
satisfy (see Eq. 1.67) equations of the type
$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\psi = 0 \tag{3.5}$$
where ψ stands for one of the fields. Such an equation is also satisfied by the
electric and magnetic fields as can be seen by applying the curl operator to
equations (1.60c) and (1.60d) and using the other Maxwell equations in
Eq. (1.60). Furthermore, the plane-wave solutions to this equation are of the
form
$$\psi(\mathbf{r}, t) = A e^{-2\pi i(\nu t - \mathbf{k}\cdot\mathbf{r})} \tag{3.6}$$
where ν is the frequency and $\mathbf{k} = (\nu/c)\,\mathbf{n}$, with n being a unit vector in the direction
of propagation. It is noted that hν is the energy of the photon and hk is its
momentum. Therefore, the wave function of the photon is of the form
$$\psi(\mathbf{r}, t) = A e^{-i(Et - \mathbf{p}\cdot\mathbf{r})/\hbar} \tag{3.7}$$
Actually, for radiation, the polarization vector also has to be considered,
which for the present discussion is not relevant. Substituting this function in
Eq. (3.5),
$$\left[\frac{1}{c^2}\frac{E^2}{\hbar^2} - \frac{p^2}{\hbar^2}\right]\psi = 0 \tag{3.8}$$
which is obviously true since for a photon (mass = 0), $E^2 - p^2c^2 = 0$ (using the
notation $\mathbf{p}\cdot\mathbf{p} = \mathbf{p}^2 = p^2$).
Suppose we knew first about the photon and wanted the equation which
would determine its wave function. Then one could start with the valid relation
$$\left[\frac{1}{c^2}\left(\frac{E}{\hbar}\right)^2 - \left(\frac{\mathbf{p}}{\hbar}\right)^2\right]\psi = 0 \tag{3.9}$$
It is now postulated that the wave equation would be obtained by the
substitution
$$E = i\hbar\frac{\partial}{\partial t} \tag{3.10}$$
$$\mathbf{p} = -i\hbar\nabla \tag{3.11}$$
in Eq. (3.9). The choice of the coefficients is dictated by the requirement that
the operation of these operators on ψ(r, t) in Eq. (3.6) should give hν and
$(h\nu/c)\,\mathbf{n}$, which are the energy and momentum of the photon. In any case, the relative
coefficients are determined by the requirement of relativistic covariance
since $\left(\mathbf{p}, \tfrac{1}{c}E\right)$ and $\left(-\nabla, \tfrac{1}{c}\tfrac{\partial}{\partial t}\right)$ both transform as a relativistic 4-vector
[see Eq. (1.48)].
The prescription for getting the wave equation is now clear. One starts with
the classical expression for the total energy in terms of momentum and position,
multiplies it by the wave function and converts it into a wave equation by the
substitutions in Eqs. (3.10) and (3.11). It is worth noting that the operator relation
in Eq. (3.10) may be used for energy, which either includes or does not include
the rest energy term, without changing the essential results since adding a constant
to energy in this case only redefines the zero of the energy.
For a nonrelativistic free particle,
$$\left(E - \frac{1}{2m}\,p^2\right)\psi = 0 \tag{3.12}$$
which with the substitutions in Eqs. (3.10) and (3.11) leads to
$$i\hbar\frac{\partial\psi(\mathbf{r}, t)}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}, t) \tag{3.13}$$
This is the celebrated Schrödinger equation for the free particle. It is equally
suggestive that for a particle in the presence of an interaction potential, the
relation is
$$\left(E - \frac{1}{2m}\,p^2 - V(\mathbf{r})\right)\psi(\mathbf{r}, t) = 0 \tag{3.14}$$
Substitution of Eqs. (3.10) and (3.11) then gives
$$i\hbar\frac{\partial\psi(\mathbf{r}, t)}{\partial t} = \left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\psi(\mathbf{r}, t) \tag{3.15}$$
which is the Schrödinger equation for a nonrelativistic particle in the presence
of an interaction potential.
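As a consistency check on the substitution rule, one can verify numerically that a plane wave of the form of Eq. (3.7) with $E = p^2/2m$ satisfies the free-particle equation (3.13). The sketch below is restricted to one space dimension and uses units with $\hbar = m = 1$; both are assumptions made here for brevity, and the derivatives are approximated by central finite differences.

```python
import cmath

HBAR = 1.0   # natural units, assumed for this sketch
MASS = 1.0


def plane_wave(x, t, p, energy):
    """One-dimensional version of Eq. (3.7) with A = 1."""
    return cmath.exp(-1j * (energy * t - p * x) / HBAR)


def schrodinger_residual(x, t, p, energy, h=1e-3):
    """|i hbar dpsi/dt + (hbar^2 / 2m) d2psi/dx2| by central differences."""
    dpsi_dt = (plane_wave(x, t + h, p, energy)
               - plane_wave(x, t - h, p, energy)) / (2.0 * h)
    d2psi_dx2 = (plane_wave(x + h, t, p, energy)
                 - 2.0 * plane_wave(x, t, p, energy)
                 + plane_wave(x - h, t, p, energy)) / (h * h)
    lhs = 1j * HBAR * dpsi_dt
    rhs = -(HBAR ** 2 / (2.0 * MASS)) * d2psi_dx2
    return abs(lhs - rhs)


p = 1.3
on_shell = schrodinger_residual(0.4, 0.7, p, energy=p * p / (2.0 * MASS))
off_shell = schrodinger_residual(0.4, 0.7, p, energy=p * p / MASS)
print(f"residual with E = p^2/2m : {on_shell:.2e}")
print(f"residual with E = p^2/m  : {off_shell:.2e}")
```

The residual is at the level of the finite-difference error only when the energy and momentum satisfy the nonrelativistic relation of Eq. (3.12); any other choice of E leaves a residual of order one.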
70 Elements of Modern Physics

3.3 POSTULATES OF QUANTUM MECHANICS


The discussion so far has been for the purpose of introducing the ideas of wave
function and wave equation, and making them plausible. These ideas are now
formalized and generalized in terms of some postulates of quantum mechanics.
Postulate 1: Every state of n particles is described by a wave function
$\psi(\mathbf{r}_i, t)$, i = 1, ..., n, where $\mathbf{r}_i$ are the coordinates of the n particles, such that the
probability at time t of finding the particles in respective volumes $d^3r_i$ about the
points $\mathbf{r}_i$ is
$$dP = |\psi(\mathbf{r}_i, t)|^2\, d^3r_1 \ldots d^3r_n \tag{3.16}$$
It is relevant to comment here that:
1. Since the particles must be found somewhere, the total probability must
be 1,
$$\int dP = \int |\psi(\mathbf{r}_i, t)|^2\, d^3r_1 \ldots d^3r_n = 1 \tag{3.17}$$
This defines the normalization of the wave function. It is also clear from
the definition of probability that the average value of a function $f(\mathbf{r}_i)$ is
$$\langle f(\mathbf{r}_i)\rangle = \int |\psi(\mathbf{r}_i, t)|^2 f(\mathbf{r}_i)\, d^3r_1 \ldots d^3r_n \tag{3.18}$$
2. For a single particle system, the results simplify to
$$dP = |\psi(\mathbf{r}_1, t)|^2\, d^3r_1$$
$$1 = \int |\psi(\mathbf{r}_1, t)|^2\, d^3r_1 \tag{3.19}$$
$$\langle f(\mathbf{r}_1)\rangle = \int |\psi(\mathbf{r}_1, t)|^2 f(\mathbf{r}_1)\, d^3r_1 \tag{3.20}$$
In particular, the average position is given by
$$\langle \mathbf{r}_1\rangle = \int |\psi(\mathbf{r}_1, t)|^2\, \mathbf{r}_1\, d^3r_1 \tag{3.21}$$
Postulate 2: The wave function $\psi(\mathbf{r}_i, t)$ satisfies the Schrödinger equation
$$i\hbar\frac{\partial\psi(\mathbf{r}_i, t)}{\partial t} = H(\mathbf{r}_i, -i\hbar\nabla_i)\,\psi(\mathbf{r}_i, t) \tag{3.22}$$
where H is the Hamiltonian or the energy operator obtained from the classical
expression for the total energy by replacing $\mathbf{p}_i$ by $-i\hbar\nabla_i$. For a single particle in
the presence of an interaction potential, the total energy is $E = \frac{1}{2m}\,p^2 + V(\mathbf{r})$,
which leads to the equation
$$i\hbar\frac{\partial\psi(\mathbf{r}, t)}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}, t) + V(\mathbf{r})\,\psi(\mathbf{r}, t) \tag{3.23}$$
This equation is valid if the system is nonrelativistic. It requires some
modifications if the particle has additional variables such as intrinsic angular
momentum called spin. If the system has a well-defined energy, its Schrödinger
equation reduces to its simpler time-independent form
$$-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}, t) + V(\mathbf{r})\,\psi(\mathbf{r}, t) = i\hbar\frac{\partial\psi(\mathbf{r}, t)}{\partial t} = E\,\psi(\mathbf{r}, t) \tag{3.24}$$
Postulates 1 and 2 allow us to deduce average values of general dynamical
variables. Equation (3.24) on being multiplied by ψ*(r, t) and integrated over the
entire space and on rearrangement of terms leads to
$$\int\psi^*(\mathbf{r}, t)\, V(\mathbf{r})\,\psi(\mathbf{r}, t)\, d^3r = \int\psi^*(\mathbf{r}, t)\left(i\hbar\frac{\partial}{\partial t}\right)\psi(\mathbf{r}, t)\, d^3r - \int\psi^*(\mathbf{r}, t)\left(-\frac{\hbar^2}{2m}\nabla^2\right)\psi(\mathbf{r}, t)\, d^3r \tag{3.25}$$
The term on the left hand side, as seen from Eq. (3.20), is the average
potential energy. It is therefore reasonable to identify the terms on the right
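hand side as the average total energy and the average kinetic energy:
$$\langle E\rangle = \int\psi^*(\mathbf{r}, t)\left(i\hbar\frac{\partial}{\partial t}\right)\psi(\mathbf{r}, t)\, d^3r \tag{3.26}$$
$$\left\langle\frac{1}{2m}\,p^2\right\rangle = \int\psi^*(\mathbf{r}, t)\left(-\frac{\hbar^2}{2m}\nabla^2\right)\psi(\mathbf{r}, t)\, d^3r \tag{3.27}$$
Now, since $\left(\mathbf{p}, \tfrac{E}{c}\right)$ transforms as a 4-vector, the relationships in Eqs. (3.10)
and (3.11) suggest that, in addition to Eq. (3.26)
$$\langle\mathbf{p}\rangle = \int\psi^*(\mathbf{r}, t)\,(-i\hbar\nabla)\,\psi(\mathbf{r}, t)\, d^3r \tag{3.28}$$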
These results are generalized in the following postulate.
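Postulate 3: The average values of E and the dynamical variable F (p, r) are
given by
$$ \langle E \rangle = \int \psi^*(\mathbf{r}, t) \left( i\hbar \frac{\partial}{\partial t} \right) \psi(\mathbf{r}, t)\, d^3r \tag{3.29} $$
and
$$ \langle F(\mathbf{p}, \mathbf{r}) \rangle = \int \psi^*(\mathbf{r}, t)\, F(-i\hbar\nabla, \mathbf{r})\, \psi(\mathbf{r}, t)\, d^3r \tag{3.30} $$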


In quantum mechanics, the choice of dynamical variables which are
dynamical observables is not obvious. A minimal requirement is imposed on the
dynamical observables that their average values must be real. It is easy to show
from Eq. (3.30) that both r and p have real average values and are acceptable
as dynamical observables. On the other hand, $x p_x$ is not an observable (its
average value is in general complex), though the angular momentum r × p can
be shown to have a real average value and hence is acceptable as a dynamical
observable.
It is often convenient to work with the Fourier transform of the wave function
rather than with the wave function itself. Writing ψ (r, t) as
$$ \psi(\mathbf{r}, t) = \frac{1}{h^{3/2}} \int f(\mathbf{k}, t)\, \exp(i\mathbf{k} \cdot \mathbf{r} / \hbar)\, d^3k \tag{3.31} $$
the inverse Fourier transform is
$$ f(\mathbf{k}, t) = \frac{1}{h^{3/2}} \int \psi(\mathbf{r}, t)\, \exp(-i\mathbf{k} \cdot \mathbf{r} / \hbar)\, d^3r \tag{3.32} $$
The Fourier transform f (k, t) is called the wave function in the momentum
space. It is easy to show that
$$ \int |\psi(\mathbf{r}, t)|^2 \, d^3r = \int |f(\mathbf{k}, t)|^2 \, d^3k \tag{3.33} $$
and that the average value of momentum given in Eq. (3.28) reduces to
$$ \langle \mathbf{p} \rangle = \int |f(\mathbf{k}, t)|^2 \, \mathbf{k} \, d^3k \tag{3.34} $$
which justifies the definition of f (k, t) as the wave function in the momentum
space.

3.4 SOME PROPERTIES OF OBSERVABLES AND WAVE FUNCTIONS
In this section, some important properties of quantum mechanical wave functions
and observables are described.
It was noted that the average values of dynamical observables must be
real. This requirement is satisfied if the operator A corresponding to the
observable (the operator is obtained from the appropriate classical variable by
the replacement of p by −iħ∇) satisfies the condition
∫ φ* Aψdτ = (∫ψ* Aφdτ)* (3.35)
where dτ represents a volume element. Taking φ = ψ gives the result that the
average value is real. An operator A which satisfies this condition is said to be
hermitian.
Hermitian operators have some special properties, which are mentioned
here. Consider an operator equation
Aφn = E n φ n (3.36)
where En is a constant. This is an example of what are called eigenvalue equations,
En being the eigenvalue of operator A and φn the corresponding eigenfunction.
It is natural to interpret this equation as implying that φn describes a state with a
well-defined value En for the observable corresponding to A. The eigenvalues
and the eigenstates of hermitian operators satisfy the following properties:
1. The eigenvalues of a hermitian operator are real. This follows from
Eq. (3.35), if φ and ψ are taken to be the same eigenstate.
2. Eigenstates with different eigenvalues are orthogonal in the following sense.
In Eq. (3.35), if φ and ψ are eigenstates φn and φm of A with eigenvalues
En and Em respectively, then
(En – Em) ∫φn* φm dτ = 0 (3.37)
or ∫ φn* φm dτ = 0 for En ≠ Em (3.38)
The states φn and φm are said to be orthogonal to each other. It is also
possible in the case of discrete eigenvalues to normalize the eigenstates
such that
∫ φn* φn dτ = 1 (3.39)
The states which satisfy the relations (3.38) and (3.39) are said to be
orthonormal. It may happen that more than one state has the
same eigenvalue. These states are said to be degenerate and
the number of degenerate states is known as the degree of degeneracy
of the eigenvalue. It is possible to normalize these states and to choose a
suitable, orthonormal set of degenerate states.
3. The eigenstates of a hermitian operator are complete, and form a complete
basis. This means that any state ψ can be expressed as a linear combination
of the eigenstates φn of a hermitian operator,

$$ \psi = \sum_n a_n \phi_n \tag{3.40} $$
The summation may include an integration over a set of states with
continuum eigenvalues.
4. Two operators A and B are said to commute if
[A, B] ≡ AB – BA = 0 (3.41)
where [A, B] is called the commutator of A and B. It is possible to choose,
as a basis, states which are simultaneous eigenstates of commuting hermitian
operators. A particularly important case is obtained if one of the operators
is the Hamiltonian (i.e. energy operator),
[H, B] = 0 (3.42)
Then the states can be chosen to be simultaneous eigenstates of H and B.
Since the eigenstates of a time-independent Hamiltonian do not change

with time (except for the phase factor), this means that the eigenvalues of
B for these states do not change with time, and therefore B is conserved.
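The properties listed above can be checked on a small example. The following is a minimal sketch, assuming a 2×2 Hermitian matrix as a finite-dimensional stand-in for the operator A and a discrete sum in place of the integral over dτ; the helper names `hermitian_2x2_eigensystem` and `inner`, and the numerical entries, are illustrative choices and not part of the text.

```python
import math

def hermitian_2x2_eigensystem(a, b, d):
    """Eigenpairs of the Hermitian matrix [[a, b], [conj(b), d]] (a, d real, b nonzero)."""
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)
    pairs = []
    for eigenvalue in (mean - radius, mean + radius):
        # The first row of (A - eigenvalue) v = 0 is solved by v = (b, eigenvalue - a).
        v0, v1 = complex(b), complex(eigenvalue - a)
        norm = math.sqrt(abs(v0) ** 2 + abs(v1) ** 2)
        pairs.append((eigenvalue, (v0 / norm, v1 / norm)))
    return pairs

def inner(u, v):
    """Discrete analogue of the overlap integral of u* v."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

# Illustrative matrix entries (assumed values).
(e1, phi1), (e2, phi2) = hermitian_2x2_eigensystem(1.0, 2 + 1j, -3.0)

# Property 3: expand an arbitrary state in the eigenbasis and rebuild it.
psi = (1.0 + 0j, 1j)
a1, a2 = inner(phi1, psi), inner(phi2, psi)
rebuilt = tuple(a1 * x + a2 * y for x, y in zip(phi1, phi2))

print(e1, e2)                  # property 1: both eigenvalues are real
print(abs(inner(phi1, phi2)))  # property 2: overlap of eigenvectors with different eigenvalues
print(rebuilt)                 # property 3: agrees with psi up to rounding
```

The eigenvalues come out real because the formula for them involves only a, d and |b|², which mirrors the argument from Eq. (3.35).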
The solutions to the Schrödinger equation in some simple situations are now
discussed.

3.5 FREE PARTICLE


A free particle is one on which there are no forces acting. The Schrödinger
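equation for the free particle is
$$ i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m} \nabla^2 \psi \tag{3.43} $$
Separating the variables, the solution to Eq. (3.43) can be written in the
form
$$ \psi(\mathbf{r}, t) = f(t)\, \phi(\mathbf{r}) \tag{3.44} $$
Substituting this in Eq. (3.43) and dividing the equation by ψ (r, t) gives
$$ i\hbar \frac{1}{f(t)} \frac{\partial f(t)}{\partial t} = -\frac{\hbar^2}{2m} \frac{1}{\phi(\mathbf{r})} \nabla^2 \phi(\mathbf{r}) \tag{3.45} $$
which can be satisfied only if both the sides are constant, say E. Then
$$ i\hbar \frac{\partial f(t)}{\partial t} = E f(t) \tag{3.46} $$
$$ -\frac{\hbar^2}{2m} \nabla^2 \phi(\mathbf{r}) = E \phi(\mathbf{r}) \tag{3.47} $$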
Equations (3.46) and (3.47) are eigenvalue equations for the energy, E being
the energy eigenvalue and f (t), φ (r) being the corresponding eigenfunctions.
They describe a state with a well-defined energy E.
The solution to Eq. (3.46) is
f (t) = exp (– iEt/ħ) (3.48)
except for an overall constant which will be included in φ(r). For solving
Eq. (3.47), once again a separable form is assumed for φ (r),
φ (r) = A (x) B (y) C(z) (3.49)
Substituting this in Eq. (3.47) and dividing by φ (r) gives
$$ -\frac{\hbar^2}{2m} \left[ \frac{1}{A(x)} \frac{d^2 A(x)}{dx^2} + \frac{1}{B(y)} \frac{d^2 B(y)}{dy^2} + \frac{1}{C(z)} \frac{d^2 C(z)}{dz^2} \right] = E \tag{3.50} $$
For this relation to be valid, each of the three terms in Eq. (3.50) should be
a constant. Introducing constants kx, ky, kz one gets
$$ \frac{d^2 A(x)}{dx^2} = -\frac{k_x^2}{\hbar^2} A(x) \tag{3.51} $$
$$ \frac{d^2 B(y)}{dy^2} = -\frac{k_y^2}{\hbar^2} B(y) \tag{3.52} $$
$$ \frac{d^2 C(z)}{dz^2} = -\frac{k_z^2}{\hbar^2} C(z) \tag{3.53} $$
with
$$ E = \frac{1}{2m} (k_x^2 + k_y^2 + k_z^2) \tag{3.54} $$
Solutions to these equations finally lead to
$$ \psi(\mathbf{r}, t) = \beta \exp\left[ -\frac{i}{\hbar} (Et - \mathbf{k} \cdot \mathbf{r}) \right] \tag{3.55} $$
with the constants E, k satisfying the condition in Eq. (3.54). The following
points should be noted about this solution:


1. Since the operators corresponding to energy and momentum, iħ ∂/∂t and
– iħ∇, operating on this solution give the wave function back but multiplied
by constants E and k respectively, the solutions describe a particle with
energy E and momentum k.
2. The solution is not normalizable [see Eq. (3.17)] since |ψ| = |β| and
∫ |ψ|2 dV = ∞. Nevertheless, it can be used for describing relative
probabilities, the probability of finding the particle anywhere being the
same. The wave function can be interpreted as describing a beam of
noninteracting particles with momentum k, and with |β|2 number of particles
per unit volume.
3. Since Eq. (3.43) is linear, any superposition of solutions in Eq. (3.55) is
also a solution of Eq. (3.43), i.e. the general solution can be written as
$$ \psi(\mathbf{r}, t) = \frac{1}{h^{3/2}} \int \exp\left[ -\frac{i}{\hbar} (Et - \mathbf{k} \cdot \mathbf{r}) \right] F(\mathbf{k})\, d^3k \tag{3.56} $$
with E given by Eq. (3.54).

3.6 WAVE PACKET AND THE UNCERTAINTY PRINCIPLE


A wave packet is a superposition, as in Eq. (3.56), of plane-wave solutions with
nearly the same momenta, so as to give a wave function which is localized in
space. Such a wave function may be written in the form
$$ \psi(\mathbf{r}, t) = \frac{1}{h^{3/2}} \int f(\mathbf{k} - \mathbf{k}_0) \exp\left[ -\frac{i}{\hbar} \left( \frac{k^2 t}{2m} - \mathbf{k} \cdot \mathbf{r} \right) \right] d^3k \tag{3.57} $$
where f (k – k0) is significantly nonzero only in a small region about k ≈ k0.
There are some general properties of the wave packet which are demonstrated
here by taking the Gaussian form for f (k – k0). For
$$ f(\mathbf{k} - \mathbf{k}_0) = \frac{1}{\pi^{3/4} (b\hbar)^{3/2}} \exp\left[ -\frac{(\mathbf{k} - \mathbf{k}_0)^2}{2\hbar^2 b^2} \right] \tag{3.58} $$
the wave packet ψ (r, t) is obtained from Eq. (3.57) by changing the variable of
integration to q = k – k0, and integrating
$$ \psi(\mathbf{r}, t) = \exp\left[ -\frac{i}{\hbar} \left( \frac{k_0^2}{2m} t - \mathbf{k}_0 \cdot \mathbf{r} \right) \right] u(\mathbf{r}, t) \tag{3.59} $$
$$ u(\mathbf{r}, t) = \frac{b^{3/2}}{\pi^{3/4} (1 + i\hbar t b^2 / m)^{3/2}} \exp\left[ -\frac{b^2}{2} \left( \mathbf{r} - \frac{\mathbf{k}_0}{m} t \right)^2 \Big/ \left( 1 + i\hbar t b^2 / m \right) \right] \tag{3.60} $$
Thus, the wave packet is a product of a plane wave with momentum k0,
and an envelope which is peaked at r = (k0/m) t. The phase moves with velocity
k0/2m, which is called the phase velocity, and the envelope moves with velocity
k0/m, which is called the group velocity. Since the envelope determines the
location of the particle, it is the group velocity which corresponds to the classical
velocity of the particle.
The wave packet brings out an important principle regarding the determination
of the position and momentum of a particle. It can be seen from Eq. (3.60) that
the wave packet at t = 0 is significantly nonzero only for |x| < 1/b, so the spread
in the x-component of position of the particle is
$$ \Delta x \approx \frac{1}{b} \tag{3.61} $$
Similarly, f (k – k0), which is essentially the wave function in the momentum
space, has nonzero values only for |(k – k0)x| < bħ, so that the spread in the
x-component of momentum of the particle is
$$ \Delta p_x \approx b\hbar \tag{3.62} $$
From Eqs. (3.61) and (3.62), one gets
$$ (\Delta x)(\Delta p_x) \approx \hbar \tag{3.63} $$
Similar results are valid for the measurements of y or z components. Thus,
there is an inherent uncertainty in the determination of the position and
momentum of a particle. The position and momentum of a particle cannot be
simultaneously determined with infinite accuracy. The product of the uncertainties
or allowed errors in their measurements must satisfy the Heisenberg uncertainty
principle whose special case is stated in Eq. (3.63). According to the Heisenberg
uncertainty principle, the product of the uncertainties in the values of two
canonically conjugate variables whose operators are hermitian, cannot
be less than ħ in order of magnitude. Examples of canonically conjugate
variables are (x, px), (y, py) and (z, pz). A similar relation for time and energy
results from the analysis of the response of a state to a time dependent
interaction,
(∆t) (∆E) ≈ ħ (3.64)
The interpretation of this relation is that if it takes time ∆t to measure the
energy of a system, there is an inherent uncertainty in the measured value of the
energy, given by Eq. (3.64). In particular, this relation implies that unstable particles
with lifetime τ have an associated uncertainty in their energy, of order ħ/τ.
The Heisenberg uncertainty principle can be made into a quantitative statement
if ∆x, ∆px, etc. are defined as the standard deviations, i.e. $\Delta x = \langle (x - \langle x \rangle)^2 \rangle^{1/2}$,
$\Delta p_x = \langle (p_x - \langle p_x \rangle)^2 \rangle^{1/2}$, etc. It can then be shown rigorously that
$$ (\Delta x)(\Delta p_x) \geq \hbar / 2 \tag{3.65} $$


a result which is valid for pairs of canonically conjugate, hermitian operators
(see e.g. Ref. 18).
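The Gaussian packet of Eq. (3.60) at t = 0 can be used to check Eq. (3.65) numerically. The sketch below is an illustration under stated assumptions: it works in one dimension, sets ħ = 1, takes k0 = 0 so that the wave function is real, and evaluates the standard deviations by simple sums on a grid. The function name `gaussian_uncertainties` and the grid parameters are hypothetical choices.

```python
import math

HBAR = 1.0  # natural units, assumed for illustration

def gaussian_uncertainties(b, half_width=10.0, n=4001):
    """Return (delta_x, delta_p) for psi(x) = pi^(-1/4) b^(1/2) exp(-b^2 x^2 / 2)."""
    length = half_width / b
    dx = 2.0 * length / (n - 1)
    xs = [-length + i * dx for i in range(n)]
    psi = [math.pi ** -0.25 * math.sqrt(b) * math.exp(-0.5 * (b * x) ** 2) for x in xs]

    norm = sum(p * p for p in psi) * dx
    mean_x = sum(x * p * p for x, p in zip(xs, psi)) * dx / norm
    var_x = sum((x - mean_x) ** 2 * p * p for x, p in zip(xs, psi)) * dx / norm

    # For a real wave function <p> = 0, and integrating by parts gives
    # <p^2> = hbar^2 * integral of (d psi / dx)^2 dx (central differences here).
    dpsi = [(psi[i + 1] - psi[i - 1]) / (2.0 * dx) for i in range(1, n - 1)]
    var_p = HBAR ** 2 * sum(d * d for d in dpsi) * dx / norm
    return math.sqrt(var_x), math.sqrt(var_p)

delta_x, delta_p = gaussian_uncertainties(b=2.0)
print(delta_x, delta_p, delta_x * delta_p)
```

For this packet the product comes out close to ħ/2, the smallest value allowed by Eq. (3.65), while each factor separately scales as 1/b and bħ in agreement with Eqs. (3.61) and (3.62).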
The Heisenberg uncertainty principle can be easily demonstrated by the
thought experiment of Sec. 3.1, which may be regarded as an experiment for
determining the position and the momentum of the particle. The position of the
particle in the experiment has an uncertainty of
∆x ≈ a (3.66)
since it is not known whether the particle passed through slit S1 or S2. Similarly,
the momentum of the particle also is undetermined to the extent
$$ \Delta p_x \approx p \frac{w}{d} \tag{3.67} $$
where w is the width of the central fringe. This uncertainty results from the fact
that the particle may come to any point within this fringe. Since w = λd/a, and
p = h/λ by the de Broglie relation, we get
$$ (\Delta x)(\Delta p_x) \approx h \tag{3.68} $$
which is the same as Eq. (3.63) in order of magnitude. This demonstration
brings out the fact that the Heisenberg uncertainty principle is essentially a
consequence of associating wave properties with the particles.

3.7 STEP POTENTIAL


As the first example of nonzero potentials, the one-dimensional problem of a
particle which comes across a sudden change in the potential is considered. The
potential may be approximated by
V(x) = 0 for x < 0
= V for x ≥ 0 (3.69)
as shown in Fig. 3.2(a).

Fig. 3.2 (a) A step potential, (b) A potential barrier.


The Schrödinger equation in one dimension is
$$ i\hbar \frac{\partial \psi(x, t)}{\partial t} = \left[ -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} + V(x) \right] \psi(x, t) \tag{3.70} $$
As before, for states with energy E,
$$ \psi(x, t) = \exp(-iEt/\hbar)\, \phi(x) \tag{3.71} $$
where φ (x) satisfies the equation [see Eq. (3.24)]
$$ -\frac{\hbar^2}{2m} \frac{d^2 \phi(x)}{dx^2} = [E - V(x)]\, \phi(x) \tag{3.72} $$
The solution for φ (x), for x < 0, is
$$ \phi(x) = a_+ e^{ipx} + a_- e^{-ipx}, \quad x < 0 \tag{3.73} $$
with
$$ p = \frac{1}{\hbar} (2mE)^{1/2} \tag{3.74} $$
where the first term in the solution corresponds to a particle with momentum
pħ, and the second term to a particle with momentum – pħ. The solution for
x ≥ 0 is
$$ \phi_r(x) = b_+ e^{iqx} + b_- e^{-iqx}, \quad x \geq 0 \tag{3.75} $$
with
$$ q = \frac{1}{\hbar} [2m(E - V)]^{1/2} \tag{3.76} $$
Now, since the potential is piece-wise continuous and finite, it follows from
the properties of the differential equation (3.72), that the wave function φ (x)
and its first derivative dφ/dx are continuous everywhere, in particular at x = 0.
Therefore, one has
$$ a_+ + a_- = b_+ + b_- \tag{3.77} $$
$$ a_+ - a_- = \frac{q}{p} (b_+ - b_-) \tag{3.78} $$
The solutions are discussed separately for the two qualitatively different
cases, (i) E ≥ V, and (ii) E < V.
Case (i) For E ≥ V, it is assumed that the particle approaches the barrier from
the left and is either transmitted or reflected at x = 0. Hence, for x > 0, there is
only a wave function describing a particle moving to the right which implies that
b– = 0 (3.79)
Therefore Eqs. (3.77) and (3.78) give
$$ b_+ = \left( \frac{2p}{p + q} \right) a_+ $$
$$ a_- = \left( \frac{p - q}{p + q} \right) a_+ \tag{3.80} $$
In analogy with the transmission and reflection of classical electromagnetic
waves, transmission and reflection coefficients can be defined as:
$$ T = \frac{q}{p} \left| \frac{b_+}{a_+} \right|^2 = \frac{4pq}{(p + q)^2} $$
$$ R = \left| \frac{a_-}{a_+} \right|^2 = \left( \frac{p - q}{p + q} \right)^2 \tag{3.81} $$
In terms of the refractive index n,
$$ n = \frac{p}{q} \tag{3.82} $$
Eqs. (3.81) can be written as
$$ T = \frac{4n}{(n + 1)^2} $$
$$ R = \left( \frac{n - 1}{n + 1} \right)^2 \tag{3.83} $$
which are the same as the classical transmission and reflection coefficients for
electromagnetic waves.
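As a quick numerical illustration of Eqs. (3.74), (3.76) and (3.81), the sketch below evaluates T and R for a step with E > V. The function name `step_coefficients`, the default units m = ħ = 1 and the sample energies are assumptions made for the example.

```python
import math

def step_coefficients(E, V, m=1.0, hbar=1.0):
    """Transmission and reflection coefficients of Eq. (3.81) for E > V."""
    if E <= V:
        raise ValueError("this sketch covers case (i) only: E must exceed V")
    p = math.sqrt(2.0 * m * E) / hbar        # Eq. (3.74)
    q = math.sqrt(2.0 * m * (E - V)) / hbar  # Eq. (3.76)
    T = 4.0 * p * q / (p + q) ** 2
    R = ((p - q) / (p + q)) ** 2
    return T, R

T, R = step_coefficients(E=2.0, V=1.0)
print(T, R, T + R)  # the two coefficients add up to 1
```

Even though the particle has enough energy to pass over the step, R is nonzero, which has no counterpart for a classical particle.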
Case (ii) For E < V, q is imaginary. Writing
$$ q = i\alpha \tag{3.84} $$
where
$$ \alpha = \frac{1}{\hbar} [2m(V - E)]^{1/2} \tag{3.85} $$
the solution for x > 0, is
$$ \phi_r(x) = b_+ e^{-\alpha x} + b_- e^{\alpha x} \tag{3.86} $$
In order to keep the probability finite as x → ∞,
$$ b_- = 0 \tag{3.87} $$
The continuity equations (3.77) and (3.78) then imply that
$$ b_+ = \left( \frac{2p}{p + i\alpha} \right) a_+ $$
$$ a_- = \left( \frac{p - i\alpha}{p + i\alpha} \right) a_+ \tag{3.88} $$

This means that |a–| = |a+| and
$$ \phi(x) = 2a_+ e^{-i\delta} \cos(px + \delta), \quad x < 0 \tag{3.89} $$
where
$$ \delta = \tan^{-1}(\alpha / p) \tag{3.90} $$
i.e. φ (x) represents a standing wave.
There are several significant points about the results, that should be noted:

1. Since $\int_{-\infty}^{\infty} |\psi|^2 \, dx = \infty$, the wave function is not normalizable.
However, it could be used to describe a beam of particles with a density of
|a+|² per unit length, moving to the right, which get either transmitted or
reflected at x = 0.
2. For E > V, i.e. case (i), the rate of flow of the incoming particles must be
equal to the sum of the rates of flow of transmitted and reflected particles.
Since the momenta of these particles are p, q and – p respectively, one
has the condition
p|a+|2 = q |b+|2 + p|a–|2 (3.91)
which is equivalent to the requirement that T + R = 1 and is easily seen to
be satisfied by the relations in Eq. (3.83).
3. For E < V, i.e. case (ii), a standing wave is obtained for x < 0, which
means that all the incoming particles are reflected. However, the wave
function is nonzero for x > 0 though it vanishes exponentially as x → ∞.
Thus, there is a finite probability of finding the particles in the region x > 0
which is a forbidden region in classical mechanics. This is called barrier
penetration and has no classical analogue in the mechanics of particles
(such a phenomenon was observed in optics by Newton). It does not
however lead to a paradox since localization or observation of the particle
in the classically forbidden region involves a change in the momentum and
the energy of the particle which may then have sufficient energy to make
this a classically allowed region.
Electrons in a metal are a case with E < V, but when the metal is heated some
electrons acquire E > V, which gives rise to thermionic emission. This is
used in cathode ray tubes.
4. A special case of interest is the one of V → ∞ in case (ii) for which
α → ∞ and the wave function is given by
φ(x) = 2ia+ sin (px), for x < 0 (3.92)
φr (x) = 0, for x ≥ 0 (3.93)

This wave function could have been obtained from Eqs. (3.73) and (3.86)
with b– = 0, by just requiring that the wave function vanishes at x = 0,
φ (0) = φr (0) = 0, but no further conditions on dφ/dx. Indeed this prescription
considerably simplifies the calculations whenever the potentials jump to
infinity.
5. The quantum mechanical phenomenon of a particle penetrating classically
forbidden barriers gives rise to an interesting observation of trapped
particles escaping through classically forbidden barriers. Consider a situation
[Fig. 3.2(b)] where the barrier exists only for 0 ≤ x ≤ d, i.e. V (x) = V for
0 ≤ x ≤ d and V (x) = 0 elsewhere. In this case, if a beam of particles is
incident from the left with energy E < V, the wave function is given by
$$ \phi(x) = \begin{cases} a_+ e^{ipx} + a_- e^{-ipx} & \text{for } x < 0, \\ b_+ e^{-\alpha x} + b_- e^{\alpha x} & \text{for } 0 \leq x \leq d, \\ c_+ e^{ipx} & \text{for } x > d \end{cases} \tag{3.94} $$
where p and α are given in Eqs. (3.74) and (3.85) respectively. Continuity
of the wave function and its derivative at x = 0 and x = d, allows us to
determine a–, b+, b– and c+ in terms of a+. Since the particles can penetrate
the forbidden barrier, in general b+, b– and c+ are nonzero. Thus, the particles
can cross a barrier even if classically, the energy is insufficient to pass
over the barrier, and the probability of transmission is given by the ratio
|c+|²/|a+|². This effect is termed tunnelling and provides a satisfactory
explanation for the decay of unstable particles (e.g. U235, etc.) as a
tunnelling of trapped particles through a potential barrier.
A scanning tunneling microscope (STM) is an instrument for imaging
surfaces at the atomic level. It is based on the concept of quantum
tunneling. When a conducting tip is brought very near to the surface to
be examined, a bias (voltage difference) applied between the two can
allow electrons to tunnel through the vacuum between them. The resulting
tunneling current is a function of tip position, applied voltage, and the
local density of states (LDOS) of the sample. Information is acquired by
monitoring the current as the tip’s position scans across the surface. For
an STM, good resolution is considered to be 0.1 nm lateral resolution and
0.01 nm depth resolution. With this resolution, individual atoms within
materials are routinely imaged and manipulated. (Source : Wikipedia)
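The matching conditions for the barrier of Eq. (3.94) can be solved in a few lines. The following is a minimal sketch, assuming units m = ħ = 1 by default: it sets c+ = 1, applies continuity of φ and dφ/dx at x = d to obtain b+ and b–, then applies the same conditions at x = 0 to obtain a+, and returns |c+|²/|a+|². The function name `barrier_transmission` and the sample values are hypothetical.

```python
import cmath
import math

def barrier_transmission(E, V, d, m=1.0, hbar=1.0):
    """Tunnelling probability |c+|^2 / |a+|^2 for a barrier of height V > E and width d."""
    p = math.sqrt(2.0 * m * E) / hbar            # Eq. (3.74)
    alpha = math.sqrt(2.0 * m * (V - E)) / hbar  # Eq. (3.85)

    c_plus = 1.0  # overall scale is arbitrary; only the ratio matters
    phase = c_plus * cmath.exp(1j * p * d)

    # Continuity of phi and d(phi)/dx at x = d.
    b_plus = 0.5 * (1.0 - 1j * p / alpha) * phase * math.exp(alpha * d)
    b_minus = 0.5 * (1.0 + 1j * p / alpha) * phase * math.exp(-alpha * d)

    # Continuity of phi and d(phi)/dx at x = 0.
    a_plus = 0.5 * ((b_plus + b_minus) + 1j * (alpha / p) * (b_plus - b_minus))
    return abs(c_plus) ** 2 / abs(a_plus) ** 2

for width in (0.5, 1.0, 2.0):
    print(width, barrier_transmission(E=1.0, V=2.0, d=width))
```

The transmission falls off roughly exponentially with the barrier width, which is the sensitivity to tip distance that the scanning tunnelling microscope exploits.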
Fig. 3.3 Instrumentation (labelled parts: piezoelectric tube with electrodes, control voltages for piezotube, tunneling current amplifier, distance control and scanning unit, tip, tunnelling voltage, data processing and display).

Fig. 3.4 Image of reconstruction on a clean Gold surface.

Fig. 3.5 An image of single-walled carbon nanotube.

3.8 PARTICLE IN A BOX


An example of a particle in a potential which allows only discrete values for the
energy of the particle is now described.
Consider a particle in a one-dimensional potential which is zero for
0 ≤ x ≤ l, and is infinite for x < 0 or x > l (see Fig. 3.6). The Schrödinger equation
for the particle with well-defined energy is
$$ -\frac{\hbar^2}{2m} \frac{d^2 \phi(x)}{dx^2} = E \phi(x), \quad 0 \leq x \leq l \tag{3.95} $$
whose solutions can be written as
$$ \phi(x) = a \sin(px + \alpha) \tag{3.96} $$
$$ p = \frac{1}{\hbar} (2mE)^{1/2} \tag{3.97} $$
The wave function is zero for x < 0 or x > l since the potential is infinite in
this region. Since the potential jumps to infinity at x = 0 and x = l, the boundary
conditions as discussed in Sec. 3.7 are that the wave function should vanish at
x = 0 and x = l. This implies that
sin α = 0, (3.98)
pl = nπ, n = 1, 2, ... (3.99)
Therefore, the solutions are
$$ \phi_n(x) = \left( \frac{2}{l} \right)^{1/2} \sin\left( \frac{n\pi}{l} x \right), \quad n = 1, 2, \ldots \tag{3.100} $$
$$ E_n = \frac{\hbar^2 \pi^2}{2ml^2} n^2 \tag{3.101} $$
where a = (2/l)^{1/2} has been used as required by the normalization condition in
Eq. (3.19)
$$ |a|^2 \int_0^l \sin^2\left( \frac{n\pi}{l} x \right) dx = 1 \tag{3.102} $$

Some of the significant points to note are as follows:


1. A state with n = 0 is not acceptable since this would correspond to a state
which is zero everywhere. The lowest energy state, called the ground
state, therefore corresponds to n = 1, and has a nonzero energy.

Fig. 3.6 Energy levels and wave functions (dashed lines) for a particle inside a box.

2. The discreteness of the energy levels is significant only for small m and l.
For example, if m ≈ 10⁻³ kg and l ≈ 0.1 m, the separation between the
energy levels is of the order of 10⁻⁶² J which is quite negligible. On the
other hand, for an electron in an atom, m ≈ 10⁻³⁰ kg and l ≈ 10⁻¹⁰ m, so that
∆En ~ (60n) eV and the discreteness becomes important.
3. It is seen that the states with n ≥ 2 have nodes inside the box. Since
probability density is given by |φn(x)|2, this means that there are some
regions where the particle will not be found, which is totally incompatible
with the classical ideas of trajectories.
4. If the potential in the region x < 0 and x > l, is not infinite but finite, the
wave function can penetrate into this region. Therefore, it is not forced to
be zero at x = 0 or x = l. Hence, the wave function varies more gently
inside the region 0 ≤ x ≤ l, and the energies, which are related to the
second derivative of the wave function are lower than in the case where
the potential is infinite for x < 0 and x > l.
5. It is observed that
$$ \int_0^l \phi_n(x)\, \phi_{n'}(x)\, dx = 0, \quad \text{for } n \neq n' \tag{3.103} $$
$$ \int_0^l \phi_n(x)\, \phi_n(x)\, dx = 1 \tag{3.104} $$
which can together be written as
$$ \int_0^l \phi_n(x)\, \phi_{n'}(x)\, dx = \delta_{n, n'} \tag{3.105} $$
where the Kronecker delta δn,n′ is 1 for n = n′ and zero otherwise. Thus,
these states are orthonormal. It can also be shown that any general state
of a particle in the box can be written as a linear combination of the
energy eigenstates, i.e.
$$ \psi(x) = \sum_{n=1}^{\infty} a_n \phi_n(x), \qquad \sum_{n=1}^{\infty} |a_n|^2 = 1 \tag{3.106} $$

which means that the eigenstates φn (x) are complete. The orthonormality
and completeness are important properties associated with the eigenstates
of any physical observable (see Sec. 3.4).
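The orders of magnitude quoted in point 2 and the orthonormality relation (3.105) can both be checked directly from Eqs. (3.100) and (3.101). The sketch below is illustrative only: the function names, the midpoint-rule integration and the rounded mass and length values (taken from point 2) are assumptions of the example.

```python
import math

HBAR = 1.054571817e-34  # J s
EV = 1.602176634e-19    # J per electron volt

def box_energy(n, m, l):
    """Energy level of Eq. (3.101) in joules."""
    return (HBAR * math.pi * n) ** 2 / (2.0 * m * l ** 2)

def level_spacing(n, m, l):
    """Separation between levels n + 1 and n."""
    return box_energy(n + 1, m, l) - box_energy(n, m, l)

def overlap(n1, n2, l=1.0, steps=2000):
    """Midpoint-rule estimate of the integral in Eq. (3.105)."""
    dx = l / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += (2.0 / l) * math.sin(n1 * math.pi * x / l) * math.sin(n2 * math.pi * x / l) * dx
    return total

print(level_spacing(1, 1e-3, 0.1))           # macroscopic case, in joules
print(level_spacing(1, 1e-30, 1e-10) / EV)   # electron-like case, in eV
print(overlap(1, 1), overlap(1, 2), overlap(2, 3))
```

Since the spacing is (2n + 1) times the ground state energy, it grows roughly linearly with n, in line with the estimate ∆En ~ (60n) eV.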
Due to significant progress in semiconductor technology, now we have
systems known as quantum dots and quantum wells, in which particles are
confined in a small volume.
A quantum dot is a portion of matter (e.g. semiconductor) whose excitons
(electron-hole pair) are confined in all three spatial dimensions. As a result, such
materials have electronic properties intermediate between those of bulk
semiconductors and those of discrete molecules.
Quantum dot technology is a good candidate for use in solid-state quantum
computation. By applying small voltages to the leads, the flow of electrons through
the quantum dot can be controlled and thereby precise measurements of the
spin and other properties therein can be made. With several entangled quantum
dots or qubits, plus a way of performing operations, quantum calculations and
the computers that would perform them might be possible.
A quantum well is a potential well with only discrete energy values. One
way to create quantization is to confine particles, which were originally free to
move in three dimensions, to two dimensions, forcing them to occupy a planar
region. The effects of quantum confinement take place when the quantum well
thickness becomes comparable to the de Broglie wavelength of the carriers
(generally electrons and holes), leading to energy levels called “energy bands”,
i.e., the carriers can only have discrete energy values.

Fig. 3.7 Researchers at Los Alamos National Laboratory have developed a
wireless device that efficiently produces visible light, through energy
transfer from thin layers of quantum wells to crystals above the layers.
Quantum wells are in wide use in diode lasers, including red lasers for
DVDs and laser pointers, infra-red lasers in fiber optic transmitters, or in blue
lasers. They are also used to make HEMTs (High Electron Mobility Transistors),
which are used in low-noise electronics. Quantum well infra-red photodetectors
are also based on quantum wells, and are used for infrared imaging.

A quantum well laser is a laser diode in which the active region of the
device is so narrow that quantum confinement occurs. The wavelength of the
light emitted by a quantum well laser is determined by the width of the active
region rather than just the bandgap of the material from which it is constructed.
This means that much shorter wavelengths can be obtained from quantum well
lasers than from conventional laser diodes using a particular semiconductor
material. The efficiency of a quantum well laser is also greater than a conventional
laser diode.
(Source : Wikipedia)

3.9 SIMPLE HARMONIC OSCILLATOR


A problem of considerable importance is that of the simple harmonic oscillator.
For most oscillating systems, it can be used as the first approximation.
For a state of well-defined energy, the time independent Schrödinger equation
for a simple harmonic oscillator in 1-dimension, is
$$ -\frac{\hbar^2}{2m} \frac{d^2 \phi(x)}{dx^2} + \frac{1}{2} k x^2 \phi(x) = E \phi(x) \tag{3.107} $$
For obtaining the solutions to this equation, the asymptotic behaviour of
φ(x) ~ exp (– α²x²/2) where α⁴ = mk/ħ², is first separated out by writing
$$ \phi(x) = \exp(-\alpha^2 x^2 / 2)\, \eta(x) \tag{3.108} $$
where η(x) satisfies the equation
$$ \frac{d^2 \eta(x)}{dx^2} - 2\alpha^2 x \frac{d\eta(x)}{dx} = \left( \alpha^2 - \frac{2mE}{\hbar^2} \right) \eta(x) \tag{3.109} $$
Substitution of a series solution for η(x) into Eq. (3.109) and equating the
coefficients of x^k gives
$$ \eta(x) = \sum_{k=0}^{\infty} b_k x^k $$
$$ (k + 2)(k + 1)\, b_{k+2} = \left( 2\alpha^2 k + \alpha^2 - \frac{2mE}{\hbar^2} \right) b_k \tag{3.110} $$
For a general value of E, the infinite series for η(x) gives an asymptotically
increasing solution φ (x) ~ exp (α²x²/2) which is not normalizable. However, for
some special values of E = (n + 1/2) α²ħ²/m, n a non-negative integer, the series in
Eq. (3.110) terminates at k = n and we get normalizable wave functions in
terms of Hermite polynomials Hn of order n. The wave functions and their
energies are
$$ \phi_n(x) = \pi^{-1/4} \left( \frac{\alpha}{2^n n!} \right)^{1/2} \exp(-\alpha^2 x^2 / 2)\, H_n(\alpha x) $$
$$ E_n = \left( n + \frac{1}{2} \right) \hbar\omega, \quad n = 0, 1, 2, \ldots \tag{3.111} $$
where ω = (k/m)^{1/2}. It is observed that again the ground state energy is not zero.
Its value of ħω/2 is called the zero-point energy.
The first three Hermite polynomials are
$$ H_0(\alpha x) = 1 $$
$$ H_1(\alpha x) = 2\alpha x \tag{3.112} $$
$$ H_2(\alpha x) = 4\alpha^2 x^2 - 2 $$
The harmonic oscillator problem can be solved more elegantly by using
operator algebra. Defining
$$ a = \left( \frac{m\omega}{2\hbar} \right)^{1/2} \left( x + \frac{\hbar}{m\omega} \frac{\partial}{\partial x} \right) $$
$$ a^\dagger = \left( \frac{m\omega}{2\hbar} \right)^{1/2} \left( x - \frac{\hbar}{m\omega} \frac{\partial}{\partial x} \right) \tag{3.113} $$
it can easily be shown that
$$ a a^\dagger - a^\dagger a = 1 $$
$$ aH - Ha = \hbar\omega\, a \tag{3.114} $$
$$ a^\dagger H - H a^\dagger = -\hbar\omega\, a^\dagger $$
with the Hamiltonian H (i.e. energy) being
$$ H = -\frac{\hbar^2}{2m} \frac{d^2}{dx^2} + \frac{1}{2} k x^2 \tag{3.115} $$
Therefore if ψ0 (x) is the ground state with energy E0, then using Eq. (3.114)
$$ H a\, \psi_0(x) = (E_0 - \hbar\omega)\, a\, \psi_0(x) \tag{3.116} $$
$$ H a^\dagger \psi_0(x) = (E_0 + \hbar\omega)\, a^\dagger \psi_0(x) \tag{3.117} $$
Since ψ0 (x) is the ground state, this implies that
$$ a\, \psi_0(x) = 0 \tag{3.118} $$
and a†ψ0 (x) is an eigenstate with energy E0 + ħω. It is easy to solve for
ψ0 (x) from Eqs. (3.118) and (3.113) which give
$$ \psi_0(x) = \pi^{-1/4} \alpha^{1/2} \exp(-\alpha^2 x^2 / 2) \tag{3.119} $$
for the normalized ground state wave function, which with the help of Eq. (3.107)
can be shown to have an energy ħω/2. The eigenstates with energies E0 + nħω
can be developed with the repeated operation of a† as in Eq. (3.117).
$$ \psi_n(x) = \frac{1}{(n!)^{1/2}} (a^\dagger)^n \psi_0(x), \quad n \geq 0 $$
$$ E_n = \left( n + \frac{1}{2} \right) \hbar\omega \tag{3.120} $$
The normalization is obtained by the repeated use of the first relation in
Eq. (3.114) and Eq. (3.118). It is clear from the definition of a†, that ψn (x) are
alternately even and odd functions of x.
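The energies in Eq. (3.111) can be checked numerically from the wave functions themselves. The sketch below is a rough illustration under stated assumptions: units with ħ = m = ω = 1 (so that α = 1 and k = 1), Hermite polynomials generated by the standard recurrence H_{n+1}(y) = 2y H_n(y) − 2n H_{n−1}(y), and the average energy evaluated as the integral of ½(dφ/dx)² + ½x²φ² on a grid. All function names are hypothetical.

```python
import math

def hermite(n, y):
    """Hermite polynomial H_n(y) from the recurrence H_{k+1} = 2 y H_k - 2 k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * y
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * y * h - 2.0 * k * h_prev
    return h

def phi(n, x, alpha=1.0):
    """Normalized oscillator wave function of Eq. (3.111)."""
    norm = math.pi ** -0.25 * math.sqrt(alpha / (2 ** n * math.factorial(n)))
    return norm * math.exp(-0.5 * (alpha * x) ** 2) * hermite(n, alpha * x)

def energy_expectation(n, half_width=10.0, steps=4000):
    """Average energy in units where hbar = m = omega = 1 (assumed for this sketch)."""
    dx = 2.0 * half_width / steps
    xs = [-half_width + i * dx for i in range(steps + 1)]
    f = [phi(n, x) for x in xs]
    # Kinetic term: (1/2) * integral of (d phi / dx)^2, after integrating by parts.
    kinetic = 0.5 * sum(((f[i + 1] - f[i]) / dx) ** 2 for i in range(steps)) * dx
    potential = sum(0.5 * x * x * v * v for x, v in zip(xs, f)) * dx
    norm = sum(v * v for v in f) * dx
    return (kinetic + potential) / norm

for n in range(4):
    print(n, energy_expectation(n))  # close to n + 1/2
```

In these units the printed values approximate n + 1/2, with the n = 0 entry giving the zero-point energy.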

3.10 SMALL PERTURBATIONS


It is often the situation that a realistic problem we encounter cannot be solved
exactly but differs slightly from a solvable problem. If the difference is small,
approximate solutions can be obtained by using perturbation theory.
Consider a situation where the solutions to the eigenvalue equation for the
energy are known,
H 0φ n = E n φ n (3.121)
where φn are normalized, but the solution to the following equation is to be
found,
(H0 + λV)ψ = Eψ (3.122)
where λ is small and E is close to En. Multiplying Eq. (3.122) by φn* and integrating
over the space τ,
$$ \int \phi_n^* (H_0 + \lambda V)\, \psi \, d\tau = E \int \phi_n^* \psi \, d\tau \tag{3.123} $$
On integrating the first term by parts, the Hamiltonian H0 operates on φn*
giving Enφn* (H0 is real), which then leads to
$$ E = E_n + \lambda \frac{\int \phi_n^* V \psi \, d\tau}{\int \phi_n^* \psi \, d\tau} \tag{3.124} $$
This is an exact relation. An approximate expression for E can be obtained
by noting that the second term is small (λ is small), and therefore ψ can be
replaced by φn in this term, which leads to
E ≈ En + λ ∫ φn* V φn dτ (3.125)
For obtaining more accurate expressions for E, better approximations for
ψ should be used.
The expression in Eq. (3.125) is valid provided φn is an isolated state. If
there is more than one degenerate state corresponding to the energy level En,
say φn(i) (i = 1, 2, ...) which are orthonormal, ψ may be approximated by a linear
combination of the degenerate states. Writing
$$ \psi = \sum_{i} a_i \phi_n^{(i)} \tag{3.126} $$
where $\sum_i |a_i|^2 = 1$, Eq. (3.124) gives
$$ \sum_i \left( \lambda \int \phi_n^{(j)*} V \phi_n^{(i)} \, d\tau \right) a_i = (E - E_n)\, a_j \tag{3.127} $$
Thus, (E – En) and ai are obtained from this set of equations. For example,
if the degeneracy is of order two, i.e. i = 1, 2, the solutions are
$$ E = E_n + \lambda (V_{11} + x V_{12}) \tag{3.128} $$
where x = a2/a1 is
$$ x = \frac{V_{22} - V_{11} \pm [(V_{22} - V_{11})^2 + 4 V_{12} V_{21}]^{1/2}}{2 V_{12}} \tag{3.129} $$
$$ V_{ij} = \int \phi_n^{(i)*} V \phi_n^{(j)} \, d\tau \tag{3.130} $$


The degenerate states in this case split into two levels corresponding to the
two states given by the two values of x.
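The two-fold degenerate case of Eqs. (3.128) and (3.129) is easy to evaluate numerically. The sketch below assumes real matrix elements with V12 ≠ 0; the function name `split_levels` and the sample numbers are made up for illustration.

```python
import math

def split_levels(E_n, lam, V11, V12, V21, V22):
    """Return [(E, x), (E, x)] from Eqs. (3.128) and (3.129) for real V_ij with V12 != 0."""
    root = math.sqrt((V22 - V11) ** 2 + 4.0 * V12 * V21)
    levels = []
    for sign in (+1.0, -1.0):
        x = (V22 - V11 + sign * root) / (2.0 * V12)   # ratio a2 / a1
        levels.append((E_n + lam * (V11 + x * V12), x))
    return levels

# Hypothetical numbers: unperturbed level E_n = 10, coupling lam = 0.1.
(E_upper, x_upper), (E_lower, x_lower) = split_levels(10.0, 0.1, 1.0, 2.0, 2.0, 3.0)
print(E_upper, x_upper)
print(E_lower, x_lower)
```

The two shifts (E – En)/λ obtained this way are the two eigenvalues of the 2×2 matrix of elements Vij, which is another way of reading Eq. (3.127).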

3.11 ANGULAR MOMENTUM


In the discussion so far, emphasis has been on the energy and momentum of the
particle. However, for a particle in the presence of a rotationally invariant
3-dimensional potential, the angular momentum of the particle plays an important
role. The operator corresponding to the angular momentum observable has some
interesting properties which are discussed briefly.
The angular momentum of a particle is given by
L= r×p (3.131)
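while its square is given by
$$ L^2 = (\mathbf{r} \times \mathbf{p}) \cdot (\mathbf{r} \times \mathbf{p}) \tag{3.132} $$
Here, since r and p are operators, their order must be maintained. The
expression for L² simplifies to
$$ L^2 = \mathbf{r} \cdot (\mathbf{p} \times (\mathbf{r} \times \mathbf{p})) = \sum_{i, j} \left[ r_i p_j r_i p_j - r_i p_j r_j p_i \right] \tag{3.133} $$
and with $\mathbf{r} \cdot \mathbf{p} = -i\hbar\, r\, \partial / \partial r$, Eq. (3.133) becomes
$$ L^2 = r^2 p^2 + 2\hbar^2 r \frac{\partial}{\partial r} + \hbar^2 r^2 \frac{\partial^2}{\partial r^2} \tag{3.134} $$
Using this relation the kinetic energy T can be written as
$$ T = \frac{1}{2m} p^2 = -\frac{\hbar^2}{2mr^2} \frac{\partial}{\partial r} r^2 \frac{\partial}{\partial r} + \frac{L^2}{2mr^2} \tag{3.135} $$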
which shows that the angular momentum is an important term in the kinetic
energy.
The expressions for the angular momentum operators in terms of spherical
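coordinates are obtained from Eq. (3.131) as
$$ L_x = i\hbar \left( \sin\phi \frac{\partial}{\partial\theta} + \cos\phi \cot\theta \frac{\partial}{\partial\phi} \right) $$
$$ L_y = -i\hbar \left( \cos\phi \frac{\partial}{\partial\theta} - \sin\phi \cot\theta \frac{\partial}{\partial\phi} \right) \tag{3.136} $$
$$ L_z = -i\hbar \frac{\partial}{\partial\phi} $$
and
$$ L^2 = -\hbar^2 \left[ \frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \left( \sin\theta \frac{\partial}{\partial\theta} \right) + \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial\phi^2} \right] \tag{3.137} $$
The wave functions corresponding to well-defined values of L², satisfy the
equation
$$ -\hbar^2 \left[ \frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \left( \sin\theta \frac{\partial}{\partial\theta} \right) + \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial\phi^2} \right] Y(\theta, \phi) = \lambda\, Y(\theta, \phi) \tag{3.138} $$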
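As before, a factorizable form for the solution is assumed,
$$ Y(\theta, \phi) = P(\theta)\, F(\phi) \tag{3.139} $$
Substituting this in Eq. (3.138) and multiplying by $\sin^2\theta / [\hbar^2\, Y(\theta, \phi)]$ gives
$$ -\frac{1}{F(\phi)} \frac{\partial^2 F}{\partial\phi^2} = \frac{\sin\theta}{P(\theta)} \frac{\partial}{\partial\theta} \left( \sin\theta \frac{\partial P(\theta)}{\partial\theta} \right) + \frac{\lambda}{\hbar^2} \sin^2\theta \tag{3.140} $$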

Since the two sides depend on different variables, each must be a constant,
say m2, so that

d 2 F (φ)
2
+ m 2 F (φ) = 0 (3.141)

1 d  dP (θ)  m2 λ
 sin θ  − 2 P(θ) + 2 P(θ) = 0 (3.142)
sin θ d θ  d θ  sin θ 
The solutions to the Eq. (3.141) are
F(φ) = e im φ (3.143)
However, if the condition that the wave function at every physical point must be single-valued is imposed, then F(φ) = F(φ + 2π), which means that the values of m are restricted to m = 0, ±1, ±2, etc. It is easily seen that
$$-i\hbar \frac{\partial}{\partial\phi} F(\phi) = m\hbar F(\phi), \quad m = 0, \pm 1, \pm 2, \ldots \tag{3.144}$$
which implies that the state is an eigenstate of $L_z$ with eigenvalue $m\hbar$. For obtaining solutions to Eq. (3.142), the substitution v = cos θ is first used. Then, the equation for m = 0 reduces to Legendre's differential equation. For m = 0, substitution of a series solution for P(θ) into Eq. (3.142) and equating the coefficients of similar terms gives
$$P(\theta) = \sum_k b_k v^k, \quad v = \cos\theta \tag{3.145}$$
$$(k + 1)(k + 2)\, b_{k+2} = \left[ k^2 + k - \frac{\lambda}{\hbar^2} \right] b_k$$
For an arbitrary value of λ, the series diverges at v = ±1. However, for $\lambda = l(l + 1)\hbar^2$, l a non-negative integer, the series terminates at k = l and we get well-behaved solutions:
$$\lambda = l(l + 1)\hbar^2, \quad l = 0, 1, 2, \ldots \tag{3.146}$$
$$P_l(v) = \frac{1}{2^l\, l!} \frac{d^l}{dv^l} (v^2 - 1)^l, \quad v = \cos\theta$$
These are Legendre polynomials of order l, the first few of them being
$$P_0(\cos\theta) = 1$$
$$P_1(\cos\theta) = \cos\theta \tag{3.147}$$
$$P_2(\cos\theta) = \frac{1}{2} (3\cos^2\theta - 1)$$
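The terminating series can be generated directly from the recurrence relation below Eq. (3.145). The following sketch assumes λ/ħ² = l(l + 1) and fixes the overall constant by the usual convention P_l(1) = 1; the function name `legendre_coeffs` is hypothetical.

```python
from fractions import Fraction

def legendre_coeffs(l):
    """Coefficients [b_0, ..., b_l] of P_l(v) in powers of v.

    Built from (k+1)(k+2) b_{k+2} = [k(k+1) - l(l+1)] b_k, i.e. the
    recurrence under Eq. (3.145) with lambda/hbar^2 = l(l+1), and then
    scaled so that P_l(1) = 1 (normalisation convention assumed here).
    """
    b = [Fraction(0)] * (l + 1)
    b[l % 2] = Fraction(1)            # start the even or the odd series
    for k in range(l % 2, l - 1, 2):
        b[k + 2] = Fraction(k * (k + 1) - l * (l + 1), (k + 1) * (k + 2)) * b[k]
    value_at_one = sum(b)             # unnormalised series evaluated at v = 1
    return [c / value_at_one for c in b]
```

Calling `legendre_coeffs(2)` returns the coefficients -1/2, 0, 3/2, in agreement with P2 in Eq. (3.147).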
The solutions for m ≠ 0 are somewhat more complicated, and for l ≥ m ≥ 0, are given by
$$P_l^m(v) = (1 - v^2)^{m/2} \frac{d^m}{dv^m} P_l(v), \quad v = \cos\theta \tag{3.148}$$
called the associated Legendre functions. Combining these solutions with those in Eq. (3.143) gives the solutions to Eq. (3.138) as
$$Y_l^m(\theta, \phi) = \left[ \frac{(2l + 1)(l - m)!}{4\pi (l + m)!} \right]^{1/2} (-1)^m e^{im\phi} P_l^m(\cos\theta) \tag{3.149}$$
with $\lambda = l(l + 1)\hbar^2$, l and m being integers, and l ≥ m. $Y_l^m(\theta, \phi)$ are called spherical harmonics, and are defined for negative integers m by the relation
$$Y_l^m = (-1)^m (Y_l^{-m})^* \tag{3.150}$$
Their normalization is chosen such that they are orthonormal,
$$\int Y_l^{m*}(\theta, \phi)\, Y_{l'}^{m'}(\theta, \phi)\, d\cos\theta\, d\phi = \delta_{l,l'}\, \delta_{m,m'} \tag{3.151}$$
They are simultaneous eigenfunctions of $L_z$ and $L^2$ since
$$L_z Y_l^m(\theta, \phi) = m\hbar\, Y_l^m(\theta, \phi), \quad |m| \le l$$
$$L^2 Y_l^m(\theta, \phi) = l(l + 1)\hbar^2\, Y_l^m(\theta, \phi) \tag{3.152}$$
It is easy to show that they satisfy the important property
$$Y_l^m(\pi - \theta, \phi + \pi) = (-1)^l\, Y_l^m(\theta, \phi) \tag{3.153}$$
i.e. for r → –r, they are even for even l and odd for odd l. The first few of these functions are:
$$Y_0^0(\theta, \phi) = \frac{1}{(4\pi)^{1/2}}$$
$$Y_1^0(\theta, \phi) = \left( \frac{3}{4\pi} \right)^{1/2} \cos\theta \tag{3.154}$$
$$Y_1^{\pm 1}(\theta, \phi) = \mp \left( \frac{3}{8\pi} \right)^{1/2} \exp(\pm i\phi) \sin\theta$$

Apart from playing an important role in the discussion of the kinetic energy in spherical coordinates [Eq. (3.135)], the angular momentum operator plays a significant role in determining the rotational energy levels of a rigid rotator. For example, the rotational energy levels of a di-atomic molecule are given by the Hamiltonian
$$H = \frac{1}{2I} L^2 \tag{3.155}$$
where I is the moment of inertia, and the corresponding energy levels are given by
$$E = \frac{1}{2I}\, l(l + 1)\hbar^2, \quad l = 0, 1, \ldots \tag{3.156}$$

3.12 EXAMPLES
Here some important properties of quantum mechanical systems and their applications are discussed.

Example 1
Since indeterminacy is not an essential part of classical mechanics, it is suggestive that classical measurements may be related to the averages of quantum mechanical measurements. This is illustrated by Ehrenfest's theorem.
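Consider the time-derivative of the average position given by
$$\frac{d}{dt} \langle \mathbf{r} \rangle = \frac{d}{dt} \int \psi^* \mathbf{r} \psi\, d^3r = \int \psi^* \mathbf{r} \frac{\partial\psi}{\partial t}\, d^3r + \int \left( \frac{\partial\psi}{\partial t} \right)^* \mathbf{r} \psi\, d^3r \tag{3.157}$$
Using the Schrödinger equation, and cancelling the potential energy terms (potential is real),
$$\frac{d}{dt} \langle \mathbf{r} \rangle = \frac{i\hbar}{2m} \left[ \int \psi^* \mathbf{r} \nabla^2 \psi\, d^3r - \int (\nabla^2 \psi^*)\, \mathbf{r} \psi\, d^3r \right] \tag{3.158}$$
Integrating by parts gives
$$\frac{d}{dt} \langle \mathbf{r} \rangle = \frac{1}{m} \int \psi^* (-i\hbar\nabla) \psi\, d^3r = \frac{\langle \mathbf{p} \rangle}{m} \tag{3.159}$$
which is analogous to the classical result that the momentum is the product of mass and velocity. Proceeding in a similar way, it can be shown that
$$\frac{d\langle \mathbf{p} \rangle}{dt} = -i\hbar \left[ \int \psi^* \nabla \frac{\partial\psi}{\partial t}\, d^3r + \int \frac{\partial\psi^*}{\partial t} \nabla\psi\, d^3r \right] \tag{3.160}$$
which on using the Schrödinger equation once again and integrating by parts, leads to
$$\frac{d\langle \mathbf{p} \rangle}{dt} = -\int \psi^* (\nabla V) \psi\, d^3r = \langle -\nabla V \rangle \tag{3.161}$$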
This relation is analogous to Newton’s second law in classical mechanics.
It is to be noted that if the uncertainties in the values of the various dynamical
quantities can be neglected, Eqs. (3.159) and (3.161) represent the classical
behaviour of particles in terms of approximate trajectories.

Example 2
The angular momentum operators provide an interesting illustration of the properties of hermitian operators. It follows from Eq. (3.131) or from Eqs. (3.136) and (3.137) that the angular momentum operators satisfy the commutation properties
$$[L_x, L^2] = [L_y, L^2] = [L_z, L^2] = 0 \tag{3.162}$$
but
$$[L_x, L_y] = i\hbar L_z$$
$$[L_y, L_z] = i\hbar L_x$$
$$[L_z, L_x] = i\hbar L_y \tag{3.163}$$
Thus, functions that are simultaneous eigenstates of $L^2$ and $L_x$, or $L^2$ and $L_y$, or $L^2$ and $L_z$ can exist, but not functions that are simultaneous eigenstates of $L_x$ and $L_y$, or $L_y$ and $L_z$, or $L_z$ and $L_x$. For example, $Y_l^m(\theta, \phi)$ are simultaneous eigenstates of $L^2$ and $L_z$ but not of $L_x$ or $L_y$ (except for l = 0). However, it is seen that $L_\pm = L_x \pm iL_y$ have the useful property
$$L_z L_\pm = L_\pm L_z \pm \hbar L_\pm \tag{3.164}$$
If this equation operates on $Y_l^m(\theta, \phi)$, then
$$L_z [L_\pm Y_l^m(\theta, \phi)] = (m \pm 1)\hbar\, [L_\pm Y_l^m(\theta, \phi)] \tag{3.165}$$
which means that operation by $L_\pm$ produces eigenstates with eigenvalues $(m \pm 1)\hbar$, or annihilates that state if states with eigenvalues $(m \pm 1)\hbar$ do not exist. Hence $L_\pm$ are called raising and lowering operators.
It may also be noted that for potentials which are functions of the radial
distance r only,
[H, L] = 0 (3.166)
so that it is possible to obtain eigenstates of H which are simultaneously
eigenstates of L2 and Lz, or L2 and Lx, or L2 and Ly.
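The commutation relations of Example 2 can be verified with explicit matrices. The sketch below works in units with ħ = 1, in the basis m = l, l − 1, ..., −l. It assumes the standard matrix element [l(l + 1) − m(m + 1)]^{1/2} for the raising operator, which is not derived in the text; all function names are hypothetical.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def angular_matrices(l):
    """Return (Lx, Ly, Lz, Lplus) for integer l, with hbar = 1."""
    ms = [l - i for i in range(2 * l + 1)]      # basis order m = l, ..., -l
    n = len(ms)
    Lz = [[complex(ms[i]) if i == j else 0j for j in range(n)] for i in range(n)]
    Lp = [[0j] * n for _ in range(n)]
    for j in range(1, n):
        m = ms[j]                                # L+ takes column m to row m + 1
        Lp[j - 1][j] = complex(math.sqrt(l * (l + 1) - m * (m + 1)))
    Lm = [[Lp[j][i].conjugate() for j in range(n)] for i in range(n)]
    Lx = [[(Lp[i][j] + Lm[i][j]) / 2 for j in range(n)] for i in range(n)]
    Ly = [[(Lp[i][j] - Lm[i][j]) / 2j for j in range(n)] for i in range(n)]
    return Lx, Ly, Lz, Lp
```

With these matrices, `commutator(Lx, Ly)` reproduces i Lz as in Eq. (3.163), and `commutator(Lz, Lp)` reproduces L+ itself, which is Eq. (3.164) for the upper sign.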

Example 3
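Consider the tunnelling of particles across a barrier potential of height V and width d. Imposing the conditions of continuity of the wave function in Eq. (3.94), and its derivative at x = 0 and x = d,
$$a_+ + a_- = b_+ + b_-$$
$$a_+ - a_- = \frac{i\alpha}{p} (b_+ - b_-)$$
$$b_+ e^{-\alpha d} + b_- e^{\alpha d} = c_+ e^{ipd}$$
$$b_+ e^{-\alpha d} - b_- e^{\alpha d} = -\frac{ip}{\alpha}\, c_+ e^{ipd} \tag{3.167}$$
Expressing $c_+$ in terms of $a_+$, we get
$$\frac{a_+}{c_+} = \frac{1}{4} e^{ipd} \left[ \left( 1 + \frac{i\alpha}{p} \right) \left( 1 - \frac{ip}{\alpha} \right) e^{\alpha d} + \left( 1 - \frac{i\alpha}{p} \right) \left( 1 + \frac{ip}{\alpha} \right) e^{-\alpha d} \right] \tag{3.168}$$
so that the transmission coefficient T is given by
$$\frac{1}{T} = \left| \frac{a_+}{c_+} \right|^2 = \frac{1}{16} \left| 2 \left( e^{\alpha d} + e^{-\alpha d} \right) + i \left( \frac{\alpha}{p} - \frac{p}{\alpha} \right) \left( e^{\alpha d} - e^{-\alpha d} \right) \right|^2 = 1 + \frac{1}{16} \left( \frac{\alpha}{p} + \frac{p}{\alpha} \right)^2 \left( e^{\alpha d} - e^{-\alpha d} \right)^2 \tag{3.169}$$
For a very broad barrier, i.e. for large d, this leads to
$$T \approx \frac{16\alpha^2 p^2}{(\alpha^2 + p^2)^2}\, e^{-2\alpha d} \tag{3.170}$$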
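The exact transmission coefficient of Eq. (3.169) and its broad-barrier limit, Eq. (3.170), can be compared numerically. This is a minimal sketch in which α and p are taken in the same units (inverse length), as the exponents in Eq. (3.167) suggest; the function names are hypothetical.

```python
import math

def transmission_exact(alpha, p, d):
    """Transmission coefficient from Eq. (3.169)."""
    s = math.exp(alpha * d) - math.exp(-alpha * d)
    return 1.0 / (1.0 + (alpha / p + p / alpha) ** 2 * s * s / 16.0)

def transmission_broad(alpha, p, d):
    """Broad-barrier approximation, Eq. (3.170)."""
    return (16.0 * alpha ** 2 * p ** 2 / (alpha ** 2 + p ** 2) ** 2
            * math.exp(-2.0 * alpha * d))
```

The two expressions agree closely once αd is a few units, while for d → 0 the exact expression tends to 1 and the approximation no longer applies.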

Example 4
Bohr’s correspondence principle states that a quantum system tends (in a
particular sense) to its classical analogue, for large quantum numbers. This
is demonstrated for a particle in a box.
For a particle in a one-dimensional box of length l, the probability of finding it in the region B ≤ x ≤ B + b (see Sec. 3.8) is
$$P_b = \int_B^{B+b} \frac{2}{l} \sin^2\left( \frac{n\pi}{l} x \right) dx = \frac{b}{l} - \frac{1}{2n\pi} \left[ \sin\left( \frac{2n\pi x}{l} \right) \right]_B^{B+b} \to \frac{b}{l} \quad \text{for } n \to \infty \tag{3.171}$$
which is the value expected for a classical system.
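The approach to the classical value b/l can be seen by evaluating Eq. (3.171) for increasing n. A minimal sketch follows; the function name and the default box length l = 1 are assumptions made for illustration.

```python
import math

def prob_in_region(n, B, b, l=1.0):
    """Probability of finding the particle in B <= x <= B + b, Eq. (3.171)."""
    k = 2.0 * n * math.pi / l
    return b / l - (math.sin(k * (B + b)) - math.sin(k * B)) / (2.0 * n * math.pi)
```

Since the oscillating term is bounded by 1/(nπ) in magnitude, the deviation from b/l shrinks at least as fast as 1/n.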

Example 5
As an application of perturbation theory, consider a particle of charge q, in the
presence of a constant electric field E, inside a 3- dimensional box of dimensions
(lx) × (ly) × (lz).
In the absence of the electric field, the wave functions and the energies are [see Eqs. (3.100) and (3.101)]
$$\phi_n(x, y, z) = \phi_{n_x}(x)\, \phi_{n_y}(y)\, \phi_{n_z}(z) \tag{3.172}$$
$$E_n = \frac{\hbar^2 \pi^2}{2m} \left( \frac{n_x^2}{l_x^2} + \frac{n_y^2}{l_y^2} + \frac{n_z^2}{l_z^2} \right), \quad n_x = 1, 2, \text{etc.} \tag{3.173}$$
where
$$\phi_{n_x}(x) = \left( \frac{2}{l_x} \right)^{1/2} \sin\left( \frac{n_x \pi x}{l_x} \right), \quad 0 \le x \le l_x \tag{3.174}$$
$$\phi_{n_x}(x) = 0 \quad \text{for } x < 0 \text{ or } x > l_x$$
and similar expressions for $\phi_{n_y}(y)$ and $\phi_{n_z}(z)$. If a weak constant electric field E is introduced, the additional potential energy is
$$V = -q\, \mathbf{E} \cdot \mathbf{r} \tag{3.175}$$
The change in the energy due to this term is given by perturbation theory (see Sec. 3.10) as
$$E - E_n \approx -q \int |\phi_n(x, y, z)|^2\, \mathbf{E} \cdot \mathbf{r}\, d\tau \tag{3.176}$$
$$= -\frac{1}{2} q\, (E_x l_x + E_y l_y + E_z l_z) \tag{3.177}$$

Thus, to the leading order, all the energy levels are shifted by the same
amount.

Example 6
For the 3-dimensional harmonic oscillator, the Hamiltonian
$$H = \frac{1}{2m} p^2 + \frac{1}{2} k r^2 \tag{3.178}$$
is the sum of 1-dimensional Hamiltonians in the three directions. Hence the wave function is a product of the three wave functions,
$$\psi(x, y, z) = \psi_{n_x}(x)\, \psi_{n_y}(y)\, \psi_{n_z}(z) \tag{3.179}$$
where $\psi_{n_x}(x)$, etc. are given in Eq. (3.111), and the energy is
$$E = \left( n_x + n_y + n_z + \frac{3}{2} \right) \hbar\omega \tag{3.180}$$
These solutions can be written in terms of spherical coordinates so as to exhibit the angular momentum content of the states. For example, the ground state is
$$\psi(r, \theta, \phi) = \frac{2\alpha^{3/2}}{\pi^{1/4}}\, Y_0^0(\theta, \phi) \exp(-\alpha^2 r^2 / 2), \quad E = \frac{3}{2} \hbar\omega \tag{3.181}$$
while the state with $n_x = 1$, $n_y = n_z = 0$ can be written as
$$\psi(r, \theta, \phi) = \frac{2\alpha^{5/2}}{\pi^{1/4}\, 3^{1/2}} \left[ Y_1^{-1}(\theta, \phi) - Y_1^1(\theta, \phi) \right] r \exp(-\alpha^2 r^2 / 2), \quad E = \frac{5}{2} \hbar\omega \tag{3.182}$$

PROBLEMS
1. A one-dimensional wave packet has the form ψ(x) = 0 for |x| > a, and
ψ(x) = (2a)–1/2 for |x| ≤ a. What is the wave function in the momentum
space? Demonstrate the uncertainty principle for this wave packet.
2. For a particle coming from the left with energy E, and a potential changing
from V (x) = 0 for x < 0 to V (x) = – V0 for x ≥ 0, obtain the transmission
and reflection coefficients T and R respectively. Show that R + T = 1.

3. Show that the frequency of radiation emitted when the particle inside a
one-dimensional box undergoes a transition from (n + 1) state to n state,
tends to the classical frequency of motion inside the box, for n → ∞. This
is another illustration of Bohr’s correspondence principle.
4. Consider a wave function Ae–r/a for the ground state of the hydrogen
atom, r being the separation between the electron and the proton. Determine
A, a, and the ground state energy. What are the classical and quantum
mechanical probabilities of finding the electron at a separation greater
than 2a?
5. If a three-dimensional harmonic oscillator has a solution of the form $A\, Y_1^0(\theta, \phi)\, r\, e^{-ar^2}$, determine a, A, and the energy in terms of mass and force-constant of the oscillator.
6. For a particle inside a one-dimensional square well defined by V (x) = 0 for |x| ≥ a and V(x) = –V0 for |x| < a, obtain a relationship between the binding energy, a, and V0. Show that for V0 → 0, there is a bound state with energy $E \to -2ma^2V_0^2/\hbar^2$ (this is a shallow bound state in the sense that E/V0 → 0 as V0 → 0).
7. For a particle in a one-dimensional box, obtain the standard deviations σ(x) and σ(p) for position and momentum respectively. Show that $\sigma(x)\,\sigma(p) = \hbar \left( \frac{n^2\pi^2}{12} - \frac{1}{2} \right)^{1/2}$, and that it is greater than $\hbar/2$.
8. For a particle in a one-dimensional potential well defined by V (x) = ∞ for x < 0, V (x) = –V0 for 0 ≤ x ≤ a and V (x) = 0 for x > a, obtain a relation between the binding energy, a, and V0. Show that these states are the same as the odd states in Problem 6.
9. Show that $p_z$ and $L_z$ are hermitian operators. Show that the commutator $[p_z, L_z] = 0$ and $[p_y, L_z] = i\hbar p_x$. Also show that the wave function in Problem 5 is an eigenstate of $L_z$ but not of $p_z$ or $p_x$. Calculate the expectation values of $L_z$, $p_z$, $p_x$ and $p^2$ for this wave function.
10. Obtain the first order perturbation to the energy of the ground state for a one-dimensional harmonic oscillator in the presence of a perturbing potential $\lambda x^4$.
11. Obtain the first order perturbations to the energy of the ground state and
the first excited states for a particle in a cubic box with the centre at the
origin and the edges parallel to the coordinate axes, due to a perturbing
potential λxy.
4
The One-Electron Atom

Structures of the Chapter


4.1 Solutions of the Schrödinger equation
4.2 Electron spin
4.3 Total angular momentum
4.4 Fine structure of one-electron atomic spectra
4.5 Hyperfine structure
4.6 Examples of one-electron atoms
4.7 Schrödinger equation for spin-1/2 particles
4.8 Dirac equation
4.9 Examples
Problems

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 101
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_4

In this chapter, the one-electron atom is analysed within the framework of wave mechanics. It is the ability of quantum mechanics to describe the detailed properties of the one-electron atom which has, more than anything else, established the essential validity of quantum mechanical ideas, at least as a calculational tool for describing small-distance phenomena.
The wave functions and the energy levels of the nonrelativistic one-electron atom are first obtained. The corrections due to spin-orbit interaction and other relativistic effects are then introduced perturbatively. Together, these results provide a very satisfactory description of the one-electron energy levels including the fine structure. Finally, the effect of the nuclear spin on the atomic energy levels is discussed and a brief introduction to the formal description of spin-1/2 particles is given.

4.1 SOLUTIONS OF THE SCHRÖDINGER EQUATION


The total energy of an electron and a nucleus of charge Ze, is
$$E = \frac{1}{2} m_e \dot{\mathbf{r}}_e^2 + \frac{1}{2} m_n \dot{\mathbf{r}}_n^2 - \frac{Ze^2}{4\pi\varepsilon_0 |\mathbf{r}_e - \mathbf{r}_n|} \tag{4.1}$$
In the centre of mass frame defined by Eq. (2.54), Eq. (4.1) has the form
$$E = \frac{p^2}{2m_r} - \frac{Ze^2}{4\pi\varepsilon_0 r} \tag{4.2}$$
where
$$\mathbf{r} = \mathbf{r}_e - \mathbf{r}_n \tag{4.3}$$
$$\mathbf{p} = m_r \dot{\mathbf{r}} \tag{4.4}$$
$$m_r = \frac{m_e m_n}{m_e + m_n} \tag{4.5}$$
The Schrödinger equation follows from Eq. (4.2). For states with well-defined energy E, one can write the wave function in the form
$$\psi(\mathbf{r}, t) = \phi(\mathbf{r}) \exp(-iEt/\hbar) \tag{4.6}$$
with φ (r) satisfying the time-independent Schrödinger equation
$$-\frac{\hbar^2}{2m_r} \nabla^2 \phi(\mathbf{r}) - \frac{Ze^2}{4\pi\varepsilon_0 r} \phi(\mathbf{r}) = E \phi(\mathbf{r}) \tag{4.7}$$
where the first term represents the kinetic energy. Because the potential is a function only of r, it is preferable to write the Laplacian operator in terms of spherical coordinates. It is particularly convenient to use the expression in Eq. (3.135) in terms of which the Schrödinger equation becomes
$$-\frac{\hbar^2}{2m_r r^2} \frac{\partial}{\partial r} r^2 \frac{\partial}{\partial r} \phi(\mathbf{r}) + \frac{L^2}{2m_r r^2} \phi(\mathbf{r}) - \frac{Ze^2}{4\pi\varepsilon_0 r} \phi(\mathbf{r}) = E \phi(\mathbf{r}) \tag{4.8}$$
with the angular momentum term $L^2$ given by Eq. (3.137).
The solutions to Eq. (4.8) in the factorizable form can be written as
$$\phi(\mathbf{r}) = R(r)\, Y(\theta, \phi) \tag{4.9}$$
Dividing Eq. (4.8) by φ (r) leads to
$$\frac{1}{R(r)} \left[ -\frac{\hbar^2}{2m_r r^2} \frac{d}{dr} r^2 \frac{d}{dr} - \frac{Ze^2}{4\pi\varepsilon_0 r} - E \right] R(r) = -\frac{1}{2m_r r^2\, Y(\theta, \phi)} L^2 Y(\theta, \phi) \tag{4.10}$$
which implies that, since the left hand side is independent of θ and φ, Y (θ, φ) must satisfy the eigenvalue equation
$$L^2 Y(\theta, \phi) = \lambda Y(\theta, \phi) \tag{4.11}$$
The solutions to this equation were discussed in Sec. 3.11, and are the spherical harmonics $Y_l^m(\theta, \phi)$ given in Eq. (3.149), which satisfy the equations:
$$L_z Y_l^m(\theta, \phi) = m\hbar\, Y_l^m(\theta, \phi), \quad m = 0, \pm 1, \ldots, \pm l$$
$$L^2 Y_l^m(\theta, \phi) = l(l + 1)\hbar^2\, Y_l^m(\theta, \phi), \quad l = 0, 1, \ldots \tag{4.12}$$
These solutions $Y_l^m(\theta, \phi)$ reduce the radial equation to
$$-\frac{\hbar^2}{2m_r} \left[ \frac{1}{r^2} \frac{d}{dr} r^2 \frac{d}{dr} - \frac{l(l + 1)}{r^2} \right] R(r) - \frac{Ze^2}{4\pi\varepsilon_0 r} R(r) = E R(r) \tag{4.13}$$
There are two important classes of solutions to the radial equation. It is
found that solutions exist for all positive values of E. They exhibit oscillatory
behaviour for r → ∞ and are not normalizable. These solutions can be used to
describe a beam of particles scattered by the Coulomb potential, and lead to
Rutherford scattering. The solutions which are of greater interest are the ones
for negative E which correspond to bound state solutions. The steps followed in
obtaining the negative energy solutions are as follows:
1. Obtain the asymptotic behaviour of R(r), which is finite for r → ∞. It is
$$R(r) \to \exp\left[ -(-2m_r E/\hbar^2)^{1/2}\, r \right] \tag{4.14}$$
2. Define
$$u(r) = R(r) \exp\left[ (-2m_r E/\hbar^2)^{1/2}\, r \right] \tag{4.15}$$
and consider a series solution for u (r),
$$u(r) = \sum_{k=s} b_k r^k, \quad k = s, s + 1, \ldots, \quad b_s \neq 0 \tag{4.16}$$

3. Impose the condition that the solution for u(r) does not alter the asymptotic
behaviour of R(r). This constraint on the asymptotic behaviour leads to
the result that the series in Eq. (4.16) must terminate for a finite value of
k, say k = s + p, p is an integer, i.e. bk = 0 for k > (s + p).
Substituting the above expressions in Eq. (4.13), and equating the coefficients of the same powers of r, in particular of $r^{k-2}$, gives
$$b_k\, \frac{\hbar^2}{2m_r} \left[ l(l + 1) - k(k + 1) \right] = b_{k-1} \left[ \frac{Ze^2}{4\pi\varepsilon_0} - \frac{\hbar^2 k}{m_r} \left( -\frac{2m_r E}{\hbar^2} \right)^{1/2} \right] \tag{4.17}$$
Since $b_{s-1} = 0$, we have s = l or s = – l – 1. For l ≠ 0, the s = – l – 1 solutions are not normalizable and hence are discarded. For l = 0, the first term in Eq. (4.16) for the s = –l –1 solution, is $b_{-1} r^{-1}$. However, Eq. (4.17) for k = 0 gives $b_{-1} = 0$ which is inconsistent. Hence, only the s = l solution need be considered. The requirement that the series terminates, i.e. $b_k = 0$ for k = l + p + 1, p ≥ 0, then leads to
$$\frac{m_r Z e^2}{4\pi\varepsilon_0 \hbar^2} = (l + p + 1) \left( -\frac{2m_r E}{\hbar^2} \right)^{1/2} \tag{4.18}$$
Thus the negative-energy solutions exist only for energies
$$E_n = -\frac{m_r}{2\hbar^2} \left( \frac{Ze^2}{4\pi\varepsilon_0} \right)^2 \frac{1}{n^2}, \quad n = 1, 2, \ldots \tag{4.19}$$
$$n = l + p + 1, \quad p = 0, 1, \ldots$$
with l + p being the highest power of r in the series solution for u(r). The wave functions corresponding to the solutions are related to Laguerre polynomials, and are given by
$$R_{n,l}(r) = \left\{ \left( \frac{2Z}{na_1} \right)^3 \frac{(n - l - 1)!}{2n\, [(n + l)!]^3} \right\}^{1/2} \rho^l e^{-\rho/2} L_{n+l}^{2l+1}(\rho) \tag{4.20}$$
$$\rho = \frac{2Zr}{na_1}, \quad a_1 = \frac{\hbar^2}{m_r} \left( \frac{4\pi\varepsilon_0}{e^2} \right)$$
where $a_1$ is the radius of the first Bohr orbit for Z = 1, and $L_{n+l}^{2l+1}$ are the associated Laguerre polynomials given by
$$L_j^i(\rho) = \frac{d^i}{d\rho^i} L_j(\rho) \tag{4.21}$$
$L_j(\rho)$ being the Laguerre polynomials,
$$L_j(\rho) = e^{\rho} \frac{d^j}{d\rho^j} \left( \rho^j e^{-\rho} \right) \tag{4.22}$$
Here $R_{n,l}(r)$ are normalized to satisfy the condition
$$\int [R_{n,l}(r)]^2\, r^2\, dr = 1 \tag{4.23}$$
Collecting the radial and angular parts, the solutions are
$$\phi_{n,l,m}(r, \theta, \phi) = R_{n,l}(r)\, Y_l^m(\theta, \phi) \tag{4.24}$$
with
$$E_n = -\frac{m_r}{2\hbar^2} \left( \frac{Ze^2}{4\pi\varepsilon_0} \right)^2 \frac{1}{n^2} \tag{4.25}$$
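The first four solutions are
$$\phi_{1,0,0}(\mathbf{r}) = \frac{1}{\pi^{1/2}} \left( \frac{Z}{a_1} \right)^{3/2} \exp(-Zr/a_1)$$
$$\phi_{2,0,0}(\mathbf{r}) = \frac{1}{(32\pi)^{1/2}} \left( \frac{Z}{a_1} \right)^{3/2} \left( 2 - \frac{Zr}{a_1} \right) \exp(-Zr/2a_1)$$
$$\phi_{2,1,0}(\mathbf{r}) = \frac{1}{(32\pi)^{1/2}} \left( \frac{Z}{a_1} \right)^{3/2} \frac{Zr}{a_1} \exp(-Zr/2a_1) \cos\theta \tag{4.26}$$
$$\phi_{2,1,1}(\mathbf{r}) = \frac{1}{(64\pi)^{1/2}} \left( \frac{Z}{a_1} \right)^{3/2} \frac{Zr}{a_1} \exp(-Zr/2a_1) \sin\theta \exp(i\phi)$$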
The wave functions of the one-electron atom are characterized by three quantum numbers. These are the principal quantum number n, the angular momentum quantum number l which determines the angular momentum, and the magnetic quantum number m which determines the z-component of the angular momentum. However, the energy $E_n$ depends on only the principal quantum number n. This is a special property of the attractive 1/r potential. Thus, there is the accidental degeneracy that there are several states with different l values but the same value of n, which have the same energy. It then follows that since for a given l, there are 2l + 1 states with m = 0, ± 1, ..., ± l, and for a given n there are n values of l, namely l = 0, 1, ..., n – 1, all of which have the same energy, the degeneracy of the states with energy $E_n$ is
$$\sum_{l=0}^{n-1} (2l + 1) = n^2 \tag{4.27}$$
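The energy formula of Eq. (4.19) and the degeneracy count of Eq. (4.27) can be evaluated together. The sketch below uses rounded SI values of the physical constants and takes the proton as the default nucleus; the constant values, function names and the `include_spin` flag are assumptions introduced for illustration.

```python
import math

HBAR = 1.054571817e-34       # J s
E_CHARGE = 1.602176634e-19   # C
EPS0 = 8.8541878128e-12      # F/m
M_E = 9.1093837015e-31       # kg, electron mass
M_P = 1.67262192369e-27      # kg, proton mass

def energy_eV(n, Z=1, m_nucleus=M_P):
    """Bound-state energy E_n of Eq. (4.19), converted to electron volts."""
    m_r = M_E * m_nucleus / (M_E + m_nucleus)        # reduced mass, Eq. (4.5)
    coulomb = Z * E_CHARGE ** 2 / (4.0 * math.pi * EPS0)
    return -m_r * coulomb ** 2 / (2.0 * HBAR ** 2 * n ** 2) / E_CHARGE

def degeneracy(n, include_spin=False):
    """Number of states with energy E_n, Eq. (4.27); doubled if spin is counted."""
    g = sum(2 * l + 1 for l in range(n))
    return 2 * g if include_spin else g
```

For hydrogen, `energy_eV(1)` is close to -13.6 eV, and `degeneracy(n)` equals n² for every n.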

Actually, the l-degeneracy gets multiplied by another factor of 2 due to the


fact that the electron has an intrinsic angular momentum called spin (this is
discussed later) which allows it to be in two independent spin states with the
same energy.
In the Schrödinger description of the atom, there are no particle trajectories.
The wave functions predict only the probability for finding the electron at various
distances from the nucleus. Even then, it may be expected that the average
values of radial distance have some correspondence with the distances of Bohr
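trajectories. Defining the average values as
$$\langle r^n \rangle = \int |\phi(\mathbf{r})|^2\, r^n\, d\tau \tag{4.28}$$
gives, after some involved calculations (for details see Ref. 14)
$$\langle r \rangle = \frac{a_1 n^2}{Z} \left\{ 1 + \frac{1}{2} \left[ 1 - \frac{l(l + 1)}{n^2} \right] \right\} \tag{4.29}$$
$$\left\langle \frac{1}{r} \right\rangle = \frac{Z}{a_1 n^2} \tag{4.30}$$
$$\left\langle \frac{1}{r^2} \right\rangle = \frac{Z^2}{a_1^2 n^3 \left( l + \frac{1}{2} \right)} \tag{4.31}$$
$$\left\langle \frac{1}{r^3} \right\rangle = \frac{Z^3}{a_1^3 n^3\, l \left( l + \frac{1}{2} \right) (l + 1)}, \quad \text{for } l > 0 \tag{4.32}$$
Only for 1/r is the average value the same as the corresponding value for the Bohr orbits.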
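The averages in Eqs. (4.29) to (4.31) can be checked by direct numerical integration of the radial parts of the wave functions in Eq. (4.26). The following sketch works in units where a1/Z = 1 and uses a simple midpoint rule; the function names and the integration parameters are assumptions chosen for illustration.

```python
import math

def radial_average(R, power, r_max=60.0, steps=20000):
    """<r^power> for an unnormalised radial function R(r), r in units of a1/Z."""
    h = r_max / steps
    num = den = 0.0
    for i in range(1, steps + 1):
        r = (i - 0.5) * h                 # midpoint of the i-th interval
        weight = R(r) ** 2 * r * r        # |R|^2 r^2 from the volume element
        num += weight * r ** power
        den += weight
    return num / den

def mean_r(n, l):
    """Eq. (4.29) in units of a1/Z."""
    return n * n * (1.0 + 0.5 * (1.0 - l * (l + 1) / n ** 2))

def mean_inv_r(n, l):
    """Eq. (4.30) in units of Z/a1."""
    return 1.0 / n ** 2

def mean_inv_r2(n, l):
    """Eq. (4.31) in units of (Z/a1)^2."""
    return 1.0 / (n ** 3 * (l + 0.5))
```

For example, with R(r) = r exp(−r/2) for the n = 2, l = 1 state, `radial_average(R, 1)` gives about 5, in agreement with `mean_r(2, 1)`.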
The energy levels of the one-electron atom, deduced from the Schrödinger
equation, are the same as those obtained from the Bohr model. However, it

should be appreciated that the results of the Bohr model follow from ad-hoc,
though interesting, assumptions, while those from the Schrödinger equation are
based on fundamental physical principles.
The solutions of the nonrelativistic Schrödinger equation for the one-electron
atoms provide a satisfactory basis for the understanding of the general features
of the energy spectra of these atoms. However, these spectra have a fine
structure, to understand which small relativistic corrections should be included
and additional properties for the electron proposed.

4.2 ELECTRON SPIN


The l-degeneracy of the energy levels of the one-electron atom is removed by
any additional interaction which is noncoulombic in form. Such an interaction
may be provided by the relativistic corrections to the Schrödinger equation, or
by the non-point structure of the nucleus. As a consequence, the energy levels
for a given n split into several closely-spaced levels corresponding to different
l values. This leads to multiplets of spectral lines which are described as the
fine structure of the spectrum.
There are some fine-structure lines which cannot be described in terms of
the l-multiplets. Examples of these are the alkali metal spectra which show
doublets of closely spaced lines. Prominent among these is the sodium yellow
line for the (n = 3, l = 1) transition to (n = 3, l = 0), which actually consists of two
closely-spaced lines of wavelengths 5890 Å and 5896 Å. To explain this splitting, Goudsmit and Uhlenbeck (1925) proposed that the electron has an intrinsic angular momentum called the spin (the origin of spin is not the spatial motion of the electron), and an associated magnetic moment. If S describes the spin of the electron, the associated magnetic moment is
$$\boldsymbol{\mu} = -b\, \mathbf{S} \tag{4.33}$$
where, since the electron has negative charge, it is assumed that b is a positive constant of proportionality. It is the interaction of this magnetic moment with the
magnetic field seen by the electron due to the motion of the nucleus around it,
that contributes to the fine structure of the atomic energy levels.
A definitive support to the hypothesis of spin is provided by the experiment
of Stern and Gerlach (1922). In this experiment, a beam of neutral silver atoms
was passed through an inhomogeneous magnetic field in the z direction
(Fig. 4.1). If the atom has a magnetic moment µ, it has a potential energy
$$U = -\boldsymbol{\mu} \cdot \mathbf{B} \tag{4.34}$$
in this field, and is subjected to a force in the z-direction,
$$F_z = \mu_z \frac{\partial B}{\partial z} \tag{4.35}$$

Fig. 4.1 Stern-Gerlach experiment to determine the spin components of the silver atom.

From the deflection of the beam on the screen, the value of µz can be
obtained. Classically, the magnetic dipole moments would be randomly oriented
which would give just a spreading of the beam. Stern and Gerlach observed,
however, a splitting of the beam into two discrete components, indicating the
existence of only two possible values of µz. Since the silver atom has one electron
in the outermost shell, this suggests that the electron spin and its magnetic moment
in Eq. (4.33) have only two possible values along a given direction. Furthermore, $\mu_z$ was found to have values
$$\mu_z = \pm \frac{e\hbar}{2m_e} \tag{4.36}$$
$e\hbar/2m_e$ is called the Bohr magneton for the electron ($e\hbar/2m_e \approx 9.274 \times 10^{-24}$ m² C s⁻¹).
It was observed in Eq. (3.152) that the possible eigenvalues of $L^2$ are $l(l + 1)\hbar^2$ and those of $L_z$ are $m\hbar$ where $m\hbar = l\hbar, (l - 1)\hbar, \ldots, -l\hbar$. If this property is assumed to be valid for the spin angular momentum as well, it follows that since $S_z$ has only two eigenvalues [see Eqs. (4.33), (4.36)], $S_z$ and $S^2$ have eigenvalues
$$S_z = m_s \hbar, \quad m_s = \pm \frac{1}{2} \tag{4.37}$$
$$S^2 = s(s + 1)\hbar^2, \quad s = \frac{1}{2} \tag{4.38}$$
It is now easy to deduce from Eqs. (4.37), (4.36) and (4.33) that
$$\boldsymbol{\mu} = -\frac{e}{m_e} \mathbf{S} \tag{4.39}$$

These relations introduce the new idea of intrinsic spin angular momentum
whose quantum numbers take half-integral values in contrast to the integral
values taken by the quantum numbers of angular momentum originating from
the spatial motion of particles. The relation in Eq. (4.39) also differs (by a factor of 2) from the relation
$$\boldsymbol{\mu} = -\frac{e}{2m_e} \mathbf{L} \tag{4.40}$$
expected for the magnetic moment of a negatively charged particle moving
around in a circle with angular momentum L. All in all, the spin of an electron,
with half-integral values for its quantum numbers, is a revolutionary idea with no
classical analogue. It has found strong support not only in the wealth of
experimental data it can explain, but also in the elegant formulation of the
linearized relativistic equation of Dirac (1928) describing a spin 1/2 particle.

4.3 TOTAL ANGULAR MOMENTUM


The total angular momentum consists of two parts, the orbital angular momentum
L and the spin angular momentum S. Designating the total angular momentum
by J,
J = L+S (4.41)
As in the case of the orbital angular momentum, quantum numbers $m_j$ and j can be associated with J, such that $J_z$ and $J^2$ have eigenvalues
$$J_z = m_j \hbar \tag{4.42}$$
$$J^2 = j(j + 1)\hbar^2 \tag{4.43}$$
It follows from Eq. (4.41), that
$$m_j = m_l + m_s \tag{4.44}$$
so that $m_j$ has integral values if $m_s$ has integral values, and half-integral values if $m_s$ has half-integral values ($m_l$ is used in place of m to have a more symmetric notation). For deducing the possible values for j, it is assumed that l ≥ s. It is also noted that the magnitude of J is not affected by the choice of the direction of L, i.e. the choice of $m_l$. Taking the largest possible value $m_l = l$ gives
$$m_j = l + m_s \tag{4.45}$$
Thus the largest and the smallest values of $m_j$ are l + s and l – s. This result, along with a similar analysis for l ≤ s, implies that the allowed values of j and $m_j$ are
$$|l - s| \le j \le l + s \tag{4.46}$$
$$m_j = j, j - 1, \ldots, -j \tag{4.47}$$
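The allowed values in Eqs. (4.46) and (4.47) are easy to enumerate with exact fractions. This is a minimal sketch; the function names are hypothetical, and j is assumed to increase in unit steps between |l − s| and l + s.

```python
from fractions import Fraction

def allowed_j(l, s):
    """Values of j with |l - s| <= j <= l + s in unit steps, Eq. (4.46)."""
    l, s = Fraction(l), Fraction(s)
    j = abs(l - s)
    values = []
    while j <= l + s:
        values.append(j)
        j += 1
    return values

def allowed_mj(j):
    """Values m_j = j, j - 1, ..., -j, Eq. (4.47)."""
    j = Fraction(j)
    return [j - k for k in range(int(2 * j) + 1)]
```

For l = 1 and s = 1/2 this gives j = 1/2 and j = 3/2, and the total number of (j, m_j) states equals (2l + 1)(2s + 1), as it must.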

It therefore follows from Eqs. (4.47) and (4.44), that j takes on integral
values if s takes integral values, and half-integral values if s takes half-integral
values.
An electron in an atom is characterized by the quantum numbers n, l, s and j. In spectroscopic notation, the state of such an electron is designated by
$$n\,^{2s+1}L_j \tag{4.48}$$
The superscript 2s + 1 gives the multiplicity of the state for l ≥ s, as can be deduced from Eq. (4.46). The subscript in the notation describes the total angular momentum. In place of L, a letter which conventionally denotes a particular orbital angular momentum is used, e.g. s, p, d, f, g and h for l = 0, 1, 2, 3, 4, and 5, respectively. These lower-case letters are used to describe the states of individual electrons. For the states of the atom, capital letters S, P, D, F, G, H are used instead.

Table 4.1 Spectroscopic letters for different l values, small case letters for
electron states and capital letters for atomic states

Values of l → 0 1 2 3 4 5
Letter symbol → s, S p, P d, D f, F g, G h, H

The ground state of sodium is described by an electron in the n = 3, l = 0, j = 1/2 state, and hence the atomic ground state may be described by $3\,^2S_{1/2}$. The excited electron in the n = 3, l = 1 state can have j = 3/2 or j = 1/2 and the two states would be designated by $3\,^2P_{3/2}$ and $3\,^2P_{1/2}$. If, as will be seen, these two states have different energies, one would observe a doublet of lines corresponding to transitions $3\,^2P_{3/2} \to 3\,^2S_{1/2}$ and $3\,^2P_{1/2} \to 3\,^2S_{1/2}$. This provides a basis for the explanation of the observed doublet of sodium lines with wavelengths 5890 Å and 5896 Å.

4.4 FINE STRUCTURE OF ONE-ELECTRON ATOMIC SPECTRA

In this section, the fine structure of the one-electron atomic spectra is considered.
The fine structure arises from the small corrections due to essentially relativistic
effects.
It was noted in Sec. 4.2 that an electron has an intrinsic magnetic moment $\boldsymbol{\mu} = -(e/m_e)\, \mathbf{S}$. This interacts with the magnetic field seen by the electron in its rest frame, due to the motion of the nucleus around it. Since the nucleus moving with speed v in a circle of radius r produces a circular current $I = Zev/2\pi r$, the magnetic field seen by the electron is
$$\mathbf{B} = \left( \frac{Ze}{4\pi\varepsilon_0 m_e c^2} \right) \frac{m_e\, \mathbf{r} \times \mathbf{v}}{r^3} \tag{4.49}$$
Therefore, the energy of the spin magnetic moment of the electron interacting with this field is
$$V_1' = -\boldsymbol{\mu} \cdot \mathbf{B} = \left( \frac{Ze^2}{4\pi\varepsilon_0 m_e^2 c^2} \right) \frac{\mathbf{S} \cdot \mathbf{L}}{r^3} \tag{4.50}$$
However, this is the energy seen in the frame of the electron which is being accelerated. It was shown by Thomas that the corresponding energy in the rest frame of the nucleus, is smaller by a factor of 1/2, so that the first correction to the nonrelativistic energy is (for details see Ref. 1)
$$V_1 = \left( \frac{Ze^2}{8\pi\varepsilon_0 m_e^2 c^2} \right) \frac{\mathbf{S} \cdot \mathbf{L}}{r^3} \tag{4.51}$$
It is easy to see that this term is smaller than the angular momentum term in Eq. (4.8) by an order of magnitude of $(Ze^2/4\pi\varepsilon_0 r)(1/m_e c^2)$, i.e. ratio of binding energy to rest energy. This illustrates that the spin-orbit interaction, given in Eq. (4.51), is a relativistic effect and gives corrections which are smaller than the nonrelativistic energies by an order of about $10^{-5}$. This correction is there only for the states with l ≠ 0.
The second correction is obtained from using the relativistic expression for the kinetic energy
$$T = (p^2 c^2 + m_e^2 c^4)^{1/2} - m_e c^2 \approx \frac{1}{2m_e} p^2 - \frac{1}{8m_e^3 c^2} p^4 + \ldots \tag{4.52}$$
Thus, the leading correction gives rise to an extra term for the energy,
$$V_2 = -\frac{1}{8m_e^3 c^2} p^4 \tag{4.53}$$
which is smaller than the kinetic energy by a factor of about $(p^2/2m_e)(1/2m_e c^2)$. Hence, this term also gives rise to corrections which are smaller by a factor of about $10^{-5}$ than the nonrelativistic energies. Being negative, it will lower the energy of all the states.
Finally, there is an additional correction which follows from the relativistic Dirac equation. It is called the Darwin term and has the form
$$V_3 = \frac{\pi Z e^2 \hbar^2}{8\pi\varepsilon_0 m_e^2 c^2}\, \delta(\mathbf{r}) \tag{4.54}$$
Because of the presence of the Dirac delta function δ (r), this term contributes only to the l = 0 states (δ (r) = 0 for r ≠ 0 but ∫δ (r) dτ = 1), since the wave functions of states with l ≠ 0 vanish at r = 0. This term is of the same order as the spin-orbit interaction, and therefore contributes corrections of the order of $10^{-5}$ compared to the leading terms.
Collecting all the corrections together, the additional energy is
V = V1 + V2 + V3

 Ze 2  S.L 1 4 πZ e2  2
=  2 2  3
− p + δ(r ) (4.55)
 8πε0 me c  r 8me3c 2 8πε0 me2 c 2
where the contribution of the first term is only to the l ≠ 0 terms. Since these
terms are small, their contribution to the energy levels can be evaluated by using
the first order perturbation theory described in Sec. 3.10 as
∆E n ≈ ∫φ n* Vφ ndτ (4.56)
For the evaluation of the contribution of V1 to ∆En, we note that

1
S.L = [(L + S)2 – L2 – S2] (4.57)
2
This, together with the value of 〈1/r3 〉 given in Eq. (4.32), leads to
j ( j + 1) − l (l + 1) − 3/ 4
〈V1〉 = Z2α2 |En| ,l≠0 (4.58)
nl (2l + 1) (l + 1)
= 0, for l = 0
2
( )
where α is the fine structure constant e / 4πε 0 c and has an approximate
value of (1/137). The contribution of V2 is obtained from
2
1  Ze 2 
〈 –p4/8me3c2 〉 = − 〈  H0 + 〉
2me c 2  4πε 0 r 

1  2  Ze 2  
2
Ze 2
= −  E + 2 E 〈 〉 + 〈  4πε r  〉  (4.59)
2me c 2 
n n
4πε 0 r  0 

Using relations (4.30) and (4.31) gives

Z 2 α 2 | En |  4n 
〈 V2 〉 = 2  3 − l + 1/ 2  (4.60)
4n  
The One-Electron Atom 113

Finally, the contribution of the Darwin term depends on the wave function
3/ 2
1  Z 
at the origin. Detailed analysis shows that ψ(0) = 1/ 2   δl 0 which then
π  a1n 

leads to

Z 2α 2 | En |
〈 V3 〉 = for l = 0
n

= 0 for l ≠ 0 (4.61)
These relations allow us to obtain ∆En,

Z 2 α 2 | En |  4n 
∆E n =  3 − j + 1/ 2  (4.62)
4n 2  

for the fine structure of the energy levels of the hydrogen atom.
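As a consistency check on the algebra above, the following minimal sketch adds the three contributions of Eqs. (4.58), (4.60) and (4.61) and compares the sum with Eq. (4.62), working in units of Z²α²|E_n| and using exact fractions. The function names are illustrative choices, not notation from the text.

```python
from fractions import Fraction as Fr

HALF = Fr(1, 2)

def v1(n, l, j):
    """Spin-orbit term, Eq. (4.58), in units of Z^2 alpha^2 |E_n|."""
    if l == 0:
        return Fr(0)
    return (j * (j + 1) - l * (l + 1) - Fr(3, 4)) / (n * l * (2 * l + 1) * (l + 1))

def v2(n, l):
    """Relativistic kinetic-energy term, Eq. (4.60), same units."""
    return Fr(1, 4 * n**2) * (3 - Fr(4 * n) / (l + HALF))

def v3(n, l):
    """Darwin term, Eq. (4.61), same units; nonzero only for l = 0."""
    return Fr(1, n) if l == 0 else Fr(0)

def fine_structure(n, j):
    """Total shift, Eq. (4.62), same units."""
    return Fr(1, 4 * n**2) * (3 - Fr(4 * n) / (j + HALF))

def levels(n):
    """Yield the allowed (l, j) pairs for principal quantum number n."""
    for l in range(n):
        for j in (l - HALF, l + HALF):
            if j > 0:
                yield l, j

for n in range(1, 5):
    for l, j in levels(n):
        assert v1(n, l, j) + v2(n, l) + v3(n, l) == fine_structure(n, j)

# n = 2: the j = 3/2 and j = 1/2 levels differ by 1/4 in these units.
print(fine_structure(2, Fr(3, 2)) - fine_structure(2, Fr(1, 2)))
```

For hydrogen (Z = 1, |E_2| ≈ 3.4 eV) the difference of 1/4 printed at the end corresponds to a splitting of about α²|E_2|/4, that is a few times 10⁻⁵ eV.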
The important properties of the fine structure given in Eq. (4.62) and
demonstrated in Fig. 4.2 are:
1. As expected, the fine structure corrections are smaller than $E_n$ by a factor
of about $\alpha^2/4 \sim 10^{-5}$. The hydrogen atom energies $E_n$ themselves may be
written as $E_n = -\alpha^2 mc^2/2n^2$, which means that they are smaller than the
rest energy by a factor of about $\alpha^2$. All the shifts in the energy levels due
to fine structure corrections are negative and the shift decreases as
j increases. Furthermore, the corrections decrease rapidly as n increases,
so that their effect is more easily noticeable for small-n states.
2. The fine structure corrections remove some of the degeneracy of the
energy levels En. The states with different j values now have different
energies. For a given n, the allowed j values range from 1/2 to (n –1/2) so
that each n level is now split into n levels.
3. Some degeneracy still survives. For a given n, the level with j = n – 1/2 is
nondegenerate but all the other levels have a degeneracy of order two
corresponding to l = j ± 1/2. For example, for n = 2, $2\,^2P_{3/2}$ is nondegenerate
but $2\,^2P_{1/2}$ and $2\,^2S_{1/2}$ are degenerate. Actually there is a small separation
between the $2\,^2P_{1/2}$ and $2\,^2S_{1/2}$ states also, known as the Lamb shift, which
can be satisfactorily explained in terms of quantum electrodynamics.

Fig. 4.2 Schematic diagram of the fine structure of the hydrogen levels (greatly exaggerated), showing some of the transitions allowed by the selection rules in Eq. (4.63).

4. Not all the transitions between the different levels are allowed. As will be
seen later, there are selection rules for the allowed transitions. For the
most prominent transitions, called the electric dipole transitions, the allowed
transitions satisfy the selection rules
∆l = ± 1
∆j = ± 1, 0, but not j = 0 → j = 0
∆mj = ± 1, 0 (4.63)
∆n = unrestricted
Thus, these transitions are allowed only between adjacent columns in
Fig. 4.2. For example, there are two lines (a doublet) corresponding to transitions
between 2P and 1S levels, two lines (a doublet) corresponding to transitions
between 3S and 2P levels and three lines (a triplet) corresponding to transitions
between 3D and 2P states.

4.5 HYPERFINE STRUCTURE


In the discussion so far, the nucleus of the atom was assumed to be a point
particle without any structure. However, this assumption is insufficient to explain
many experimental results, for example, the observation of hyperfine structure
of atomic levels using high resolution spectrographs. To explain these, Pauli
(1924) suggested that the nucleus also has an intrinsic angular momentum and
an associated magnetic moment. These properties have now been firmly
established by several experiments, and are essential elements in the description
of atoms and nuclei.
Let I be the spin of the nucleus, with eigenvalues
$$I_z = m_i\hbar, \qquad \mathbf{I}^2 = I(I+1)\hbar^2 \tag{4.64}$$
Associated with I is a magnetic moment µN,
$$\boldsymbol{\mu}_N = g\,\frac{e}{m_p}\,\mathbf{I} \tag{4.65}$$

where mp is the mass of the proton. Because the structure of the nucleus is
more complicated than that of an electron, the value of g is generally different
from 1, and is 2.79 for the proton. The nuclear magnetic moment is seen to be
smaller than the electron magnetic moment by a factor of about me/mp ~ 1/1000.
The atomic states are now designated by the total angular momentum F,
$$\mathbf{F} = \mathbf{J} + \mathbf{I} \tag{4.66}$$
with eigenvalues
$$F_z = (m_j + m_i)\hbar, \qquad \mathbf{F}^2 = F(F+1)\hbar^2, \quad |j - I| \le F \le j + I \tag{4.67}$$
This means that each level with a given j has a multiplicity of 2I + 1 if j > I
and a multiplicity of 2j + 1 if j ≤ I. The allowed electric dipole transitions are
found to satisfy the selection rules
$$\Delta l = \pm 1; \qquad \Delta F = \pm 1,\ 0\ \ \text{but not } F = 0 \to F = 0; \qquad \Delta m_F = \pm 1,\ 0 \tag{4.68}$$
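The counting rule for the total angular momentum F in Eq. (4.67) can be illustrated with a short sketch that lists the allowed F values for given j and I. The function name is an illustrative choice, and the nuclear spin I = 7/2 used for ¹³³Cs is a standard value that is not quoted in the text.

```python
from fractions import Fraction as Fr

def allowed_F(j, I):
    """Return the F values with |j - I| <= F <= j + I, in unit steps."""
    j, I = Fr(j), Fr(I)
    F = abs(j - I)
    values = []
    while F <= j + I:
        values.append(F)
        F += 1
    return values

# Hydrogen ground state: j = 1/2, proton spin I = 1/2.
print(allowed_F(Fr(1, 2), Fr(1, 2)))
# 133Cs ground state 6S_1/2: j = 1/2, I = 7/2 (assumed standard value).
print(allowed_F(Fr(1, 2), Fr(7, 2)))
```

The number of values returned is 2I + 1 when j > I and 2j + 1 when j ≤ I, in line with the multiplicity statement above.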
The nuclear magnetic moment interacts with the magnetic field created at
the nucleus by the electron. The magnetic field is due to (i) the orbital motion of
the electron around the nucleus, and (ii) the intrinsic magnetic moment of the
electron. The magnetic field at the nucleus, due to the orbital motion of the
electron can be deduced from Eq. (4.49) as
$$\mathbf{B} = -\frac{e}{4\pi\varepsilon_0 m_e c^2}\,\frac{\mathbf{L}}{r^3} \tag{4.69}$$
Therefore, the interaction energy due to this field is

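$$\langle V_{orb}\rangle = \frac{g e^2}{4\pi\varepsilon_0 m_e m_p c^2}\,\mathbf{I}\cdot\mathbf{L}\,\langle 1/r^3\rangle\ \ \text{for } l \neq 0; \qquad \langle V_{orb}\rangle = 0\ \ \text{for } l = 0 \tag{4.70}$$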
This is smaller than the fine structure terms by a factor of about me/mp
~ 1/1000.
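For calculating the field due to the intrinsic magnetic moment of the electron,
we note that the vector potential due to a magnetic dipole moment µ is
$$\mathbf{A} = \frac{1}{4\pi\varepsilon_0 c^2}\,\boldsymbol{\mu}\times\left(\frac{\mathbf{r}}{r^3}\right) \tag{4.71}$$
From this, the magnetic field comes out as
$$\mathbf{B} = \nabla\times\mathbf{A} = \frac{1}{4\pi\varepsilon_0 c^2}\left[\boldsymbol{\mu}\left(\nabla\cdot\frac{\mathbf{r}}{r^3}\right) - (\boldsymbol{\mu}\cdot\nabla)\frac{\mathbf{r}}{r^3}\right] \tag{4.72}$$
Therefore, the energy of the nuclear magnetic moment µN interacting with
this field is
$$V_{spin} = -\frac{1}{4\pi\varepsilon_0 c^2}\,\nabla\cdot\left[(\boldsymbol{\mu}_N\cdot\boldsymbol{\mu})\frac{\mathbf{r}}{r^3} - \left(\boldsymbol{\mu}_N\cdot\frac{\mathbf{r}}{r^3}\right)\boldsymbol{\mu}\right] \tag{4.73}$$
With this, the perturbative expression for the interaction energy comes out
to be
$$\langle V_{spin}\rangle = \frac{1}{4\pi\varepsilon_0 c^2}\left\langle\frac{\boldsymbol{\mu}_N\cdot\boldsymbol{\mu}}{r^3} - 3\,\frac{(\boldsymbol{\mu}_N\cdot\mathbf{r})(\boldsymbol{\mu}\cdot\mathbf{r})}{r^5}\right\rangle\ \ \text{for } l \neq 0 \tag{4.74}$$
For l = 0, the angular integration in Eq. (4.74) gives zero for r ≠ 0. For
obtaining the correct value of the contribution from r = 0, Gauss theorem is used
in the expectation value of the expression in Eq. (4.73), to get
$$\langle V_{spin}\rangle_{l=0} = -\frac{1}{4\pi\varepsilon_0 c^2}\,\frac{8\pi}{3}\,(\boldsymbol{\mu}_N\cdot\boldsymbol{\mu})\,|\psi(0)|^2 = \frac{2ge^2}{3\varepsilon_0 m_e m_p c^2}\,(\mathbf{I}\cdot\mathbf{S})\,|\psi(0)|^2 \tag{4.75}$$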

where ψ (0) is the wave function at the origin. In particular, using the wave
functions in Eq. (4.26), the hyperfine splitting between the F = 1 and F = 0 levels
of the ground state of the hydrogen atom is

$$E_1(F=1) - E_1(F=0) = \frac{16 m_e}{3 m_p}\,(g\alpha^2)\,|E_1| \tag{4.76}$$
Transition between these states leads to a spectral line with a frequency
$$\nu = \frac{E_1(F=1) - E_1(F=0)}{h} = 1.420\times 10^{9}\ \text{s}^{-1} \tag{4.77}$$
which corresponds to a wavelength of about 21.1 cm. This is the famous
21 cm line observed by the radio astronomers in the spectrum of interstellar
hydrogen.
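The numbers in Eqs. (4.76) and (4.77) can be reproduced with a few lines of arithmetic. The sketch below uses g = 2.79 from the text; the values of α, the proton-to-electron mass ratio, |E_1|, h and c are standard reference values supplied here as assumptions, and the variable names are illustrative.

```python
# Standard reference constants (assumed inputs, not quoted in the text).
ALPHA = 7.2973525693e-3        # fine structure constant
MP_OVER_ME = 1836.15267343     # proton-to-electron mass ratio
E1_EV = 13.605693              # |E_1| for hydrogen, in eV
H_EV_S = 4.135667696e-15       # Planck constant, in eV s
C_M_S = 2.99792458e8           # speed of light, in m/s
G_PROTON = 2.79                # g factor of the proton, as given in the text

def hyperfine_splitting_ev(g, nucleus_to_electron_mass):
    """Ground-state splitting E(F=1) - E(F=0) from Eq. (4.76), in eV."""
    return (16.0 / 3.0) * g * ALPHA**2 * E1_EV / nucleus_to_electron_mass

delta_e = hyperfine_splitting_ev(G_PROTON, MP_OVER_ME)
nu = delta_e / H_EV_S          # frequency in s^-1, Eq. (4.77)
wavelength = C_M_S / nu        # wavelength in m

print(f"splitting  = {delta_e:.3e} eV")
print(f"frequency  = {nu:.3e} s^-1")
print(f"wavelength = {100 * wavelength:.1f} cm")
```

With these inputs the frequency comes out close to 1.42 × 10⁹ s⁻¹ and the wavelength close to 21 cm, as stated above.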
The discussion of the hyperfine structure is concluded with the following
comments:
1. Nuclear spin introduces an additional multiplicity of atomic levels. For the
hydrogen atom, each energy level acquires an additional multiplicity of 2.
The magnetic moment associated with the nuclear spin introduces a
hyperfine splitting between the levels. These splittings are about 1000
times smaller than the fine structure splittings.
2. The hyperfine splitting may be observed in a high resolution spectrograph
fitted with accessories like Fabry-Perot etalons, as a hyperfine structure.
Transitions between hyperfine levels corresponding to microwaves may
be observed in nuclear magnetic resonances (discussed in Chapter 6).
They are also observed as stimulated emissions in a maser where the
population has been inverted (discussed in Chapter 6).
3. The hyperfine splitting can be measured to a very high accuracy in the
case of transition of the hydrogen atom in the ground state, between the
F = 1 and F = 0 levels and of the $^{133}$Cs atom in the $6S_{1/2}$ ground state,
between the F = 4 and F = 3 levels. These correspond to frequencies
$\nu = 1.4204057518\times 10^{9}\ \text{s}^{-1}$ and $\nu = 9.192631770\times 10^{9}\ \text{s}^{-1}$, respectively
and are used as time standards of atomic clocks.
4. The $\nu = 1.420\times 10^{9}\ \text{s}^{-1}$ frequency radiation (known as the 21 cm hydrogen
line) corresponding to the hyperfine transition between the ground state
levels with F = 1 and F = 0, is used to study the distribution and motion (in
terms of Doppler shift) of the hydrogen in interstellar and intergalactic
space.

4.6 EXAMPLES OF ONE-ELECTRON ATOMS


Some examples of one electron atoms are now considered and their special
properties discussed.
Hydrogen
For hydrogen, Z = 1. The l-degeneracy is removed by the relativistic
corrections producing fine structure (Fig. 4.2). The j-degeneracy is removed by
quantum electrodynamic effects (Lamb shift). The spin of the proton doubles
the number of states, while the interaction of the magnetic moment of the proton
with the magnetic field produced by the electron, produces hyperfine structure.
The microwave radiation from transition between the two hyperfine levels of
the ground state, is especially important in producing maser action, in the
investigation of interstellar and extragalactic hydrogen, and as a time standard
in atomic clocks.
Heavier isotopes of hydrogen, such as deuterium and tritium, have spectra
very similar to the hydrogen spectrum, except for small differences due to slightly
different reduced masses (the energy levels are lower for larger reduced
masses). However, the hyperfine structure will be quite different since the spin
and the magnetic moment of their nuclei are quite different from those of the
proton.
Atoms with Z > 1
The singly ionized He (Z = 2) atom and doubly ionized Li (Z = 3) atom
have energy levels which are larger by a factor of $Z^2$. Therefore, their energy
levels with principal quantum numbers n′ = nZ will be similar to those of the
hydrogen atom with principal quantum number n, except for small differences
due to different reduced masses. The fine structures will be significantly
larger, since they vary as $Z^4$.
Positronium
Positronium is a bound state of an electron and a positron, which has the
same mass as an electron and a charge equal in magnitude but opposite in sign (see
Sec. 4.8). It is formed when a beam of positrons is stopped by a gas.
Since the reduced mass for the positronium is $m_e/2$, its energies will be
nearly half those of the hydrogen atom. However, the magnetic moment of a
positron is equal in magnitude to that of the electron, and hence much greater
than that of the proton. Therefore, the hyperfine interaction is much larger for
the positronium. In particular, the separation between the F = 1 and F = 0 levels
corresponds to a frequency of $\nu = 2.034\times 10^{11}\ \text{s}^{-1}$. The electron and the positron
in the positronium annihilate each other, emitting two photons in the F = 0 state
and three photons in the F = 1 state. The lifetimes of the positronium in these
two states are different: $1.25\times 10^{-10}$ s for the F = 0 state and $1.4\times 10^{-7}$ s for the
F = 1 state.
Muonium
Muonium is a bound state of an electron and a µ+ meson (µ meson or muon
and µ+ meson are similar to an electron and a positron respectively, except that
they are heavier, their mass being about 206.84 me). They are produced when a
beam of µ+ is stopped by a gas. Their energy levels are similar to those of the
hydrogen atom except for a small difference due to the difference in the reduced
mass. The hyperfine energy levels of the muonium are of importance since they
can be calculated precisely, and serve as a test of the theory.
Muonic Helium
Muonic helium is formed by replacing one of the electrons in a helium atom
by a muon. Since the Bohr radius of the muon is smaller by a factor of about 207
than that of the electron, the electron essentially sees a nucleus of charge 2|e|
with a muon moving around close to the nucleus. Therefore, the energy levels
are similar to those of the hydrogen atom. However, the hyperfine splitting is
due to the electron magnetic moment interacting with the muon magnetic moment.
Using Eq. (4.76) as a first approximation but taking g = 1 and replacing $m_p$ by
$m_\mu$, it is found that the hyperfine splitting for muonic helium corresponds to a
frequency of $\nu = 4.515\times 10^{9}\ \text{s}^{-1}$, close to the experimental value of
$4.465\times 10^{9}\ \text{s}^{-1}$.
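The estimate for muonic helium follows from Eq. (4.76) by the substitution described above. The sketch below takes g = 1 and the muon mass 206.84 m_e from the text; α, |E_1| and h are standard reference values supplied as assumptions, and the names are illustrative.

```python
# Standard reference constants (assumed inputs, not quoted in the text).
ALPHA = 7.2973525693e-3      # fine structure constant
E1_EV = 13.605693            # |E_1| for hydrogen, in eV
H_EV_S = 4.135667696e-15     # Planck constant, in eV s

MUON_MASS_RATIO = 206.84     # m_mu / m_e, as given in the text
G_MUON = 1.0                 # g = 1, as prescribed in the text

# Eq. (4.76) with the proton mass replaced by the muon mass.
delta_e_ev = (16.0 / 3.0) * G_MUON * ALPHA**2 * E1_EV / MUON_MASS_RATIO
nu_muonic_helium = delta_e_ev / H_EV_S

print(f"estimated frequency = {nu_muonic_helium:.3e} s^-1")
```

The result lies within a fraction of a percent of the value 4.515 × 10⁹ s⁻¹ quoted above; the small residual depends on the constants chosen.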
Muonic atoms are very useful for probing the structure of nuclei since the
Bohr radius of the muon is quite small, and therefore the probability of finding
the muon inside the nucleus may be quite substantial.
Rydberg Atoms
When an electron in an atom is in a state with a sufficiently large principal
quantum number n, it is influenced mainly by the net positive charge of the ionic
core and not by its distribution. These excited states of atoms are similar to
those of a hydrogen atom. They are termed Rydberg states and the atoms are
called Rydberg atoms. It is the advent of tunable lasers (see Sec. 6.5) that has
helped to excite and investigate the Rydberg states. They are of interest for the
following reasons:
1. The departures of the energy spectrum of Rydberg atoms, from the
hydrogenic spectrum, provide useful information about the structure of
the ionic core and the interaction between the core and the valence electron.
2. Their behaviour in the presence of external fields can be understood by
direct extensions of the analysis for the hydrogen atom. It may also be
noted that because of the large size of Rydberg atoms, the effects of the
external fields are greatly enhanced.

4.7 SCHRÖDINGER EQUATION FOR SPIN 1/2 PARTICLES


It is clear from the previous discussion that a function which only depends on
spatial coordinates, cannot adequately describe the spin and magnetic moment
of an electron. An additional variable that describes the spin state of an electron
or more generally, any particle with spin 1/2, has to be introduced. Furthermore,
one would like the associated magnetic moment of $-(e/m)\mathbf{S}$ to be introduced
in a more natural way.

The spin eigenstates of $S_z$ with eigenvalues $\pm\frac{1}{2}\hbar$ are designated by α and
β, so that
$$S_z\alpha = \frac{1}{2}\hbar\,\alpha, \qquad S_z\beta = -\frac{1}{2}\hbar\,\beta \tag{4.78}$$
The spin operator S has the properties,
$$\mathbf{S}^2 = s(s+1)\hbar^2 = \frac{3}{4}\hbar^2 \tag{4.79}$$
$$S_x^2 = S_y^2 = S_z^2 = \frac{1}{4}\hbar^2 \tag{4.80}$$
Furthermore, the raising and lowering operators S± defined as
$$S_\pm = S_x \pm iS_y \tag{4.81}$$
satisfy the properties [see Eq. (3.165)]
$$S_+\beta = b_1\hbar\,\alpha, \quad S_+\alpha = 0, \qquad S_-\alpha = b_2\hbar\,\beta, \quad S_-\beta = 0 \tag{4.82}$$
where b1 and b2 are constants which may be taken to be 1 (see Example 5).
The operation of S is then determined by Eqs. (4.78) and (4.82). It follows from
Eq. (4.82) that
$$S_+^2 = S_-^2 = 0 \tag{4.83}$$
and also
$$S_xS_y + S_yS_x = -\frac{i}{2}\left(S_+^2 - S_-^2\right) = 0 \tag{4.84}$$
Together with the commutation relation $S_xS_y - S_yS_x = i\hbar S_z$ satisfied by all
angular momentum operators [see Eq. (3.163)], this implies
$$S_xS_y = -S_yS_x = \frac{i}{2}\hbar S_z \tag{4.85}$$
By symmetry,
$$S_yS_z = -S_zS_y = \frac{i}{2}\hbar S_x, \qquad S_zS_x = -S_xS_z = \frac{i}{2}\hbar S_y \tag{4.86}$$
For writing down the Schrödinger equation for a free, spin 1/2 particle of
mass m, it is proposed that the kinetic energy be written as
$$E = \frac{2}{m\hbar^2}\,(\mathbf{p}\cdot\mathbf{S})(\mathbf{p}\cdot\mathbf{S}) \tag{4.87}$$
This expression is equivalent to $p^2/2m$ for the free particle, as can be
shown by using Eqs. (4.80), (4.85) and (4.86). On using the operator expressions
for E and p, it leads to the Schrödinger equation for a free particle,
$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{2}{m}\,(\nabla\cdot\mathbf{S})(\nabla\cdot\mathbf{S})\,\psi \tag{4.88}$$
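The equivalence of the kinetic energy in Eq. (4.87) to p²/2m can be spelled out in one line. The following is a sketch of that step, assuming only that the components of p commute with each other and with S, and using the spin relations of Eqs. (4.80), (4.85) and (4.86):

```latex
\begin{aligned}
(\mathbf{p}\cdot\mathbf{S})(\mathbf{p}\cdot\mathbf{S})
  &= \sum_{i} p_i^2\, S_i^2 \;+\; \sum_{i<j} p_i p_j \,\bigl(S_i S_j + S_j S_i\bigr) \\
  &= \frac{\hbar^2}{4}\,\bigl(p_x^2 + p_y^2 + p_z^2\bigr) + 0
   = \frac{\hbar^2}{4}\,p^2 ,
\qquad\text{so that}\qquad
\frac{2}{m\hbar^2}\,(\mathbf{p}\cdot\mathbf{S})^2 = \frac{p^2}{2m}.
\end{aligned}
```

The cross terms vanish because the spin components anticommute in pairs, which is the content of Eqs. (4.85) and (4.86).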
The interaction with the electrostatic potential φ, can be introduced by the
prescription that
E → E – qφ (4.89)

where q is the charge of the particle. However, since both $(\mathbf{p},\ E_{tot}/c)$ and
$(\mathbf{A},\ \phi/c)$ transform as relativistic 4-vectors, requirements of relativistic
covariance imply that the prescription in Eq. (4.89) should be accompanied by
the replacement
p → p – qA (4.90)
The fact that only the kinetic energy appears in Eq. (4.87) does not alter the
essential results since the addition of a constant to the energy only redefines the

zero of the energy. Equations (4.89) and (4.90) introduce what is known as the
minimal electromagnetic interaction. With these prescriptions, the Schrödinger
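equation for a spin 1/2 particle in the presence of electromagnetic fields, comes
out as
$$i\hbar\frac{\partial\psi}{\partial t} = \frac{2}{m\hbar^2}\left[(-i\hbar\nabla - q\mathbf{A})\cdot\mathbf{S}\right]^2\psi + q\phi\,\psi \tag{4.91}$$
where
$$\psi = \psi_1(\mathbf{r}, t)\,\alpha + \psi_2(\mathbf{r}, t)\,\beta \tag{4.92}$$
Using Eqs. (4.80), (4.85) and (4.86), this reduces to
$$i\hbar\frac{\partial\psi}{\partial t} = \frac{1}{2m}\,(-i\hbar\nabla - q\mathbf{A})^2\,\psi - \frac{q}{m}\,\mathbf{S}\cdot(\nabla\times\mathbf{A} + \mathbf{A}\times\nabla)\,\psi + q\phi\,\psi \tag{4.93}$$
Finally, noting that ∇ in ∇ × A operates on A as well as on ψ, and writing
V for qφ, gives
$$i\hbar\frac{\partial\psi}{\partial t} = \frac{1}{2m}\,(-i\hbar\nabla - q\mathbf{A})^2\,\psi - \frac{q}{m}\,\mathbf{S}\cdot\mathbf{B}\,\psi + V\psi \tag{4.94}$$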
This expression for the energy contains a term which corresponds to the
interaction of a particle with magnetic moment
$$\boldsymbol{\mu} = \frac{q}{m}\,\mathbf{S} \tag{4.95}$$
with the external magnetic field B. Thus, the particle which satisfies Eq. (4.91),
has an intrinsic spin S and an associated magnetic moment given by Eq. (4.95).
For the electron q = –|e|. These results are in conformity with the experimental
observations discussed in Sec. 4.2.

4.8 DIRAC EQUATION


The spin and the magnetic properties of an electron can be discussed in terms
of the Schrödinger equation given in Eq. (4.91). However, it has been pointed
out earlier that the fine structure is essentially of relativistic origin. Therefore, a
relativistic description of the spin 1/2 particle has to be considered to provide a
satisfactory explanation of the fine structure.
The relativistic equation for a spin 1/2 particle of mass m may be obtained
from the relation

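$$E^2 = \frac{4c^2}{\hbar^2}\,(\mathbf{p}\cdot\mathbf{S})(\mathbf{p}\cdot\mathbf{S}) + m^2c^4 \tag{4.96}$$
which is equivalent to $E^2 = p^2c^2 + m^2c^4$. On taking the momentum term to the
left hand side and factorizing, it leads to the equation
$$\left(i\hbar\frac{\partial}{\partial t} - 2ic\,\nabla\cdot\mathbf{S}\right)\left(i\hbar\frac{\partial}{\partial t} + 2ic\,\nabla\cdot\mathbf{S}\right)\psi = m^2c^4\,\psi \tag{4.97}$$
where ψ has the form given in Eq. (4.92). It can be linearized by defining
$$\left(i\hbar\frac{\partial}{\partial t} + 2ic\,\nabla\cdot\mathbf{S}\right)\psi = mc^2\,\chi \tag{4.98}$$
substitution of which in Eq. (4.97) leads to
$$\left(i\hbar\frac{\partial}{\partial t} - 2ic\,\nabla\cdot\mathbf{S}\right)\chi = mc^2\,\psi \tag{4.99}$$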
Equations (4.98) and (4.99) together are equivalent to the Dirac equation
(1928) for a spin 1/2 particle.
The free particle solutions can be written by noting that
$$\psi = (b_1\alpha + b_2\beta)\,\exp[-i(Et - \mathbf{k}\cdot\mathbf{r})/\hbar] \tag{4.100}$$
satisfies Eq. (4.97) provided
$$E^2 = k^2c^2 + m^2c^4 \tag{4.101}$$
The corresponding χ is obtained from Eq. (4.98), as
$$\chi = \frac{1}{mc^2}\left(E - \frac{2c}{\hbar}\,\mathbf{k}\cdot\mathbf{S}\right)(b_1\alpha + b_2\beta)\,\exp[-i(Et - \mathbf{k}\cdot\mathbf{r})/\hbar] \tag{4.102}$$
The most striking property of these solutions is that negative energy solutions
with $E = -(k^2c^2 + m^2c^4)^{1/2}$ are allowed in addition to the usual positive energy
solutions with $E = (k^2c^2 + m^2c^4)^{1/2}$.
Further discussion of the Dirac equation is not within the scope of this book.
We will be content with making a few remarks.
1. The problem of one-electron atoms can be considered by the replacement

$$i\hbar\frac{\partial}{\partial t} \to i\hbar\frac{\partial}{\partial t} + \frac{Ze^2}{4\pi\varepsilon_0 r} \tag{4.103}$$

in Eqs. (4.98) and (4.99). The various fine structure terms can then be
deduced by carrying out suitable expansions.
2. The existence of negative energy states creates some complications. Since
no negative energy particles are observed in nature, how are possible
transitions to negative energy states explained? Dirac overcame this
difficulty by postulating that vacuum consists of a sea of electrons which
fill all the negative energy levels. Hence, transitions to negative energy
states are forbidden by the Fermi-Dirac statistics (see Chapter 7) according
to which no two identical particles with half integral spin can occupy the
same state [Fig. 4.3 (a)].

Fig. 4.3 (a) Forbidden transition to a filled negative energy state, and (b) allowed transition leading to annihilation of an electron and a hole with the emission of two photons. Filled circles are electrons and the open circle is a hole or a vacancy. The levels marked in the figure are E = mc², E = 0 and E = –mc².

3. The Dirac hypothesis of the negative energy sea suggests that if enough
energy is provided, a negative energy electron may become a positive
energy electron. The energy of vacuum can be written as

$$E_v = \sum_i (-E_i) \tag{4.104}$$
where the summation is over all the negative energy states (ignoring the
complication of the infinite sum). Suppose two photons with energies $h\nu_1$
and $h\nu_2$, come together and give all their energy to an electron with energy
$-E_l$, which now has a positive energy $E_n$. Then, by energy conservation
$$h\nu_1 + h\nu_2 + \sum_i (-E_i) = E_n + \sum_{i\neq l} (-E_i) = E_n + E_l + \sum_i (-E_i) \tag{4.105}$$
The final state consists of an electron with energy En and another particle
with energy El corresponding to the hole in the negative energy sea. A

similar analysis for charge conservation shows that the final state consists
of a negatively charged electron (with energy En) and a positively charged
particle (corresponding to the hole in the negative-charge sea). The hole
therefore has properties exactly opposite to those of the vacant negative
energy electron state, i.e. it has positive energy and positive charge (also
opposite momentum and spin). This hole state is called the positron, which
is an example of what are known as antiparticles. The overall process is
equivalent to two photons annihilating each other to produce an electron
and a positron. The process is known as pair creation. Similarly, pair
annihilation occurs when a positive energy electron drops into a vacancy
in the negative energy sea, giving out radiation [Fig. 4.3 (b)]. Pair creation
and annihilation are important processes in particle physics, though the
associated particles may not always be two photons or electrons.
4. The presence of an external charge polarizes the sea of negative charges
thus reducing the effective charge of the external particle. This is called
vacuum polarization. As a result, an s-wave electron in the hydrogen
atom, which is ‘nearer’ to the proton than a p-wave electron, sees a greater
charge for the proton. Hence, the vacuum polarization lowers the s-wave
levels compared to the p-wave levels. This contributes to the removal of
j-degeneracy, in particular, the degeneracy between 2p1/2 and 2s1/2 states.
However, there are additional contributions to the separation of these energy
levels, called the Lamb shift, from other effects such as self interaction, field
fluctuations, etc., which can be treated within the framework of quantum
electrodynamics. The predictions of the theory for the Lamb shift are in
excellent agreement with the experimental observations.

4.9 EXAMPLES
A few examples to illustrate the properties of the one-electron atoms are discussed
here.

Example 1
Though the solutions to the radial equation, Eq. (4.13), are in general complicated,
the solutions for l = n – 1, are fairly simple.
Consider a solution of the form
$$R(r) = r^{n-1}\exp\left[-(-2m_rE/\hbar^2)^{1/2}\,r\right] \tag{4.106}$$
substitution of which in Eq. (4.13) gives the relation
$$\frac{\hbar^2}{2m_r}\left[2n\,(-2m_rE/\hbar^2)^{1/2}\,\frac{1}{r} - \frac{n(n-1)}{r^2} + \frac{l(l+1)}{r^2}\right] - \frac{Ze^2}{4\pi\varepsilon_0 r} = 0 \tag{4.107}$$
This relation can be satisfied if
$$l = n - 1, \qquad E = -\frac{m_r}{2\hbar^2}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)^2\left(\frac{1}{n^2}\right) \tag{4.108}$$
The corresponding solution, normalized according to Eq. (4.23), is
$$R_{n,n-1} = \left(\frac{1}{(2n)!}\right)^{1/2}\left(\frac{2Z}{a_1 n}\right)^{n+1/2} r^{n-1}\exp(-rZ/a_1n) \tag{4.109}$$
where a1 is the radius of the first Bohr orbit with Z = 1.

Example 2
The scaling properties of Eq. (4.13) provide a useful insight into the solutions.
Consider a transformation
r → λr (4.110)
which takes Eq. (4.13) to the form
$$\frac{\hbar^2}{2m_r}\left[-\frac{1}{r^2}\frac{d}{dr}\,r^2\frac{d}{dr} + \frac{l(l+1)}{r^2}\right]R(\lambda r) - \frac{(\lambda Z)e^2}{4\pi\varepsilon_0 r}\,R(\lambda r) = \lambda^2E\,R(\lambda r) \tag{4.111}$$
Taking λ = 1/Z, the equation for Z = 1 is obtained. Hence,
$$R(r, 1) = \frac{1}{Z^{3/2}}\,R(r/Z, Z), \qquad E(Z=1) = \frac{1}{Z^2}\,E(Z) \tag{4.112}$$
where Z is shown as an additional variable. The factor of $Z^{-3/2}$ in the first relation
is due to normalization. Thus, we can obtain the solutions for Eq. (4.13) in terms
of solutions for Z = 1. It also follows that
$$\langle r^n\rangle_Z = \frac{1}{Z^n}\,\langle r^n\rangle_{Z=1} \tag{4.113}$$


Example 3
A beam of sodium atoms with velocity $10^3$ m/s, moves along the magnetic poles
over a distance of 0.15 m, in the x-direction. We determine the separation between
the two components of the beam, at a distance of 0.6 m from the magnet, given
that the magnetic induction between the poles varies in the z-direction as
$$B = (1 - 100z)\ \text{Wb/m}^2 \tag{4.114}$$
The force on the atoms is
$$F_z = \pm\frac{e\hbar}{2m_e}\,(100) = \pm 9.28\times 10^{-22}\ \text{N} \tag{4.115}$$
If $t_1$ is the time taken by the atoms to traverse the poles and $t_2$ is the time
taken to move from the magnets to the plane of observation, the separation
between the two components of the beam is
$$\Delta z = 2\left[\frac{1}{2}\left(\frac{|F_z|}{m_n}\right)t_1^2 + \left(\frac{|F_z|}{m_n}\right)t_1t_2\right] \tag{4.116}$$
where $m_n$ denotes the mass of a sodium atom. Since $t_1 \approx 1.5\times 10^{-4}$ s and $t_2 \approx 6.0\times 10^{-4}$ s, one gets
$$\Delta z = 4.9\times 10^{-3}\ \text{m} \tag{4.117}$$
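The arithmetic of this example can be checked with a short script. The sketch below takes the beam speed, the two distances and the field gradient from the problem statement; the Bohr magneton, the atomic mass unit and the sodium mass of 22.99 u are standard values supplied as assumptions, and the variable names are illustrative.

```python
# Standard reference values (assumed inputs, not quoted in the text).
MU_B = 9.2740100783e-24            # Bohr magneton e*hbar/(2 m_e), in J/T
ATOMIC_MASS_UNIT = 1.66053906660e-27  # in kg
M_SODIUM = 22.99 * ATOMIC_MASS_UNIT   # mass of a sodium atom, in kg

# Data of the example.
FIELD_GRADIENT = 100.0   # |dB/dz| in T/m, from Eq. (4.114)
SPEED = 1.0e3            # beam speed, in m/s
POLE_LENGTH = 0.15       # path length between the poles, in m
DRIFT_LENGTH = 0.6       # distance from the magnet to the screen, in m

force = MU_B * FIELD_GRADIENT             # |F_z|, Eq. (4.115)
t1 = POLE_LENGTH / SPEED                  # time spent between the poles
t2 = DRIFT_LENGTH / SPEED                 # free-flight time after the magnet
accel = force / M_SODIUM                  # transverse acceleration

# Eq. (4.116): deflection inside the magnet plus drift at constant v_z,
# doubled because the two components are deflected in opposite directions.
separation = 2.0 * (0.5 * accel * t1**2 + accel * t1 * t2)

print(f"|F_z|      = {force:.3e} N")
print(f"separation = {separation:.2e} m")
```

The separation comes out close to 4.9 × 10⁻³ m, in agreement with Eq. (4.117); most of it arises during the free flight after the magnet, since t₂ is four times t₁.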

Example 4
Some general properties of fine structure lines are now enumerated.
1. Transitions from the level with principal quantum number n > 1, to the
ground state give rise to doublets corresponding to
$$np_{1/2} \to 1s_{1/2}, \qquad np_{3/2} \to 1s_{1/2} \tag{4.118}$$
2. Transitions from the level with principal quantum number n > 2 to the level
with n = 2, give rise to seven lines corresponding to
$$np_{1/2,\,3/2} \to 2s_{1/2}; \qquad ns_{1/2} \to 2p_{1/2,\,3/2}; \qquad nd_{3/2} \to 2p_{1/2,\,3/2}; \qquad nd_{5/2} \to 2p_{3/2} \tag{4.119}$$
3. If n0 ≥ 2, where n0 + 1 is the principal quantum number of the final state,
every increase in n0 increases the number of fine structure lines by 6.
These additional lines correspond to
(n, l = n0 + 1, j = n0 + 3/2) → (n0 + 1, l = n0, j = n0 + 1/2)
(n, l = n0 + 1, j = n0 + 1/2) → (n0 + 1, l = n0, j = n0 ± 1/2) (4.120)
(n, l = n0 – 1, j = n0 – 1/2) → (n0 + 1, l = n0, j = n0 ± 1/2)
(n, l = n0 – 1, j = n0 – 3/2) → (n0 + 1, l = n0, j = n0 – 1/2) (4.121)
where n0 ≥ 2, and n > n0 + 1. Hence, the number of fine-structure lines for
transitions from n → n0 + 1 states is
N = 7 + 6 (n0 – 1), n → n0 + 1, n > n0 + 1 ≥ 3 (4.122)
Thus, each line in the Paschen series consists of 13 lines, each line in the
Brackett series consists of 19 lines, etc.
4. For a transition n → n′, n > n′, the highest frequency corresponds to the
transition
(n, j = 3/2) → (n′, j = 1/2) (4.123)
and is given by
$$\nu = \nu_0 + \frac{Z^2\alpha^2|E_n|}{4n^2h}\,(3 - 2n) - \frac{Z^2\alpha^2|E_{n'}|}{4n'^2h}\,(3 - 4n') \tag{4.124}$$
where $\nu_0$ is the frequency in the absence of fine structure correction given
by
$$\nu_0 = \frac{E_n - E_{n'}}{h} \tag{4.125}$$
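The line counts quoted in this example (2, 7, 13, 19, ...) can be reproduced by enumerating the fine-structure levels and applying the selection rules ∆l = ±1 and ∆j = 0, ±1 of Eq. (4.63). The sketch below labels each level by (l, 2j) so that only integers are needed; the function names are illustrative. It counts distinct pairs of (l, j) labels, so two transitions that coincide in frequency because of the residual degeneracy in Eq. (4.62) are still counted separately.

```python
def fine_structure_states(n):
    """Return the (l, 2j) labels of the levels with principal quantum number n."""
    states = []
    for l in range(n):
        for two_j in (2 * l - 1, 2 * l + 1):
            if two_j > 0:          # j = l - 1/2 does not exist for l = 0
                states.append((l, two_j))
    return states

def is_allowed(upper, lower):
    """Electric dipole rules: delta l = +-1 and delta j = 0 or +-1."""
    (l1, two_j1), (l2, two_j2) = upper, lower
    return abs(l1 - l2) == 1 and abs(two_j1 - two_j2) in (0, 2)

def count_lines(n_upper, n_lower):
    """Number of allowed (l, j) -> (l', j') transitions from n_upper to n_lower."""
    return sum(
        is_allowed(u, w)
        for u in fine_structure_states(n_upper)
        for w in fine_structure_states(n_lower)
    )

for n_lower in range(1, 5):
    print(n_lower, count_lines(n_lower + 1, n_lower))
```

The count does not change if the upper level is taken higher than n′ + 1, since the additional high-l states cannot reach the lower level in a single dipole transition.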

Example 5
The spin 1/2 space contains only two linearly independent states α and β as
defined in Eq. (4.78). It is convenient to regard these states with $S_z = \pm\frac{1}{2}\hbar$ as
two-component column vectors
$$\alpha = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{4.126}$$
and describe the spin operators by 2 × 2 matrices. These matrices must satisfy
the commutation relations in Eq. (3.163), and the properties stated in Eqs. (4.78),
(4.80), (4.85) and (4.86). One set of such matrices is given by
$$S_x = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad S_y = \frac{\hbar}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad S_z = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{4.127}$$
The operation of the spin operator is then given by the operation of these
matrices on the column vectors given in Eq. (4.126). It is easy to see that these
matrices give the results in Eq. (4.82) with the specific choice of
b1 = b2 = 1.
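The matrix relations used in this example are easy to verify numerically. The following is a minimal sketch in plain Python, working in units where ħ = 1 and using small helper functions whose names are illustrative; it checks Eqs. (4.80), (4.85) and (4.82) for the matrices of Eq. (4.127).

```python
# Spin matrices of Eq. (4.127) in units where hbar = 1.
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]
ALPHA = [1, 0]   # spin up, Eq. (4.126)
BETA = [0, 1]    # spin down

def matmul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combine(a, b, c=1):
    """Return the matrix a + c*b."""
    return [[a[i][j] + c * b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    """Return the matrix c*a."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def apply(m, v):
    """Apply a 2 x 2 matrix to a two-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

def close(a, b, tol=1e-12):
    """Compare two matrices element by element."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

IDENTITY = [[1, 0], [0, 1]]
S_PLUS = combine(SX, SY, 1j)     # S+ = Sx + i Sy
S_MINUS = combine(SX, SY, -1j)   # S- = Sx - i Sy

assert close(matmul(SX, SX), scale(0.25, IDENTITY))        # Eq. (4.80)
assert close(matmul(SX, SY), scale(0.5j, SZ))              # Eq. (4.85)
assert close(matmul(SY, SX), scale(-0.5j, SZ))             # Eq. (4.85)
print(apply(S_PLUS, BETA), apply(S_MINUS, ALPHA))          # Eq. (4.82)
```

In these units the raising operator turns β into α and the lowering operator turns α into β with unit coefficient, which is the choice b1 = b2 = 1 made above.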

PROBLEMS
1. Calculate the expectation value $\langle -Ze^2/4\pi\varepsilon_0 r\rangle$ for the one-electron
atoms in the ground state and hence deduce the expectation value
$\langle p^2/2m_r\rangle$ of the kinetic energy.
2. For the ground state of the hydrogen atom, the wave function is of the
form ψ = b exp (–r/a), where b is a constant and a is the Bohr radius.
Determine the probability of finding the electron at a separation greater
than 2a. What is the corresponding classical probability?
3. Consider a wave function of the form $R(r) = (1 + br)\,e^{-gr}$ for a one-
electron atom. What is the possible eigenvalue of this state?
4. What is the value of r at which $4\pi r^2|\phi(r)|^2$ has a maximum for the
ground state? What is the probability density as a function of r, at this
value?
5. The effect of the finite size of a nucleus may be taken into account by
modifying the potential for $r < r_n$, such that the potential $V(r) = -Ze^2/4\pi\varepsilon_0 r_n$
for $r < r_n$, $r_n$ being the radius of the nucleus. Treating the
modification perturbatively, show that the correction to the ground state
energy is approximately $\left(\dfrac{4r_n^2Z^2}{3a_1^2}\right)|E_1|$. What is the order of magnitude
of this correction?
6. Using scaling arguments, show that
$E(m_r) = m_rE(1)$ and $R(r, m_r) = m_r^{3/2}\,R(rm_r, 1)$.
Hence show that $\langle r^n\rangle_{m_r} = \dfrac{1}{m_r^n}\,\langle r^n\rangle_{m_r=1}$.
7. A µ– meson is similar to an electron except that its mass is 206.84 $m_e$. It
can form a mesic atom with a proton. Compare the energy levels and the
average radial distances ⟨r⟩ of such an atom with those of hydrogen.
8. Obtain the frequencies of the two Lyman lines corresponding to n = 2 → n = 1
transition. Verify that these frequencies satisfy the relation given in
Eq. (4.124).
9. What are the allowed electric dipole transitions between the fine structure
states with n = 3 and n = 2?
10. Enumerate the n = 2 levels of the hydrogen atom in terms of the total
angular momentum states, i.e. eigenstates of F2. What are the allowed
electric dipole transitions between these states?
11. Write Eqs. (4.98) and (4.99) in terms of ψ + χ and ψ – χ. Show that for
the free particle solutions with positive energy E and momentum p,

$$\psi - \chi = \frac{2c}{\hbar\,(E + mc^2)}\,\mathbf{p}\cdot\mathbf{S}\,(\psi + \chi)$$
which vanishes for p → 0. Therefore, corrections to nonrelativistic equations
can be worked out conveniently in terms of ψ ± χ.
5
Atoms and Molecules

Structures of the Chapter


5.1 Exchange symmetry of wave functions
5.2 Shells and subshells in atoms
5.3 Periodic table
5.4 Atomic spectra
5.5 X-ray spectra
5.6 Molecular bonding
5.7 Molecular spectra
5.8 Examples
Problems

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 131
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_5

In Chapter 4, it was shown that the quantum-mechanical framework provides a
detailed and accurate description of the energy levels of the one-electron atom.
In this chapter, the attempts to understand the properties of many-electron atoms
and molecules will be discussed.
It is generally difficult to obtain accurate solutions to the problem of
N interacting particles, where N ≥ 3, whether in classical mechanics or in quantum
mechanics (simple harmonic potential is an exception). However, there is a
special symmetry property for spin 1/2 particles in quantum mechanics, called
the Pauli exclusion principle, which gives a very satisfactory qualitative and
often quantitative understanding of many-electron atoms and molecules.

5.1 EXCHANGE SYMMETRY OF WAVE FUNCTIONS


In the treatment of two or more identical particles, such as electrons, within the
framework of quantum mechanics, the uncertainty principle limits our ability to
follow the motion of the particles without disturbing the system (in classical
mechanics this disturbance can be made indefinitely small). Therefore, in general,
it cannot be ascertained as to which of the identical particles has been found at
a place.
Consider the Hamiltonian H (1, 2, ...N) of N identical interacting particles,
where the numbers represent all the variables of the particles, i.e. spatial
coordinates and spin variables. Since the particles are identical, the following is
true:
H (1, 2 ..., i ..., j, ... N) = H (1, 2, ..., j, ..., i, ...N) (5.1)
Now, an operator $P_{ij}$ which interchanges all the coordinates of particles i and
j is defined as:
$$P_{ij}\,\psi(1, 2, \ldots, i, \ldots, j, \ldots, N) = \psi(1, 2, \ldots, j, \ldots, i, \ldots, N) \tag{5.2}$$
It is easy to show that $P_{ij}$ is a hermitian operator (see Eq. 3.35). Furthermore,
in view of Eq. (5.1), it is seen that
$$P_{ij}\,H(1, 2, \ldots, i, \ldots, j, \ldots, N) = H(1, 2, \ldots, i, \ldots, j, \ldots, N)\,P_{ij} \tag{5.3}$$
This means that states which are simultaneous eigenstates of H and Pij can
be chosen (see Sec. 3.4).
The eigenvalues of Pij can be deduced from Eq. (5.2) by noting that

$$P_{ij}^2 = 1 \tag{5.4}$$
which implies that the possible eigenvalues of Pij are + 1 or –1. It is experimentally
observed that the physical states indeed are (not just ‘can be’) eigenstates of Pij
and the eigenvalues are characteristic of the nature of the particles. This result
is stated in terms of the following rules:
1. The wave function of a system is symmetric with respect to the interchange
of the space and spin variables of any two identical particles i and j, with
integral quantum number for their spin
$$\psi(1, 2, \ldots, i, \ldots, j, \ldots, N) = \psi(1, 2, \ldots, j, \ldots, i, \ldots, N) \tag{5.5}$$
2. The wave function of a system is antisymmetric with respect to the
interchange of the space and spin variables of any two identical particles
k and l, with half-integral quantum number for their spin
$$\psi(1, 2, \ldots, k, \ldots, l, \ldots, N) = -\psi(1, 2, \ldots, l, \ldots, k, \ldots, N) \tag{5.6}$$
The particles with integral quantum number for their spin are called bosons
(because of Bose-Einstein statistics which governs these particles, Chapter 7)
and the particles with half-integral quantum number for their spin are called
fermions (because of Fermi-Dirac statistics which governs these particles,
Chapter 7). Examples of bosons are the photon (spin 1), deuteron (spin 1),
π-meson (spin 0), etc. while examples of fermions are the proton (spin 1/2),
neutron (spin 1/2), electron (spin 1/2), Ω– (spin 3/2), muon (spin 1/2), etc. It
may be noted that the symmetry requirement for fermions with spin 1/2 and for
bosons can be deduced within the framework of quantum field theory with the
assumptions of Lorentz invariance, etc. and is known as the spin-statistics
theorem.
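The properties of the exchange operator used above, Pij² = 1 with eigenvalues +1 and –1, can be checked numerically on a simple case. The following is a minimal sketch for two particles in one dimension; the function `psi` is a hypothetical, unnormalized two-particle function chosen only for illustration and is not taken from the text.

```python
import math

def exchange(psi):
    """Return P12 psi: the same function with the two particle labels swapped."""
    return lambda x1, x2: psi(x2, x1)

def psi(x1, x2):
    """Hypothetical two-particle function with no particular exchange symmetry."""
    return math.exp(-x1 * x1) * x2 * math.exp(-0.5 * x2 * x2)

def psi_plus(x1, x2):
    """Symmetric combination: eigenfunction of P12 with eigenvalue +1."""
    return psi(x1, x2) + psi(x2, x1)

def psi_minus(x1, x2):
    """Antisymmetric combination: eigenfunction of P12 with eigenvalue -1."""
    return psi(x1, x2) - psi(x2, x1)

def check(points):
    """Verify P12^2 = 1 and the eigenvalues +1, -1 at the given sample points."""
    for x1, x2 in points:
        assert math.isclose(exchange(exchange(psi))(x1, x2), psi(x1, x2))
        assert math.isclose(exchange(psi_plus)(x1, x2), psi_plus(x1, x2))
        assert math.isclose(exchange(psi_minus)(x1, x2), -psi_minus(x1, x2))
    return True

print(check([(0.3, 1.1), (-0.7, 0.4), (1.5, -0.2)]))
```

The check is made at a few sample points only, so it illustrates the operator identities rather than proving them.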
The symmetry properties stated in Eqs. (5.5) and (5.6) are of great importance
and give rise to significant macroscopic phenomena such as superfluidity of
liquid helium at low temperatures, blackbody radiation, magnetism, stimulated
emission of radiation, shell structure in atoms and in nuclei, etc. Here, the
implications of an antisymmetric wavefunction for the electrons, on the structure
and energy levels of atoms and molecules are discussed.
A simple model is now considered to illustrate the ideas of symmetrization
of states. In this model, the N identical particles do not interact with each other,
but each of these particles has the same external interaction. The Hamiltonian
for such a system is of the form
$$H = \sum_{i=1}^{N}\left[\frac{1}{2m}\,p_i^2 + V(i)\right] \tag{5.7}$$


The separable eigenstates of this Hamiltonian with energy E are of the form
$$\psi(1, 2, \ldots, N) = \psi_{a_1}(1)\,\psi_{a_2}(2) \cdots \psi_{a_N}(N) \tag{5.8}$$
where $\psi_{a_i}$ are the normalized solutions of the equation
$$\left[-\frac{\hbar^2}{2m}\nabla_i^2 + V(i)\right]\psi_{a_i}(i) = E_{a_i}\,\psi_{a_i}(i) \tag{5.9}$$
with the total energy being
$$E = \sum_{i=1}^{N} E_{a_i} \tag{5.10}$$


It is clear that any permutation of the particle indices in the solution in
Eq. (5.8) will again give an eigenstate of the Hamiltonian, with energy E. Since
all these states are degenerate, any linear combination of these states will also
be an eigenstate with energy E. In particular, the symmetric and antisymmetric
eigenstates are respectively
$$\psi_+(1, 2, \ldots, N) = \frac{1}{(N!)^{1/2}}\sum_{\text{perm}} \psi_{a_1}(1)\,\psi_{a_2}(2) \cdots \psi_{a_N}(N) \tag{5.11}$$
$$\psi_-(1, 2, \ldots, N) = \frac{1}{(N!)^{1/2}}\,\det\begin{pmatrix} \psi_{a_1}(1) & \psi_{a_1}(2) & \cdots & \psi_{a_1}(N)\\ \psi_{a_2}(1) & \psi_{a_2}(2) & \cdots & \psi_{a_2}(N)\\ \vdots & \vdots & & \vdots\\ \psi_{a_N}(1) & \psi_{a_N}(2) & \cdots & \psi_{a_N}(N) \end{pmatrix} \tag{5.12}$$

where the summation is over all the permutations of the particle indices (the
normalization is different if some of the ai are the same). For the simple case of
two identical particles, the symmetric and antisymmetric states are
$$\psi_\pm(1, 2) = \frac{1}{2^{1/2}}\left[\psi_{a_1}(1)\,\psi_{a_2}(2) \pm \psi_{a_1}(2)\,\psi_{a_2}(1)\right] \tag{5.13}$$
In these equations, i.e. Eqs. (5.11) to (5.13), the solutions ψ+ with the plus
sign are applicable to bosons while the solutions ψ– with the minus sign apply to
fermions. These solutions are of great importance as solutions for N identical
particles, and serve as approximate starting solutions even for systems whose
Hamiltonian cannot be written in the separable form in Eq. (5.7).
An extremely important point to note in Eqs. (5.12) and (5.13), is that the
fermion wave functions vanish if any two of the ai are equal. This means that
no two identical, noninteracting fermions can be in states described by the same
set of quantum numbers. This rule was first stated for electrons in an atom by
Pauli (1925) in the form that no two electrons in an atom can have the same set of
quantum numbers n, l, ml and ms, and is known as Pauli’s exclusion principle. It is central
to the understanding of the structure of atoms. We now begin the analysis of the
structure and the energy levels of an atom, subject to the constraints of Pauli’s
exclusion principle.
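The symmetric sum of Eq. (5.11) and the determinant of Eq. (5.12) can be built explicitly for a small number of particles. The sketch below does this by summing over permutations; the three single-particle functions are hypothetical, unnormalized stand-ins for the ψ_a (so the 1/(N!)^(1/2) factor does not normalize the result here), and the particle variables are taken to be single real coordinates for simplicity.

```python
import math
from itertools import permutations

def parity(perm):
    """Sign (+1 or -1) of a permutation given as a tuple of indices."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

def combined_state(orbitals, coords, antisymmetric):
    """Eq. (5.12) if antisymmetric is True, otherwise the symmetric sum of Eq. (5.11)."""
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        term = float(parity(perm)) if antisymmetric else 1.0
        for k, p in enumerate(perm):
            term *= orbitals[k](coords[p])  # orbital a_k evaluated for particle p
        total += term
    return total / math.sqrt(math.factorial(n))

# Hypothetical single-particle functions (assumed for illustration only).
orbitals = [
    lambda x: math.exp(-0.5 * x * x),
    lambda x: x * math.exp(-0.5 * x * x),
    lambda x: (2.0 * x * x - 1.0) * math.exp(-0.5 * x * x),
]

coords = [0.2, -0.9, 1.3]
swapped = [-0.9, 0.2, 1.3]  # particles 1 and 2 interchanged

fermi = combined_state(orbitals, coords, antisymmetric=True)
bose = combined_state(orbitals, coords, antisymmetric=False)
print(fermi, combined_state(orbitals, swapped, antisymmetric=True))
print(bose, combined_state(orbitals, swapped, antisymmetric=False))

# Two fermions in the same single-particle state: the determinant vanishes.
repeated = [orbitals[0], orbitals[0], orbitals[2]]
print(combined_state(repeated, coords, antisymmetric=True))
```

The last line is the exclusion principle in miniature: with two equal rows the determinant is zero up to rounding error.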

5.2 SHELLS AND SUBSHELLS IN ATOMS


The Hamiltonian for the electrons in an atom with the nucleus placed at the
origin, can be written as:
H = H0 + H1 + H2 + H3 (5.14)
where
$$H_0 = \sum_i \left[\frac{1}{2m}\,p_i^2 - \frac{Ze^2}{4\pi\varepsilon_0 r_i} + V(r_i)\right] \tag{5.15}$$
$$H_1 = \frac{1}{2}\sum_{i \neq j}\frac{e^2}{4\pi\varepsilon_0 r_{ij}} - \sum_i V(r_i) \tag{5.16}$$

$$H_2 = \frac{1}{2}\sum_i \frac{1}{2m^2c^2 r_i}\,\frac{dV}{dr_i}\;\mathbf{l}_i\cdot\mathbf{s}_i \tag{5.17}$$

and H3 includes the corrections due to spin-spin interaction, relativistic
corrections, etc. The term H0 contains the potential due to the nucleus and V(ri)
which represents some average potential due to the other electrons (the other
electrons screen the nuclear charge). The term V(ri) is not important for small
ri but will become increasingly important as ri increases. In a simple model, the
screening contribution is assumed to be of the form

$$V(r) = \frac{(Z-1)e^2}{4\pi\varepsilon_0 r}\left(1 - e^{-r/b}\right) \tag{5.18}$$
which has the correct behaviour for r → 0 and r → ∞. Since b is expected to be
large compared to the radius of the inner orbits, the potential can be expanded in
powers of r to obtain an approximate expression for V(ri) as

$$V(r_i) \approx \frac{(Z-1)e^2}{4\pi\varepsilon_0}\left[\frac{1}{b} - \frac{r_i}{2b^2} + \cdots\right] \tag{5.19}$$
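The quality of the expansion in Eq. (5.19) can be judged by comparing it with the full expression of Eq. (5.18). The sketch below does this in assumed reduced units, with (Z – 1)e²/4πε0 set to 1 and lengths measured in units of b; these units are chosen here for convenience and are not part of the text.

```python
import math

def screening_exact(r, b=1.0):
    """Eq. (5.18) in units where (Z - 1) e^2 / (4 pi eps0) = 1."""
    return (1.0 - math.exp(-r / b)) / r

def screening_series(r, b=1.0):
    """First two terms of the small-r expansion, Eq. (5.19), in the same units."""
    return 1.0 / b - r / (2.0 * b * b)

for r in (0.01, 0.05, 0.1, 0.3, 1.0):
    exact = screening_exact(r)
    series = screening_series(r)
    print(f"r/b = {r:4.2f}  exact = {exact:.6f}  series = {series:.6f}  "
          f"difference = {exact - series:.2e}")
```

The difference grows roughly as r²/6b³, the first neglected term, so the two-term form is reliable only for orbits well inside the screening length b.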

The second term H1 represents the deviations of the actual repulsive potential
from the average potential V(ri). The term H2 describes the spin-orbit interaction
of the electrons. It is of the same form as Eq. (4.51) except that Ze2/4πε0 r2 has
been replaced by the more general expression dV/dr. Of these three terms, the
general structure of the atoms is determined mainly by H0. The terms H1 and
H2, however, play an important role in the determination of the energy levels of
the atom, in particular the fine structure of the levels.
For obtaining the structure of the atoms, one starts with only the kinetic
energy of the electrons and the electrostatic interaction of the electrons with the
nucleus, i.e. the first two terms in H0 [Eq. (5.15)]. The energy levels of the
electrons are then those of a one-electron atom, i.e.
$$E_n^{(0)} = -\frac{m}{2\hbar^2}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)^2\frac{1}{n^2} \tag{5.20}$$
and the total energy is the sum of the energies of the N electrons,
$$E^{(0)} = \sum_{i=1}^{N} E^{(0)}_{n(i)} \tag{5.21}$$

However, the states that can be occupied by the electrons are constrained
by Pauli’s exclusion principle. The ground-state energy is therefore obtained by
placing successive electrons in the lowest-energy, unoccupied states. It may be
noted (see Eq. (4.27)) that for each value of the principal quantum number n,
there are 2n² states (including the factor of 2 due to the spin states) with the same
energy. Thus, the first two electrons are to be placed in the n = 1 states, the next
8 electrons in the n = 2 states, the next 18 electrons in the n = 3 states, etc.
Electrons with the same value of n form what are known as shells which are
designated by the letters K for n = 1, L for n = 2, M for n = 3, etc.
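The count of 2n² states per shell follows from listing the allowed quantum numbers. A minimal sketch of that enumeration is given below; the function name is an assumption made for this illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def shell_states(n):
    """All (n, l, ml, ms) combinations allowed for principal quantum number n."""
    return [
        (n, l, ml, ms)
        for l in range(n)              # l = 0, 1, ..., n - 1
        for ml in range(-l, l + 1)     # ml = -l, ..., +l
        for ms in (-HALF, HALF)        # two spin states
    ]

for n, letter in zip((1, 2, 3, 4), "KLMN"):
    print(letter, n, len(shell_states(n)))
```

By the exclusion principle, each listed combination can hold one electron, which gives the shell capacities 2, 8, 18, 32 for the K, L, M and N shells.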
It may be recollected (Sec. 4.1) that the degeneracy of the different l states
(with l ≤ n – 1) for a given value of the principal quantum number n, is a special
property of the 1/r potential. The average potential V(ri), arising from the
interaction with the other electrons will remove this degeneracy and states with
different l values but the same n value will have different energies. Since V(ri)
is positive and becomes more important as ri increases, it may be expected that
the states with larger l values will be raised more than those with smaller l
values. Explicit perturbative calculations can be made for the first two terms of
the potential V(ri) given in Eq. (5.19). From Eq. (3.125),

$$E_{n,l} \approx E_n^{(0)} + \frac{(Z-1)e^2}{4\pi\varepsilon_0}\left[\frac{1}{b} - \frac{a_1}{4b^2 Z}\left(3n^2 - l(l+1)\right)\right] \tag{5.22}$$

where Eq. (4.29) has been used for 〈 r 〉, a1 being the radius of the first Bohr
orbit with Z = 1. It is seen here that the screening effects due to other electrons
remove the l-degeneracy, the energies now increasing as l increases. This implies
that each shell is made up of subshells that have the same n value but different
l values, the subshells with larger l values having higher energy. Indeed, it so
happens that the energy of a subshell with sufficiently large l may be higher
than that of another with larger n but a lower l. The relative positions of the
various energy levels which follow from detailed calculations, and also from
experimental observations, are shown in Fig. (5.1) and form the basis of the
shell structure of the atoms.
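The way Eq. (5.22) lifts the l-degeneracy can be seen by evaluating it for one shell. In the sketch below the values Z = 11 and b = 1.0 Å are assumptions made purely for illustration (the text gives no value of b), and the rounded constants are standard values rather than numbers from the text. Since the expansion holds only for r much smaller than b, the absolute energies are not meaningful; only the ordering in l is of interest.

```python
E2_OVER_4PI_EPS0 = 14.40  # e^2 / (4 pi eps0) in eV * angstrom (rounded)
BOHR_RADIUS = 0.529       # a1 in angstrom (rounded)
RYDBERG_EV = 13.606       # binding energy of the hydrogen ground state in eV

def unperturbed_energy(n, Z):
    """Eq. (5.20) expressed through the hydrogen binding energy."""
    return -RYDBERG_EV * Z ** 2 / n ** 2

def screened_energy(n, l, Z, b):
    """First-order estimate of Eq. (5.22); b is the screening length in angstrom."""
    bracket = 1.0 / b - BOHR_RADIUS * (3 * n * n - l * (l + 1)) / (4.0 * b * b * Z)
    return unperturbed_energy(n, Z) + (Z - 1) * E2_OVER_4PI_EPS0 * bracket

Z_ASSUMED = 11   # illustrative choice
B_ASSUMED = 1.0  # hypothetical screening length in angstrom

for l in range(3):
    print(f"n = 3, l = {l}: E = {screened_energy(3, l, Z_ASSUMED, B_ASSUMED):.2f} eV")
```

Within the shell the estimate rises with l, in line with the statement that subshells with larger l lie higher.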

Fig. 5.1 Schematic illustration of the energy levels of atomic subshells and the order in which the subshells are filled (given by the arrows).

5.3 PERIODIC TABLE


For determining the manner in which the electrons in an atom are distributed
among the states indicated in Fig. 5.1, two rules must be kept in mind: (i) In the
ground state of the atom, the electrons occupy the lowest energy level available,
and (ii) no two electrons can have the same quantum numbers n, l, ml and ms.
The second rule implies that the maximum number of electrons each subshell
can contain is equal to the degeneracy 2(2l + 1) of that level, i.e. 2 in an s subshell,
6 in a p subshell, etc. The order in which the subshells are filled is according to the
increasing values of their energies, and is shown in Fig. 5.1, the order being 1s,
2s, 2p, 3s, 3p, 4s, 3d, 4p, 5s, 4d, 5p, 6s, 4f, 5d, 6p, 7s. A few examples are given
here to illustrate the results.

Examples
In its ground state H (hydrogen) has an electron with n = 1, l = 0, ml = 0 and
ms = 1/2 or –1/2. This configuration is designated by 1s. The two electrons of
He (helium) are in states n = 1, l = 0, ml = 0 and ms = ± 1/2, the configuration
being (1s)². The configuration of Na is (1s)² (2s)² (2p)⁶ 3s, while that of Sr is
(1s)² (2s)² (2p)⁶ (3s)² (3p)⁶ (3d)¹⁰ (4s)² (4p)⁶ (5s)², the total number of electrons
being 38.
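The two filling rules can be turned into a short routine that places Z electrons into the subshells in the order 1s, 2s, 2p, ... quoted above. This is a sketch of the idealized filling only; the function names are assumptions, and the exceptions such as Cr and Cu are deliberately not reproduced.

```python
FILL_ORDER = ["1s", "2s", "2p", "3s", "3p", "4s", "3d", "4p",
              "5s", "4d", "5p", "6s", "4f", "5d", "6p", "7s"]
CAPACITY = {"s": 2, "p": 6, "d": 10, "f": 14}  # 2(2l + 1) for l = 0, 1, 2, 3

def configuration(z):
    """Idealized ground-state filling for z electrons as (subshell, count) pairs."""
    remaining = z
    filled = []
    for subshell in FILL_ORDER:
        if remaining == 0:
            break
        count = min(remaining, CAPACITY[subshell[-1]])
        filled.append((subshell, count))
        remaining -= count
    if remaining:
        raise ValueError("z exceeds the capacity of the listed subshells")
    return filled

def configuration_string(z):
    """Compact form such as '1s2 2s2 2p6 3s1'."""
    return " ".join(f"{subshell}{count}" for subshell, count in configuration(z))

print(configuration_string(11))  # Na
print(configuration_string(38))  # Sr
```

The output lists the subshells in filling order, so for Sr the 4s entry appears before 3d, whereas the text writes the configuration in order of n.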
The electronic configurations are summarized in Table 5.1, showing the
order in which the subshells are filled. The exceptions (shown in bold type),
along with the configuration of the unfilled shells, are Cr – 4s (3d)⁵, Cu – 4s
(3d)¹⁰, Nb – 5s (4d)⁴, Mo – 5s (4d)⁵, Ru – 5s (4d)⁷, Rh – 5s (4d)⁸, Pd – (4d)¹⁰ with
no electrons in the 5s shell, Ag – 5s (4d)¹⁰, La – 5d with no electrons in the 4f
shell, Gd – 5d (4f)⁷, Pt – 6s (5d)⁹ and Au – 6s (5d)¹⁰. Apart from these, of the
heavy elements (Z = 89 to 102), Ac, Pa, U, Np, Pu, Am, Cm, Bk have the
configuration 6d (5f)^(Z–89). Th has the configuration (6d)² with no electrons in the 5f
shell, while Cf, Es, Fm, Md, No have the configuration (5f)^(Z–88) with no electrons
in the 6d shell. Some important consequences of the electronic configurations
of the atoms are now discussed.
Ionization Potential
Ionization potential of an atom is the least energy required to ionize the atom,
and hence is also the binding energy of the least strongly bound electron. This
electron would be expected to belong to the last shell (filled or unfilled). In
Fig. 5.2 is shown the variation of the binding energy as Z changes, which can be
understood from the following simple considerations.

Fig. 5.2 Ionization potential in eV, as a function of Z.



Table 5.1 Electronic structure of elements

Subshell being filled   Range of Z   Sequence of elements
1s                      1–2          H, He
2s                      3–4          Li, Be
2p                      5–10         B, C, N, O, F, Ne
3s                      11–12        Na, Mg
3p                      13–18        Al, Si, P, S, Cl, Ar
4s                      19–20        K, Ca
3d                      21–30        Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Zn
4p                      31–36        Ga, Ge, As, Se, Br, Kr
5s                      37–38        Rb, Sr
4d                      39–48        Y, Zr, Nb, Mo, Tc, Ru, Rh, Pd, Ag, Cd
5p                      49–54        In, Sn, Sb, Te, I, Xe
6s                      55–56        Cs, Ba
4f                      57–70        La, Ce, Pr, Nd, Pm, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb
5d                      71–80        Lu, Hf, Ta, W, Re, Os, Ir, Pt, Au, Hg
6p                      81–86        Tl, Pb, Bi, Po, At, Rn
7s                      87–88        Fr, Ra
5f–6d                   89–102       Ac, Th, Pa, U, Np, Pu, Am, Cm, Bk, Cf, Es, Fm, Md, No

The shells are filled in the order shown, the exceptions being shown in bold type. For
the heaviest elements (Z = 89 to 102), the electrons are in both 5f and 6d subshells.
Consider an atom with Z electrons, i of which are in the last subshell. Then
the potential seen by an electron in the last shell will be that due to the nucleus
(with charge Z| e |) screened by Z-i electrons, i.e. essentially an attractive
potential due to charge i | e |. Therefore, the ionization potential (or the binding
energy of an electron in the last subshell) may be expected to increase as
i increases. This trend is generally observed, with the ionization potential being
a minimum for atoms with only one electron in the last shell, e.g. Li, Na, K, Ga,
Rb, and a maximum for atoms with the last shell being complete, e.g. He, Ne,
Ar, Kr, and Zn (less prominent). Of course, these arguments are very qualitative.
The situation becomes ambiguous if some of the subshells have approximately
the same energy, or interpenetrate, and would require a more careful analysis.
Chemical Properties
The chemical properties of elements are related to the forces between the
atoms. In this connection it is seen that (i) the atoms with slightly filled outer
shells, are characterized by a small ionization energy. They can easily lose the
electrons in the outer shell, and form positive ions. These elements are chemically
active. (ii) Atoms with only a small number of vacancies in the outer shell (i.e.
with almost-filled outer shell), though their net charge is zero, can easily
accommodate electrons in the unfilled shell since the screening of the nucleus is
mainly due to the inner shells. For example, Cl can bind an additional electron
with a binding energy of about 3.6 eV (even the hydrogen atom can bind an additional
electron with a binding energy of 0.75 eV). With the acceptance of electrons,
these elements form negative ions. They are chemically active. (iii) Since the
chemical activity is related to the number of electrons in the outer shell, there
exist groups of atoms with similar outer shells but with different inner shells,
which have similar chemical properties. This is the origin of the periodic table
(see Table 5.2) where the atoms are arranged in such a way that the atoms in
vertical columns have similar outer shells and hence similar chemical properties.
Examples of these are (i) Noble gases, helium, neon, argon, krypton, xenon and
radon. They have closed outer shells. Their ionization potential is large, and they
cannot easily form positive ions. Nor can they bind an additional electron which
has to go into the next shell and hence would hardly experience any attraction.
These atoms are therefore chemically quite inactive. (ii) Alkali metals, lithium,
sodium, potassium, rubidium, caesium and francium. They have only one
s-wave electron in the outer shell. Since the electron has a small binding energy,
it can be easily lost. These atoms are positive monovalent, and are chemically
active. (iii) Halogens, fluorine, chlorine, bromine, iodine and astatine with five
electrons in the outer p-shell. They have an affinity for an additional electron
which would complete the shell. The atoms are negative monovalent and are
chemically active. (iv) Especially interesting are the groups of ten elements that
correspond to progressive filling of an nd subshell but have a complete (n + 1)s
subshell.
Table 5.2 Periodic table of elements (deviations from the shown shell structure are indicated by encircling the elements).

IA IIA IIIA IVA VA VIA VIIA VIIIA IB IIB IIIB IVB VB VIB VIIB VIIIB

Each entry gives the atomic number followed by the symbol. Within a subshell the elements are listed in order of the number of electrons in that subshell (1–2 for s, 1–6 for p, 1–10 for d, 1–14 for f).

1s: 1 H, 2 He
2s: 3 Li, 4 Be | 2p: 5 B, 6 C, 7 N, 8 O, 9 F, 10 Ne
3s: 11 Na, 12 Mg | 3p: 13 Al, 14 Si, 15 P, 16 S, 17 Cl, 18 Ar
4s: 19 K, 20 Ca | 3d: 21 Sc, 22 Ti, 23 V, 24 Cr, 25 Mn, 26 Fe, 27 Co, 28 Ni, 29 Cu, 30 Zn | 4p: 31 Ga, 32 Ge, 33 As, 34 Se, 35 Br, 36 Kr
5s: 37 Rb, 38 Sr | 4d: 39 Y, 40 Zr, 41 Nb, 42 Mo, 43 Tc, 44 Ru, 45 Rh, 46 Pd, 47 Ag, 48 Cd | 5p: 49 In, 50 Sn, 51 Sb, 52 Te, 53 I, 54 Xe
6s: 55 Cs, 56 Ba | 5d: 71 Lu, 72 Hf, 73 Ta, 74 W, 75 Re, 76 Os, 77 Ir, 78 Pt, 79 Au, 80 Hg | 6p: 81 Tl, 82 Pb, 83 Bi, 84 Po, 85 At, 86 Rn
7s: 87 Fr, 88 Ra | 6d: 89 Ac, 90 Th, 91 Pa, 92 U, 93 Np, 94 Pu
4f: 57 La, 58 Ce, 59 Pr, 60 Nd, 61 Pm, 62 Sm, 63 Eu, 64 Gd, 65 Tb, 66 Dy, 67 Ho, 68 Er, 69 Tm, 70 Yb


Since they all have a similar outer shell, their chemical properties also are
similar. The three series of elements are 21 ≤ Z ≤ 30 for n = 3, 39 ≤ Z ≤ 48 for
n = 4, and 71 ≤ Z ≤ 80 for n = 5. The partially filled inner 3d shell allows some
of these elements to have large magnetic moments. In particular, Fe, Co and Ni
are found to be ferromagnetic materials (see Sec. 8.6). (v) The rare-earths are
a group of 14 elements corresponding to a progressive filling of the 4f subshell,
though the 6s subshell is already complete. These elements have 57 ≤ Z ≤ 70.
In summary, the electronic structure of atoms provides an insight into their
properties.

5.4 ATOMIC SPECTRA


So far only the general features of the electronic structure of an atom have
been considered. These follow from the model in which only the interaction of
the electrons with the nucleus (which produces the single-electron levels with
l-degeneracy) and a term which represents an average interaction between the
electrons (which breaks the l-degeneracy) are included. In this model, the energy
of the atom is the sum of the energies of the electrons. Since the model ignores
the finer details of the interaction, the various energy levels are highly degenerate.
For example, a level with one electron in the 4p shell and another in the 3d shell
will have a degeneracy of 6 × 10 corresponding to a degeneracy of 2(2l + 1) for
each electron. This degeneracy is partially removed if the interaction between
the electrons and their spin-orbit interaction, i.e., the terms H1 and H2 in
Eq. (5.14) are included.
In this section, the effect of including the finer details of the mutual interaction
between the electrons (the term H1) and the spin-orbit interaction (the term H2)
will be considered. This will allow the characterization of different energy levels
for a given shell configuration, in particular, the ground state. The purpose will
be essentially to enumerate the various energy levels, and to some extent predict
the ordering of the energy levels (the determination of the energies will require
a much more elaborate calculation). These results will be useful not only for
predicting the multiplicity of the spectral lines, but also for describing the behaviour
of the atoms under different physical conditions, e.g. in the presence of an
external magnetic field.
Once the mutual interaction and spin-orbit interaction terms are included,
the total orbital angular momentum is no longer conserved, nor is the total spin
angular momentum. However, since the interaction is a scalar, the total angular
momentum J = L + S is still conserved (i.e. [L, H] ≠ 0, [S, H] ≠ 0, but
[J, H] = 0), and the energy eigenstates can be designated by their total angular
momentum. We will now use l, s and j to designate the individual electron
angular momenta and L, S and J to designate the sums of the angular momenta.
The problem of determining the multiplicity of energy levels corresponding to a
given shell then reduces to the problem of determining the different possible
total angular momentum states for a given shell configuration. In this context, it
is noted that the total angular momentum of an assembly of electrons forming a
complete shell is zero. This follows from the observation that the z components
of the total orbital angular momentum and the total spin angular momentum for the
assembly are both zero, i.e.
$$\sum_{m_l=-l}^{l} m_l = 0, \qquad \sum_{m_s=-1/2}^{1/2} m_s = 0 \tag{5.23}$$
and therefore
$$\sum_{\text{shell}} m_j = \sum_{\text{shell}} (m_l + m_s) = 0 \tag{5.24}$$
Since the z-axis can be taken along any arbitrary direction, this implies that
the total angular momentum is zero. Therefore, the total angular momentum of
an atom is due to contributions from only the unfilled shells.
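The vanishing sums of Eqs. (5.23) and (5.24) can be confirmed by direct enumeration over a filled subshell. The following minimal sketch uses exact fractions for the half-integral ms values; the function name is an assumption made here.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def filled_subshell_sums(l):
    """Return (sum of ml, sum of ms, sum of mj) over all 2(2l + 1) states."""
    states = [(ml, ms) for ml in range(-l, l + 1) for ms in (-HALF, HALF)]
    sum_ml = sum(ml for ml, _ in states)
    sum_ms = sum(ms for _, ms in states)
    sum_mj = sum(ml + ms for ml, ms in states)
    return sum_ml, sum_ms, sum_mj

for l, letter in enumerate("spdf"):
    print(letter, filled_subshell_sums(l))
```

Every filled subshell gives zero for all three sums, so a complete shell, being a set of filled subshells, contributes nothing to the z component of the total angular momentum.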
A detailed perturbative calculation of the contribution of H1 and H2 gives
the important result that the atoms fall into two main categories:
1. For most atoms the residual mutual interaction between the electrons, i.e.,
H1, is more important than the spin-orbit interaction represented by H2.
This situation is treated as LS coupling or Russell-Saunders coupling.
2. For some atoms, mainly heavy atoms with large nuclear charges, the
spin-orbit interaction, i.e. H2, is more important than H1. This is treated as j-j
coupling.
Russell-Saunders or LS Coupling
In this scheme, the spin-orbit interaction is first neglected. Since H1, being
independent of spin, commutes with S and therefore also with L (it commutes
with J = L + S), the energy eigenstates
can be designated by L and S quantum numbers. Having determined the different
states and their qualitative ordering, the effect of spin-orbit interaction can then
be introduced as a small perturbation to deduce the final multiplicity of the energy
levels.
For determining the ordering of the energy eigenstates, it is noted that for
the state with the largest total spin, the spins are essentially parallel to each
other and therefore the spatial wave function will be antisymmetric under the
exchange of spatial coordinates. There is a tendency for the electrons to avoid
each other, thus minimizing the mutual repulsion. As a result, the lowest energy
state is generally the state with the largest total spin S. Extending these qualitative
results, it is deduced that as the total spin decreases, the energy of the states
increases. Similar arguments can be used to determine the ordering of the
L levels for a given S. Of the states with different L values, the state with the
largest L will correspond to electrons moving in the same sense, i.e. they can
avoid coming together. Therefore, the ground state has the largest possible
L value compatible with the largest possible S value. This is known as Hund’s
rule. In addition, the energy of the states for a given S increases as L decreases.
Finally, the degeneracy of the states with different J values but the same L and
S values is removed by the spin-orbit interaction, i.e. H2, whose effect can be
represented by (this can be rigorously justified by using the theorem in Sec. 6.2),
$$H_2 = C_{LS}\,\mathbf{L}\cdot\mathbf{S} \tag{5.25}$$
where $C_{LS}$ is constant for a given L and S. This term gives a perturbative
contribution to the energy,
$$\Delta E_J = \frac{1}{2}\,C\,[J(J+1) - L(L+1) - S(S+1)] \tag{5.26}$$
where the subscripts of C have been dropped. It is found that the constant C is
positive for multiplets formed from a subshell that is half or less than half-filled,
and negative for multiplets formed from a subshell which is more than
half-filled. Therefore, for a subshell which is half-filled or less, the energy within
the multiplet increases as J increases, and for a subshell which is more than
half-filled, the energy within the multiplet decreases as J increases. This is
known as the multiplet rule. In particular, this means that the ground state of an
atom has the smallest J value subject to Hund’s rule if the subshell is half-filled
or less, and the largest J value if the subshell is more than half-filled. The
separation between the levels with values J + 1 and J (but the same L and S) is
obtained as
$$E_{J+1} - E_J = C\,(J + 1) \tag{5.27}$$
This is known as the Landé interval rule which states that the spacing
between consecutive levels of a fine-structure multiplet is proportional to
the larger of the two J values of the levels. These ideas are made explicit by
the following two examples.
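Before turning to the examples, Eqs. (5.26) and (5.27) can be checked with a few lines of arithmetic. The sketch below evaluates the shifts for the J levels of a given L and S in units of the constant C (set to 1 as an assumption, since the text gives no numerical value) and confirms that each interval equals C times the larger J.

```python
from fractions import Fraction

def allowed_J(L, S):
    """J = |L - S|, |L - S| + 1, ..., L + S (exact fractions)."""
    J = abs(Fraction(L) - Fraction(S))
    values = []
    while J <= Fraction(L) + Fraction(S):
        values.append(J)
        J += 1
    return values

def fine_structure_shift(J, L, S, C=1):
    """Eq. (5.26): (C / 2) [J(J + 1) - L(L + 1) - S(S + 1)]."""
    J, L, S = Fraction(J), Fraction(L), Fraction(S)
    return Fraction(C) * (J * (J + 1) - L * (L + 1) - S * (S + 1)) / 2

def intervals(L, S, C=1):
    """Spacings E(J + 1) - E(J) between consecutive levels of the multiplet."""
    Js = allowed_J(L, S)
    shifts = [fine_structure_shift(J, L, S, C) for J in Js]
    return [(Js[k + 1], shifts[k + 1] - shifts[k]) for k in range(len(Js) - 1)]

# Triplet P multiplet (L = 1, S = 1): levels J = 0, 1, 2
for J in allowed_J(1, 1):
    print("J =", J, "shift =", fine_structure_shift(J, 1, 1))
print(intervals(1, 1))
```

For L = 1, S = 1 the two intervals come out in the ratio 1 : 2, i.e. proportional to the larger J of each pair, as the interval rule states.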
Consider an atom with two valence electrons, one in the ns state and the
second in the n′ l state where l ≠ 0. This state has a total degeneracy of
4(2l + 1) corresponding to two states for the s electron and 2(2l + 1) states for
the l electron. The total wave function is a product of the spatial part
$$u_{n,n',L}(\mathbf{r}_1, \mathbf{r}_2) = R_n(r_1)\,Y_{00}(\theta_1, \phi_1)\,R_{n'}(r_2)\,Y_{lm}(\theta_2, \phi_2) \tag{5.28}$$
and the spin part
$$v = v_{m_s}(1)\,v_{m_s'}(2) \tag{5.29}$$
It is clear that the total orbital angular momentum of the state u corresponds
to L = l. The state can be symmetrized or antisymmetrized with respect to the
two electrons, so that there are two states with L = l given by
$$u^{\pm}_{n,n',l} = \frac{1}{2^{1/2}}\left[u_{n,n',l}(\mathbf{r}_1, \mathbf{r}_2) \pm u_{n,n',l}(\mathbf{r}_2, \mathbf{r}_1)\right] \tag{5.30}$$

For the spin part, since both the electrons have s = 1/2, the allowed values
of S are S = 1, 0. It can be shown that the three symmetric states
$$v_1^{+} = v_{m_s}(1)\,v_{m_s'}(2) + v_{m_s}(2)\,v_{m_s'}(1) \tag{5.31}$$
correspond, except for normalization, to the S = 1 states, while the antisymmetric
state
$$v_0^{-} = v_{1/2}(1)\,v_{-1/2}(2) - v_{1/2}(2)\,v_{-1/2}(1) \tag{5.32}$$
corresponds (except for normalization) to the S = 0 state. The total allowed
antisymmetric wave functions $\psi_{L,S}$ are
$$\psi_{l,0} = u^{+}_{n,n',l}\,v_0^{-} \tag{5.33}$$
which are 2l + 1 in number, and
$$\psi_{l,1} = u^{-}_{n,n',l}\,v_1^{+} \tag{5.34}$$
which are 3(2l + 1) in number. Our earlier discussion indicates that ψl, 1 states
(which have the largest allowed S value, S = 1) have lower energy. The ψl, 1
states can have J = l + 1, l, l – 1. The degeneracy between these states is
removed by the spin-orbit interaction [see Eq. (5.26)], such that the states with
larger J values have greater energy. The resulting energy levels are shown in
Fig. (5.3), and there are a total of 4(2l + 1) states. These states are characterized
by the notation $^{2S+1}L_J$, e.g. if l = 1, the singlet state is $^1P_1$ while the triplet states
are $^3P_{2,1,0}$. The case l = 0 needs special consideration. For l = 0 but n ≠ n′, there
are only the S = 0, 1 levels which also correspond to J = 0 and 1 respectively.
They are denoted by $^1S_0$ and $^3S_1$ respectively. If l = 0 and n = n′, there is only
one level with S = J = 0 (S = 1 is not allowed by the exclusion principle) and is
represented by $^1S_0$.
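The enumeration of levels in this example can be reproduced mechanically. The sketch below lists the LS levels for two electrons in different subshells (so that the exclusion principle removes nothing) and checks that the (2J + 1)-fold degeneracies add up to 4(2l + 1); the function names are assumptions made for this illustration.

```python
LETTERS = "SPDFGHIK"  # spectroscopic letters for L = 0, 1, 2, ...

def ls_levels(l1, l2):
    """(S, L, J) for two electrons in different subshells with orbital quantum numbers l1, l2."""
    levels = []
    for S in (0, 1):
        for L in range(abs(l1 - l2), l1 + l2 + 1):
            for J in range(abs(L - S), L + S + 1):
                levels.append((S, L, J))
    return levels

def label(S, L, J):
    """Spectroscopic label written on one line, e.g. '3P2' for S = 1, L = 1, J = 2."""
    return f"{2 * S + 1}{LETTERS[L]}{J}"

def total_states(levels):
    """Sum of the (2J + 1)-fold degeneracies of the levels."""
    return sum(2 * J + 1 for _, _, J in levels)

for l in (1, 2):
    levels = ls_levels(0, l)  # one s electron, one electron with orbital quantum number l
    print([label(*level) for level in levels], total_states(levels), 4 * (2 * l + 1))
```

For l = 1 this gives the singlet and the three triplet levels quoted above, twelve states in all.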

Fig. 5.3 Schematic illustration of the fine-structure splitting of a level with ns and n′l (l ≠ 0) electrons. The states are split into S = 1 and S = 0 states by the electrostatic interaction between the electrons. The S = 1 state is further split into J = l – 1, l, l + 1 by the spin-orbit interaction.

The second example considered is an atom with one valence electron in the
np state, and the second in the n′l state, the total degeneracy being 12(2l + 1).
Since l = 0 case was considered in the first example, it is assumed here that
l ≠ 0, and also that l ≠ 1. Then the spatial wave functions are still given by
Eq. (5.30) except that the allowed values of the angular momentum quantum
numbers now are l + 1, l, l – 1, i.e.
$$u^{\pm}_{n,n',L} = \frac{1}{2^{1/2}}\left[u_{n,n',L}(\mathbf{r}_1, \mathbf{r}_2) \pm u_{n,n',L}(\mathbf{r}_2, \mathbf{r}_1)\right], \quad L = l + 1,\ l,\ l - 1 \tag{5.35}$$
The total wave functions are given by Eqs. (5.33) and (5.34) except that
$u^{\pm}_{n,n',l}$ are replaced by $u^{\pm}_{n,n',L}$, with L = l + 1, l, l – 1. The spin-orbit interaction
removes the J degeneracy, and the final energy levels are shown in Fig. 5.4.
As before, they are described by the notation $^{2S+1}L_J$. If l = 1 but n ≠ n′, there
is only one level corresponding to L = l – 1, for each S, with J = 0 for S = 0, and
J = 1 for S = 1. The other levels, i.e. L = l, l + 1 are singlets or triplets, as
shown in Fig. (5.4). Finally, the case of l = 1 and n = n′ requires a special
treatment. In this case, the allowed values of L are L = 2, 0 for $u^{+}_{n,n,L}$ and
L = 1 for $u^{-}_{n,n,L}$. Thus, the S = 0 state has L = 2, 0 states associated with it,
while the S = 1 state has L = 1 associated with it. However, the S = 1 state
splits into J = 2, 1, 0 states because of the spin-orbit interaction, for which the
energy increases with J.
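The special case of two electrons in the same p subshell can be checked by counting states. The sketch below lists all pairs of distinct single-electron states (ml, ms) permitted by the exclusion principle and compares their number with the degeneracies of the terms named above (S = 0 with L = 2 and 0, S = 1 with L = 1). The function names are assumptions made here, and ms is stored in units of 1/2 to keep the arithmetic in integers.

```python
from itertools import combinations

def equivalent_pairs(l):
    """Pairs of distinct single-electron states (ml, 2*ms) within one subshell."""
    single = [(ml, ms2) for ml in range(-l, l + 1) for ms2 in (-1, 1)]
    return list(combinations(single, 2))

def term_states(terms):
    """Total number of states in a list of (S, L) terms: sum of (2S + 1)(2L + 1)."""
    return sum((2 * S + 1) * (2 * L + 1) for S, L in terms)

pairs = equivalent_pairs(1)
terms_from_text = [(0, 2), (0, 0), (1, 1)]  # singlet D, singlet S, triplet P

print(len(pairs), term_states(terms_from_text))

# Largest total ML when both spins are up: this bounds L for the S = 1 states.
print(max(a[0] + b[0] for a, b in pairs if a[1] == 1 and b[1] == 1))
```

Both counts give 15, against 36 = 12(2l + 1) states for l = 1 when the two p electrons have different n; with parallel spins the largest total ML is 1, consistent with the triplet carrying L = 1 only.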

Fig. 5.4 Schematic illustration of the fine-structure splitting of a level with np, n′l (l > 1) electrons.
j–j Coupling
While the LS coupling scheme is applicable to the fine structure of most
atoms, there are some atoms, mainly the heavy atoms with large nuclear charges,
for which the spin-orbit interaction, i.e. H2, is more important than the residual
electrostatic interaction, i.e. H1. For these atoms, the j-j coupling scheme is
used.
In this case, the original single-particle level is split into two levels
corresponding to ji = li ± 1/2 (except for li = 0 for which there will be only the
ji = 1/2 level) by the spin-orbit interaction H2 such that
$$\Delta E_i = \frac{1}{2}\,C_{n_i l_i}\left[j_i(j_i + 1) - l_i(l_i + 1) - \frac{3}{4}\right], \quad j_i = l_i \pm \frac{1}{2},\ l_i \neq 0 \tag{5.36}$$
where $C_{n_i l_i}$ is the expectation value of the coefficient of li.si in H2. The
many-electron state with a given set of ji contains states with different J values, which
are degenerate. This degeneracy is removed by the residual electrostatic
interaction between the electrons. The final levels are characterized by the
quantum numbers ji and the total J. A schematic illustration of the levels is given in
Fig. 5.5 for the fine-structure splitting of the levels with one electron in the np
state and another in the n′l state (l ≥ 2). These levels are generally represented
by np n′l $(j_1, j_2)_J$. For example, the ground state is np n′l $(1/2,\ l - 1/2)_{l-1}$. It is to
be noted that if l = 0, the only allowed value of j2 is 1/2, so that J = 1, 0 for
j1 = 1/2, j2 = 1/2 and J = 2, 1 for j1 = 3/2, j2 = 1/2. For l = 1 but n′ ≠ n, one has
only J = 2, 1 for j1 = 3/2, j2 = 1/2. Finally, for l = 1 and n′ = n, the electrons are
in the same subshell. There is only one set of antisymmetric states corresponding
to j1 = 1/2, j2 = 3/2 and j1 = 3/2, j2 = 1/2. Furthermore, the Pauli exclusion
principle restricts the allowed states to $(3/2, 3/2)_{2,0}$, $\{(3/2, 1/2), (1/2, 3/2)\}_{2,1}$
and $(1/2, 1/2)_0$ for $(j_1, j_2)_J$, where the last state is the ground state. It should be
observed that the number of final levels and the allowed J values are the same
in both the LS coupling scheme and the j-j coupling scheme [compare Figs.
(5.4) and (5.5)].
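The statement that both schemes give the same number of levels with the same J values can be verified by enumeration. The sketch below does this for two electrons in different subshells, where the exclusion principle imposes no restriction; the function names are assumptions made for this illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def j_values(l):
    """Single-electron j: only 1/2 for l = 0, otherwise l - 1/2 and l + 1/2."""
    return [HALF] if l == 0 else [l - HALF, l + HALF]

def ls_J_values(l1, l2):
    """All J values of the LS levels (S = 0, 1; L = |l1 - l2| ... l1 + l2)."""
    values = []
    for S in (0, 1):
        for L in range(abs(l1 - l2), l1 + l2 + 1):
            values.extend(range(abs(L - S), L + S + 1))
    return sorted(values)

def jj_J_values(l1, l2):
    """All J values of the j-j levels (J = |j1 - j2| ... j1 + j2 for each j1, j2)."""
    values = []
    for j1 in j_values(l1):
        for j2 in j_values(l2):
            J = abs(j1 - j2)
            while J <= j1 + j2:
                values.append(int(J))  # J is integral for two electrons
                J += 1
    return sorted(values)

for l in (0, 2, 3):
    print(l, ls_J_values(1, l), jj_J_values(1, l))
```

The two lists agree for every l tried, although the levels are grouped differently: by (S, L) in one scheme and by (j1, j2) in the other.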

Fig. 5.5 Schematic representation of the fine-structure splitting of the np, n′l level in the j-j coupling scheme.

As far as the applicability of the LS or the j-j coupling schemes is concerned,
it is noted that the LS coupling scheme is applicable for the lighter elements and
the j-j coupling scheme is valid for the heavier elements, whereas the elements
in-between have to be studied under the conditions of intermediate coupling
(H1 and H2 of comparable strength). A good example is provided by the atoms with two
p electrons in the ground state: carbon is described by the LS coupling
scheme, lead by the j-j coupling scheme, whereas silicon, germanium and tin
fall in the category of intermediate coupling.
Selection Rules
The most prominent transitions between the atomic levels just discussed are the
electric dipole transitions (see Chapter 6). These transitions follow the selection
rules:
1. Transitions occur only between states in which one electron changes its
state. The l-value of this electron changes by one unit,
∆l = ± 1 (5.37)
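2. The allowed changes in the quantum numbers of the whole state are
$$\left.\begin{aligned} &\Delta S = 0,\quad \Delta L = 0, \pm 1\\ &\Delta J = 0, \pm 1,\ \text{but not } J = 0 \to J = 0\\ &\Delta M_J = 0, \pm 1 \end{aligned}\right\}\ \text{LS coupling} \tag{5.38}$$
and
$$\left.\begin{aligned} &\Delta j = 0, \pm 1\ \text{for the electron which changes its state}\\ &\Delta J = 0, \pm 1,\ \text{but not } J = 0 \to J = 0\\ &\Delta M_J = 0, \pm 1 \end{aligned}\right\}\ j\text{–}j\ \text{coupling} \tag{5.39}$$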
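The LS-coupling conditions of Eq. (5.38) are easy to encode as a test on a pair of levels. The sketch below is an illustration only: it checks the conditions on S, L and J for the whole state, and does not check the single-electron rule ∆l = ± 1 of Eq. (5.37) or the ∆M_J condition. The function name and the (S, L, J) tuple convention are assumptions made here.

```python
def allowed_in_LS(initial, final):
    """Apply the S, L, J conditions of Eq. (5.38) to two levels given as (S, L, J)."""
    S1, L1, J1 = initial
    S2, L2, J2 = final
    if S1 != S2:                # Delta S = 0
        return False
    if abs(L2 - L1) > 1:        # Delta L = 0, +-1
        return False
    if abs(J2 - J1) > 1:        # Delta J = 0, +-1
        return False
    if J1 == 0 and J2 == 0:     # J = 0 -> J = 0 is excluded
        return False
    return True

singlet_P1 = (0, 1, 1)
triplet_P1 = (1, 1, 1)
triplet_P0 = (1, 1, 0)
singlet_S0 = (0, 0, 0)
triplet_S1 = (1, 0, 1)

print(allowed_in_LS(singlet_P1, singlet_S0))  # singlet to singlet
print(allowed_in_LS(triplet_P1, singlet_S0))  # triplet to singlet
print(allowed_in_LS(triplet_S1, triplet_P0))  # triplet to triplet
```

A transition between a triplet and a singlet level fails the ∆S = 0 test, so when such a line is observed it signals that LS coupling holds only approximately for that atom.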

A nice illustration of the energy levels and the allowed transitions is
provided by the mercury atom (Fig. 5.6). This atom has two valence electrons
both of which are in the 6s shell for the ground state. In the excited state,
one of the electrons will go into n′l state. The energy levels for each configuration
are essentially those given in Fig. 5.3 except for l = 0 and n′≠ n in which
case there are only two states S = J = 0 and S = J = 1, and for l = 0 and
n′ = n in which case only the S = J = 0 state exists. The energy levels
according to the LS coupling scheme and the observed transitions are shown
in Fig. 5.6 where the energies are so normalised that the energy of the
singly-ionized state is zero. It should be noted that the transition (6s)(6p) $^3P_1$
→ (6s)(6s) $^1S_0$ violates the selection rule ∆S = 0 for LS coupling. Its
observation is due to the fact that all the atoms with S = 1 will go down to
the 6 $^3P_1$ level, this being the lowest-energy triplet state, which therefore can
have a high population density, and that the LS coupling scheme is only an
approximate scheme, i.e. the levels contain mixtures of S = 0 and S = 1
terms. The discussion for mercury can be directly extended to the helium
atom which has a ground-state configuration of (1s)2.
Some Regularities in Atomic Spectra
It is clear that the structures of atomic energy levels and their spectra,
are in general quite complicated. There are, however, some observed
regularities which can be understood in terms of the electronic structure of
the atoms.
Fig. 5.6 Some energy levels and allowed transitions for the mercury atom
(the fine structure due to the spin-orbit interaction removing
the J degeneracy is not shown).

1. Rydberg series: As mentioned in Sec. 4.6, an atomic electron in a highly
excited state is influenced mainly by the net charge of the ion. Consequently,
its energy levels approach those of the hydrogen atom. For example, the
energy of the Na valence electron is –0.5461 eV in the n = 5, l = 2 state
and –0.5432 eV in the n = 5, l = 3 state, whereas that of a hydrogenic
electron in the n = 5 state is –0.5430 eV. This means that the spectral lines
corresponding to transitions between states with large quantum numbers
approach those of the hydrogen atom. This result is implicit in the observation
of Rydberg that many of the spectral series can be expressed in the form

$$
\nu = R \left[ \frac{1}{(n' - \delta')^2} - \frac{1}{(n - \delta)^2} \right], \qquad n > n' \tag{5.40}
$$

where δ and δ′ are small constants.


2. Spectra of a chemical group: Atoms belonging to the same chemical
group, for example, the alkali atoms Na, K, Rb, Cs, have a similar valence
structure though their cores are different. It is, therefore, observed that
their spectra, particularly their fine structure, corresponding to transitions
between states with the same quantum numbers, are qualitatively similar.
3. Isoelectronic sequences: Atoms and ions with the same electronic
configuration form what is known as an isoelectronic sequence, e.g.
Ar, K+, Ca++. The members of the sequence differ only in the charge of the
nucleus. As such, their spectra will be similar, except that their frequencies
increase systematically with the increase in the charge of the nucleus.
This was noted in Sec. 4.6 for H, He+, Li++.
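
The size of the constants δ in the Rydberg formula, Eq. (5.40), can be estimated from the sodium energies quoted above. The following is a minimal sketch, assuming each level is written as E = E_H n²/(n − δ)², with the hydrogenic n = 5 energy E_H taken from the text; the function and variable names are illustrative only.

```python
import math

# Hydrogenic n = 5 energy quoted in the text (eV), used as the reference level.
E_HYDROGEN_N5 = -0.5430

def quantum_defect(n, energy_ev, hydrogenic_ev):
    """Solve energy = hydrogenic * n**2 / (n - delta)**2 for delta."""
    n_effective = n * math.sqrt(hydrogenic_ev / energy_ev)
    return n - n_effective

# Na valence-electron energies quoted in the text for n = 5.
delta_d = quantum_defect(5, -0.5461, E_HYDROGEN_N5)  # l = 2
delta_f = quantum_defect(5, -0.5432, E_HYDROGEN_N5)  # l = 3
print(f"delta(l=2) = {delta_d:.4f}, delta(l=3) = {delta_f:.4f}")
```

Both values come out much smaller than unity, and the l = 3 value is the smaller of the two, as expected for an orbit that penetrates the core less.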

5.5 X-RAY SPECTRA


In Sec. 5.4, atomic spectra governed by the outer or valence electrons were
discussed. At the other extreme are the x-ray spectra which are produced by
the electrons in the inner shells of heavy atoms. For these electrons, the interaction
with the nucleus dominates over other interactions, which allows a relatively
simple description of the x-ray spectra.
X-rays are produced when solid targets are bombarded with fast electrons.
An x-ray tube consists of an evacuated bulb in which a heated cathode serves
as a source of electrons (thermionic emission). The electrons are accelerated
through a potential difference of the order of 50 kV, and made to strike the
anode made of heavy metals (Cu, W, Pt, etc.). While most of the electron
energy is liberated at the anode in the form of heat (the anode, therefore, has
to be cooled), about 1 to 3% of it is converted into high-frequency radiation,
the x-ray radiation, which has a wavelength of the order of 10⁻¹⁰ m.
The spectrum of x-rays from an x-ray tube is a superposition (see Fig. 5.7)
of a continuous spectrum called white radiation and a line spectrum called
the characteristic spectrum (it characterizes the material of the anode).

[Figure: dI/dν plotted against ν, with the continuous spectrum ending at ν_max = eV/h.]

Fig. 5.7 The spectral distribution of intensity per unit frequency showing
two characteristic lines superposed over a continuous spectrum.

When the high-velocity electrons reach the anode, they are subjected to
large accelerations in the vectorial sense by the strong electrostatic interaction
with the nuclei of the anode, which causes them to emit electromagnetic radiation.
This radiation due to deceleration, called bremsstrahlung, forms the continuous
radiation of the x-rays from an x-ray tube. As might be expected from this
mechanism, the maximum energy of the radiation which an electron of energy eV
(V is the voltage difference in the tube) can emit is eV, so that the
highest frequency in the continuous spectrum is

$$
\nu_{\max} = \frac{e}{h} V \tag{5.40a}
$$
This relation provides a means of obtaining an accurate measurement of the
ratio e/h. As V increases, the intensity increases at all frequencies, and ν_max
increases in proportion to V. It is also found that as the nuclear charge Z increases
(V being the same), ν_max remains unaltered but the intensity increases (since the
decelerating forces increase with Z).
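
As a numerical illustration of Eq. (5.40a), the short sketch below evaluates the cut-off frequency and the corresponding shortest wavelength for the 50 kV tube voltage mentioned earlier. The constants are the SI values of h, e and c; the function names are illustrative only.

```python
H_PLANCK = 6.62607015e-34    # J s
E_CHARGE = 1.602176634e-19   # C
C_LIGHT = 2.99792458e8       # m/s

def nu_max(voltage):
    """Highest frequency of the continuous spectrum, Eq. (5.40a)."""
    return E_CHARGE * voltage / H_PLANCK

def lambda_min(voltage):
    """Shortest wavelength, c / nu_max."""
    return C_LIGHT / nu_max(voltage)

v_tube = 50e3  # volts
print(f"nu_max = {nu_max(v_tube):.3e} Hz")
print(f"lambda_min = {lambda_min(v_tube) * 1e10:.3f} angstrom")
```

The cut-off wavelength of about a quarter of an ångström is consistent with the statement that x-ray wavelengths are of the order of 10⁻¹⁰ m.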
In contrast to the continuous spectrum, the line spectrum is independent of
the accelerating voltage V, but depends only on the material of which the anode
is made. When the fast-moving electrons strike the atoms of the anode, they
will occasionally knock out an electron in one of the inner shells, creating a
vacancy there. Subsequently, an electron from an outer shell will undergo a
transition to the vacant level by emitting a photon of energy equal to the difference
in the energies of the two levels. This gives rise to the observed characteristic
line spectrum whose frequencies depend only on the energy levels of the atoms
in the anode.
Emission Spectrum
The line spectrum of x-rays is due to transitions between states in the inner
shells of heavy metals (Z > 30), for which the nuclear interaction is dominant.
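Therefore, for these states, it is reasonable to use an approximate Hamiltonian

$$
H_i = \frac{p_i^2}{2m} - \frac{Ze^2}{4\pi\varepsilon_0 r_i} + H' \tag{5.41}
$$

$$
H' = \frac{(Z - 1)e^2}{4\pi\varepsilon_0 r_i}\left[1 - \exp(-r_i/b)\right] + \frac{Ze^2}{8\pi\varepsilon_0 m^2 c^2}\left(\frac{\mathbf{s}_i \cdot \mathbf{l}_i}{r_i^3}\right) \tag{5.42}
$$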
where the first term in H′ is the screening interaction given in Eq. (5.18), and
the second term is the spin-orbit interaction. Treating H′ perturbatively, [see Eq.
(3.125)], the energy levels are

$$
E_{n,l} = E_n^{(0)} + \frac{(Z - 1)e^2}{4\pi\varepsilon_0}\left[\frac{Z}{a_1 n^2} - \left\langle \frac{e^{-r/b}}{r} \right\rangle\right] + \frac{Z^2\alpha^2 |E_n^{(0)}|}{4n^2}\left[3 - \frac{4n}{j + 1/2}\right] \tag{5.43}
$$
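where the last term includes the relativistic corrections discussed in Sec. 4.4 and

$$
\left\langle \frac{e^{-r/b}}{r} \right\rangle = \frac{Z}{a_1}\,\frac{(n + l)!\,(2b_0)^{2l+2}}{n^{2l+4}\,(2l + 1)!\,N!}\,(1 + 2b_0/n)^{-2n}\,F(-N, -N, 2l + 2, 4b_0^2/n^2) \tag{5.44}
$$

where

$$
F(\alpha, \alpha, \beta, x) \equiv 1 + \frac{\alpha^2 x}{\beta\, 1!} + \frac{[\alpha(\alpha + 1)]^2 x^2}{\beta(\beta + 1)\, 2!} + \dots \tag{5.45}
$$

with b₀ = bZ/a₁ (a₁ is the radius of the first Bohr orbit with Z = 1),
and N = n – l – 1. This expression is quite simple for n = 1 and 2:

$$
\left\langle \frac{e^{-r/b}}{r} \right\rangle =
\begin{cases}
\dfrac{Z}{a_1}\left(\dfrac{2b_0}{2b_0 + 1}\right)^2 & \text{for } n = 1 \\[2ex]
\dfrac{Z}{4a_1}\,\dfrac{b_0^2\,(2 + b_0^2)}{(1 + b_0)^4} & \text{for } n = 2,\ l = 0 \\[2ex]
\dfrac{Z}{4a_1}\left(\dfrac{b_0}{1 + b_0}\right)^4 & \text{for } n = 2,\ l = 1
\end{cases}
\tag{5.46}
$$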
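
The general expression in Eq. (5.44) can be checked against the special cases of Eq. (5.46) numerically. The sketch below is one way to do this, assuming the series of Eq. (5.45) terminates because its first argument is a non-positive integer; results are in units of Z/a₁, and the function names are illustrative only.

```python
from math import factorial

def hyper_f(a, beta, x, max_terms=50):
    """F(a, a, beta, x) of Eq. (5.45); the series ends when a is 0, -1, -2, ..."""
    total, term = 1.0, 1.0
    for k in range(max_terms):
        term *= (a + k) ** 2 * x / ((beta + k) * (k + 1))
        if term == 0.0:
            break
        total += term
    return total

def screened_inverse_r(n, l, b0):
    """<exp(-r/b)/r> of Eq. (5.44), in units of Z/a1."""
    big_n = n - l - 1
    prefactor = factorial(n + l) * (2 * b0) ** (2 * l + 2) / (
        n ** (2 * l + 4) * factorial(2 * l + 1) * factorial(big_n))
    return (prefactor * (1 + 2 * b0 / n) ** (-2 * n)
            * hyper_f(-big_n, 2 * l + 2, 4 * b0 ** 2 / n ** 2))

def closed_form(n, l, b0):
    """Special cases of Eq. (5.46), in units of Z/a1."""
    if n == 1:
        return (2 * b0 / (2 * b0 + 1)) ** 2
    if (n, l) == (2, 0):
        return b0 ** 2 * (2 + b0 ** 2) / (4 * (1 + b0) ** 4)
    return (b0 / (1 + b0)) ** 4 / 4

b0 = 0.80 * 29 ** (2 / 3)  # illustrative value of b0 for Z = 29
for n, l in [(1, 0), (2, 0), (2, 1)]:
    print(n, l, screened_inverse_r(n, l, b0), closed_form(n, l, b0))
```

For very large b₀ (no screening) the general expression tends to 1/n², i.e. to the unscreened hydrogenic value of ⟨1/r⟩ in these units.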
A more detailed analysis (based on what is called the Thomas-Fermi
model) indicates that the screening parameter b has the form
b = c a₁ Z^(–1/3) (5.47)
and the experimental fits give the result b₀ ≈ 0.80 Z^(2/3).
In x-ray spectroscopy, the shells are designated by the capital letters K, L,
M, etc. corresponding to the principal quantum number n = 1, 2, 3, etc.
respectively, and the subshells by the subscripts I, II, III, etc. in the order of
increasing energy. For example, in the K shell (n = 1) there is only one level,
whereas in the L shell (n = 2) there are three subshells, LI (n = 2, l = 0),
LII (n = 2, l = 1, j = 1/2) and LIII (n = 2, l = 1, j = 3/2). The energy levels of the K and L
shells of some elements, obtained from the expression in Eq. (5.43), are given in
Table 5.3. The agreement between the predicted values and the experimental
values is generally good (except in the case of E_LII − E_LI for heavy elements),
especially considering the fact that the predictions use only first-order perturbative
calculations. It is important to observe that the spin-relativity separation
(e.g. E_LIII − E_LII) increases rapidly as Z increases, whereas the screening
separation (e.g. E_LII − E_LI) increases only gradually. This supports our earlier
statement that the spin-orbit interaction, and hence the j-j coupling, becomes
important for large-Z elements.

Table 5.3 The energy levels (in keV) of the K and L shells for some atoms,
obtained from Eq. (5.43), with the experimental values in brackets

Element   −E_K              −E_LI            E_LII − E_LI     E_LIII − E_LII
Cu        8.908 (8.996)     0.853 (1.104)    0.122 (0.149)    0.032 (0.020)
Mo        20.06 (20.04)     2.53 (2.87)      0.173 (0.239)    0.141 (0.103)
Ag        25.56 (25.56)     3.46 (3.81)      0.192 (0.282)    0.221 (0.173)
W         70.01 (69.64)     11.79 (12.12)    0.285 (0.556)    1.358 (1.340)
Pb        88.24 (88.16)     15.51 (15.89)    0.310 (0.661)    2.048 (2.170)
U         114.74 (115.80)   21.11 (21.80)    0.341 (0.821)    3.245 (3.781)
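
The rapid growth of the spin-relativity separation with Z can be made explicit. In Eq. (5.43) the screening term depends only on n and l, so it cancels in the difference between the LIII and LII levels (both have n = 2, l = 1), leaving only the last term. The sketch below evaluates that difference, assuming |E₀| = 13.6057 eV and α = 1/137.036; the function names are illustrative only.

```python
ALPHA = 1 / 137.036   # fine-structure constant (assumed value)
E0_EV = 13.6057       # hydrogen ground-state binding energy in eV (assumed value)

def fine_structure_shift_ev(z, n, j):
    """Last term of Eq. (5.43), with |E_n^(0)| = Z^2 |E0| / n^2."""
    e_n = z ** 2 * E0_EV / n ** 2
    return z ** 2 * ALPHA ** 2 * e_n / (4 * n ** 2) * (3 - 4 * n / (j + 0.5))

def l3_l2_splitting_kev(z):
    """E_LIII - E_LII in keV: the j = 3/2 and j = 1/2 levels of n = 2."""
    diff_ev = fine_structure_shift_ev(z, 2, 1.5) - fine_structure_shift_ev(z, 2, 0.5)
    return diff_ev / 1000

for name, z in [("Cu", 29), ("Mo", 42), ("Ag", 47), ("W", 74), ("Pb", 82), ("U", 92)]:
    print(f"{name}: {l3_l2_splitting_kev(z):.3f} keV")
```

The difference reduces to Z⁴α²|E₀|/16, which is why the separation rises so steeply with Z; the printed values agree with the predicted entries in the last column of Table 5.3 to within about 0.001 keV.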
When the bombarding electrons have sufficient energy to knock out inner-shell
electrons, vacancies are created in the inner shells. The electrons from
outer shells undergo transitions to these vacant states, emitting photons whose
energy is equal to the difference in the energies of the two levels:
hν = Ei – Ef (5.48)
Ei and Ef being the initial and the final energies of the states. This gives rise
to the observed characteristic spectrum of the x-rays. The allowed transitions
satisfy the usual selection rules for electric dipole transitions
∆l = ± 1
∆j = ± 1, 0, but not j = 0 → j = 0 (5.49)
∆n = unrestricted
X-ray spectra are grouped into several series, the K series for transitions to
the K shell (i.e. n = 1), the L series for transitions to the L shell (i.e. n = 2), etc.
Within each series, the lines are characterized by the indices α, β, γ, etc. according
to decreasing intensity, e.g. Kα for the transition from the L shell to the K shell,
Kβ for the transition from the M shell to the K shell, etc. The multiplets are given
an additional number index, e.g. Kα1 for LIII → K, Kα2 for LII → K, etc. The first
few allowed transitions are shown in Fig. 5.8.

[Figure: energy-level diagram showing the K level, the L subshells (LI 2s1/2, LII 2p1/2, LIII 2p3/2) and the M subshells (MI 3s1/2, MII 3p1/2, MIII 3p3/2, MIV 3d3/2, MV 3d5/2), with the K-series and L-series transitions marked.]

Fig. 5.8 Schematic illustration of the transitions for some K and L lines in the x-ray spectrum.
Moseley’s Law
The relation between the x-ray frequencies and the Z values of the atoms was
investigated by Moseley (1913-1914). It was found that the square root of the
frequency is essentially a linear function of Z. This is seen in the plot of
(hν/E0)^(1/2) against Z (Fig. 5.9), where it is also observed that the intercept on the
Z-axis is of the order of unity for the K lines. Thus,

$$
(h\nu/E_0)^{1/2} = c\,(Z - \sigma) \tag{5.50}
$$

where c is a constant and σ is called the shielding factor. This relation is known as
Moseley’s law. This law can be discussed in terms of the one-electron energy levels including
the screening effect. If (Z – σi) e and (Z – σf) e are the effective screened
charges for the initial and the final states, the frequency of the radiation for
a transition between these states is given by

$$
h\nu \approx |E_0| \left[ \frac{(Z - \sigma_f)^2}{n_f^2} - \frac{(Z - \sigma_i)^2}{n_i^2} \right] \tag{5.51}
$$

where E0 is the energy of the ground state of the hydrogen atom. In conformity with the
experimentally observed relation in Eq. (5.50), this expression may be written in the
approximate form

$$
h\nu \approx |E_0| \left[ \frac{1}{n_f^2} - \frac{1}{n_i^2} \right] (Z - \sigma)^2 \tag{5.52}
$$

It is found that σn ≈ 1 for the ground state n = 1 and σn ≈ 7.5 for n = 2. It may also be
noted that the separation between the Kα1 and Kα2 lines, being related to the
spin-relativity separation, increases rapidly as Z increases.

[Figure: (hν/E0)^(1/2) plotted against Z for Z from 0 to 80, showing nearly straight lines for Kα1 and Kα2.]

Fig. 5.9 Moseley diagram for the plot of (hν/E0)^(1/2) against Z for the Kα lines.
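
A quick estimate from Eq. (5.52) shows the kind of numbers Moseley's law gives. The sketch below assumes |E₀| = 13.6057 eV and uses σ = 1 for the Kα line (n_i = 2 to n_f = 1); the function name and the choice of elements are illustrative only.

```python
import math

E0_EV = 13.6057  # hydrogen ground-state binding energy in eV (assumed value)

def moseley_energy_kev(z, n_f, n_i, sigma):
    """Photon energy from Eq. (5.52), in keV."""
    return E0_EV * (1 / n_f ** 2 - 1 / n_i ** 2) * (z - sigma) ** 2 / 1000

for name, z in [("Cu", 29), ("Mo", 42), ("Ag", 47), ("W", 74)]:
    energy = moseley_energy_kev(z, 1, 2, 1.0)
    # sqrt(h nu / E0) is linear in Z, with slope sqrt(3/4) for the K-alpha line.
    root = math.sqrt(energy * 1000 / E0_EV)
    print(f"{name}: K-alpha ~ {energy:.2f} keV, sqrt(h nu/E0) = {root:.2f}")
```

For copper this gives about 8.0 keV, and the square root of hν/E₀ rises by the same amount for each unit increase in Z, which is the straight line of the Moseley diagram.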
X-ray Absorption Spectrum
X-rays can pass through matter. The intensity is reduced in the process, the
reduction depending upon the nature of the material (which forms the basis of
many practical applications), and on the frequency. High frequency x-rays are
generally absorbed less than low frequency x-rays.
The amount of absorption of x-rays by a given material is studied in terms
of the mass absorption coefficient which is defined by the relation
dI = – µ ρ I dx (5.53)
where dI is the reduction in intensity while passing through a thickness dx of a
material of density ρ, I is the intensity of x-rays of frequency ν, and µ is the
frequency-dependent mass absorption coefficient. The relation can be
integrated to give

$$
I(x) = I_0\, e^{-\mu\rho x} \tag{5.54}
$$

where I0 is the intensity of the incident radiation. The absorption characteristics
can be discussed in terms of the frequency dependence of µ (Fig. 5.10).
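
The exponential law of Eq. (5.54) is easy to evaluate. The sketch below computes the transmitted fraction and the thickness that halves the intensity; the values of µ and ρ are purely hypothetical numbers chosen for illustration, not data for any particular material.

```python
import math

def transmitted_fraction(mu, rho, x):
    """I(x)/I0 from Eq. (5.54); mu in m^2/kg, rho in kg/m^3, x in m."""
    return math.exp(-mu * rho * x)

def half_value_thickness(mu, rho):
    """Thickness at which the intensity drops to half, ln 2 / (mu rho)."""
    return math.log(2) / (mu * rho)

mu_example = 0.02     # m^2/kg, hypothetical mass absorption coefficient
rho_example = 2700.0  # kg/m^3, hypothetical density

x_half = half_value_thickness(mu_example, rho_example)
print(f"half-value thickness = {x_half * 1000:.2f} mm")
print(f"fraction after 3 half-value layers = "
      f"{transmitted_fraction(mu_example, rho_example, 3 * x_half):.3f}")
```

Because the attenuation depends only on the product µρx, each additional half-value thickness halves the intensity again.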
As the frequency is lowered from a high value, the absorption coefficient
µ increases gradually (low-frequency x-rays are less penetrating), till a frequency
ν_K is reached when it suddenly drops to a low value. As the frequency is further
lowered, µ increases, but drops again at the frequency ν_LI. This process continues
but becomes obscured at very low frequencies. The edges in the plot of µ at the
different frequencies (Fig. 5.10) are known as the K-absorption edge, LI-absorption
edge, etc.

[Figure: (a) µ(ν) against frequency, showing the LI, LII, LIII and K absorption edges at ν_L and ν_K; (b) x-ray emission intensity against frequency, showing the L-series and K-series lines.]

Fig. 5.10 A comparison of absorption and emission characteristics of the
x-rays (only a few illustrative lines are shown).
The main contribution to the attenuation of x-rays is the photoelectric
absorption of the photons with the emission of electrons. As the frequency is
lowered below ν_K, the photon energy is not sufficient to eject a K-shell electron.
The closure of this channel lowers the absorption coefficient µ and gives rise to
the K-absorption edge. When the frequency is lower than ν_LI, the electrons in the
LI shell also cannot be ejected, giving rise to the LI absorption edge. Thus, there
are three absorption edges corresponding to the LI, LII and LIII subshells, five
absorption edges corresponding to the M shell (see Fig. 5.8), etc. The frequencies
of these absorption edges are given by
hv = | E | (5.55)
where E is the energy of the K shell, the LI shell, etc. Comparison of this
expression with that in Eq. (5.48) for characteristic emission frequencies shows
that the frequency of the absorption edge is always greater than the corresponding
characteristic emission frequencies (this may be observed in Fig. 5.10). The
reason for this is that emission lines correspond to transitions between two shells
(or subshells). On the other hand, the frequency of the absorption edge
corresponds to a transition between the lower shell or subshell and the continuum
states. In comparison with Eq. (5.51), the frequency of the absorption edges is
given by

$$
h\nu \approx |E_0|\, \frac{(Z - \sigma_f)^2}{n_f^2} \tag{5.56}
$$
which is greater than the corresponding characteristic emission frequency in
Eq. (5.51), and is equal to the limit of the characteristic emission frequency for
ni → ∞.
Auger Effect
So far, it has been assumed that the vacancy in an inner shell, say the K shell, is
filled by an electron from an outer shell, along with the emission of a photon. It
is, however, found experimentally that the fluorescence yield w, defined as

$$
w = \frac{n_p}{n_e} \tag{5.57}
$$

n_p being the number of K photons and n_e being the number of K electrons knocked
out (i.e. the number of K-shell vacancies), is smaller than 1, ranging from a value of
0.1 for light elements to about 0.95 for uranium.
An explanation of the above observation was found by Auger (1925) who
noted that the ejection of the K-shell electron is often accompanied by the ejection
of another electron. This is due to the fact that the electron which undergoes a
transition to the vacant K shell may knock out another electron usually from the
same shell, i.e. its initial shell. This is known as an Auger transition and the
emitted electron is known as an Auger electron. It should be emphasized that
the Auger electron is not knocked out by the photo-electric absorption of a photon
emitted by the electron which undergoes a transition to the K shell, but emerges
directly as a part of the process of readjustment of the atom. For example, the
vacancy in the K shell may be filled by an electron in the LI shell and the electron
in the LII shell may be knocked out, with the result that there will be two vacancies
in the L shell. Thus, the de-excitation of the atom may be accompanied either by
the emission of a photon (characteristic radiation) or an electron (Auger electron).
The two processes together essentially account for the number of vacancies in
the K shell. Finally, it is noted that the basic process in the Auger effect is also
known as auto-ionization or internal conversion (in nuclear transitions).

5.6 MOLECULAR BONDING


When atoms approach one another, attractive forces come into play which
generally, though not always, bind them into molecules. The mechanism
generating these attractive forces differs from molecule to molecule, but is usually
a combination of (i) van der Waals forces, (ii) ionic (or heteropolar) bonds, and
(iii) covalent (or homopolar) bonds. In most cases, however, the ionic bonds or
covalent bonds are dominant. Here, a brief discussion of these mechanisms is
given for diatomic molecules. It is to be noted that the molecular forces arise
primarily from the interaction of the outer electrons.
Van der Waals Forces
When two atoms approach each other, though neutral, they induce fluctuating
but correlated electric dipole moments in each other. This gives rise to a
dipole-dipole attractive potential which is of the form –a/r⁶. When the atoms are so
close that the electronic orbits overlap, the Pauli exclusion principle forces them
into higher orbits which essentially brings in strong repulsive forces. The effective
potential may be written in the form:

$$
V(r) = \frac{b}{r^n} - \frac{a}{r^6} \tag{5.58}
$$
where a and b are positive numbers and n is a large number (~ 10). This represents
what is known as van der Waals interaction. Of course, when the nuclei are very
close to each other, the dominant force is the repulsion between the nuclei. However,
the forces in this region are not important for molecular bonding.
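
The position and depth of the minimum of Eq. (5.58) follow from setting dV/dr = 0, which gives r₀ = (nb/6a)^(1/(n−6)). The sketch below evaluates this for n = 12 with a = b = 1 in arbitrary units; the choice n = 12 (the familiar Lennard-Jones form) and the unit values are assumptions made only for illustration.

```python
def vdw_minimum(a, b, n):
    """Return (r0, V(r0)) for V(r) = b/r**n - a/r**6, with n > 6."""
    r0 = (n * b / (6 * a)) ** (1 / (n - 6))
    v0 = b / r0 ** n - a / r0 ** 6
    return r0, v0

r0, v0 = vdw_minimum(a=1.0, b=1.0, n=12)
print(f"r0 = {r0:.4f}, V(r0) = {v0:.4f}")
```

At the minimum the repulsive term equals 6/n of the attractive term, so the well depth is (a/r₀⁶)(1 − 6/n).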
Ionic Bonds
These bonds are formed when it is energetically favourable for electrons to be
transferred from one atom to another, with the resulting ions held together by
electrostatic attraction between them. Such bonds are known as ionic or
heteropolar bonds and are exemplified by the bonds in NaCl, KBr, HCl, etc.
As a specific case, we consider the KCl molecule. The K atom has its
valence electron in the 4s shell with a binding energy of only 4.34 eV. Now, a Cl
atom which has five valence electrons in the 3p shell, can attract another electron
(because of its incomplete shell) and bind it with a binding energy of 3.80 eV.
However, if an electron is transferred from a K atom to the Cl atom, resulting in
K+ and Cl– ions, there will be an additional electrostatic attraction between the
ions. Including the van der Waals repulsion (the –1/r⁶ attraction may be neglected
as compared to the electrostatic attraction), the energy of the system is

$$
E = -3.80 - \frac{14.4}{r} + \frac{b}{r^n} \tag{5.59}
$$
where E is expressed in eV and r is in Å (the small kinetic energy of the atoms
has not been included). If this energy is less than –4.34 eV (the binding energy
of the electron in the K atom), then it is favourable for the electron from the
K atom to be transferred to the Cl atom, with the resulting ions held together by
the electrostatic attraction between them. This gives rise to ionic bonding. The
details are shown in Fig. 5.11, the system having a minimum energy of
–8.76 eV at a separation of 2.79 Å. It is observed that the dissociation energy, i.e.
the energy required to separate the KCl molecule into K and Cl atoms, is
(8.76 – 4.34) eV, i.e. 4.42 eV.
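
The numbers quoted for KCl can be combined in a few lines. The sketch below uses only the values given in the text and neglects the repulsive term when estimating the separation at which the ion pair first becomes favourable; the variable names are illustrative only.

```python
ELECTRON_AFFINITY_CL = 3.80  # eV, binding energy of the extra electron in Cl-
IONIZATION_K = 4.34          # eV, binding energy of the 4s electron in K
COULOMB = 14.4               # eV angstrom, coefficient of the 1/r term in Eq. (5.59)
R0 = 2.79                    # angstrom, equilibrium separation
E_MIN = -8.76                # eV, minimum energy from Fig. 5.11

def ion_pair_energy_no_repulsion(r):
    """Eq. (5.59) without the b/r^n term; r in angstrom, result in eV."""
    return -ELECTRON_AFFINITY_CL - COULOMB / r

# Separation below which K+ and Cl- have lower energy than neutral K and Cl.
crossing_radius = COULOMB / (IONIZATION_K - ELECTRON_AFFINITY_CL)

# Energy needed to separate the molecule into neutral atoms.
dissociation_energy = -E_MIN - IONIZATION_K

# Size of the repulsive term b/r0^n implied by the quoted minimum.
repulsion_at_r0 = E_MIN - ion_pair_energy_no_repulsion(R0)

print(f"crossing radius ~ {crossing_radius:.1f} angstrom")
print(f"dissociation energy = {dissociation_energy:.2f} eV")
print(f"repulsive term at r0 ~ {repulsion_at_r0:.2f} eV")
```

The electron transfer already becomes favourable at a separation of roughly 27 Å, far outside the equilibrium distance, while the repulsive term contributes only about 0.2 eV at the minimum.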
Covalent Bonds
In some cases, the valence electrons of the atoms have no particular preference
for either of the two atoms, and are shared by both the atoms. This is especially
true in the case of identical atoms forming molecules, e.g. H₂, O₂, N₂, etc. The
bonds resulting from the sharing of the valence electrons are known as covalent
or homopolar bonds.

[Figure: E (in eV) plotted against r (in Å) from 0 to 8 Å, with the levels –3.8 eV and –4.34 eV marked and the minimum of –8.76 eV at r0 = 2.79 Å.]

Fig. 5.11 The plot of E in eV against the distance of separation r in Å,
between K+ and Cl– ions, for the KCl molecule.
Consider a particle moving in the presence of two similar, one-dimensional,
attractive potentials V1 and V2 which are centred at positions x1 and x2. If the
positions x1 and x2 are separated by a large distance d, the ground state energy
E0 will be essentially degenerate, the degenerate eigenstates being ψ1 and ψ2
which are eigenstates with only V1 or V2 being present, respectively. As the
separation distance d decreases, one may consider as possible eigenstates,

$$
\psi_\pm = \frac{1}{\sqrt{2}}\,(\psi_1 \pm \psi_2) \tag{5.60}
$$
where the overlap integral is ignored in the normalization. If it is also assumed
that ψ1 is small at x2 and ψ2 is small at x1, the expectation value of the energy is

$$
E_\pm \approx E_0 \pm \int \psi_1^*(x)\, V_2\, \psi_2(x)\, dx \tag{5.61}
$$

Since the potential is attractive (V1,2 may be taken to be negative), the integral
is expected to be negative. Therefore, the ground state splits into two states, the
symmetric state ψ+ with energy E+ and the antisymmetric state ψ– with energy
E–. For the symmetric state, which has a lower energy, it is seen that the
wavefunction is larger in between the centres of potentials than it is on the
outside.
Taking the above model to be reasonable, it is seen that when two hydrogen
atoms are brought together, the two valence electrons prefer to be in between
the two atoms, and have a lower energy than when they are attached to the
atoms in isolation. The equilibrium situation is reached at some finite distance of
separation (the repulsion term in Eq. (5.58) becomes important at short distances),
in which the electrons are shared by the two atoms and the electrons prefer to
be in between the atoms. This gives rise to what are known as covalent or
homopolar bonds. The equilibrium distance in the case of the H2 molecule is
about 0.74 Å and the corresponding binding energy is 4.75 eV. It is important to
note that in covalent bonds, since both the electrons prefer to be in between the
atoms, the spatial wave function is symmetric under the exchange of the two
electrons. It is, therefore, required by Pauli’s exclusion principle that the spins
must point in opposite directions and the total spin of the two electrons must be
zero. The state with parallel spins is antisymmetric in the spatial part of the
wave function, and will in general have a higher energy (see Fig. 5.12). The
antiparallel spin state is called the bonding state and the parallel spin state is
called the antibonding state. The situation is similar for other diatomic molecules.

[Figure: E (in eV) plotted against r (in Å) for the antibonding state (parallel spins) and the bonding state (antiparallel spins), the latter with a minimum of –4.75 eV at r0 = 0.74 Å.]

Fig. 5.12 The energy of the H₂ molecule for the bonding and antibonding states.

5.7 MOLECULAR SPECTRA


In Sec. 5.6, the interatomic forces which lead to molecular bonding were
discussed. Apart from the ground state, a molecule can be in a higher energy
state, and transitions between the various energy levels give rise to the observed
molecular spectra.
The energy of a diatomic molecule arises from different modes: (i) The
electronic configuration of the electrons in the molecule, (ii) the vibration of the
atoms about the equilibrium position, and (iii) the rotation of the molecule about
its centre of mass. For an approximate consideration of molecular spectra, the
total energy may be written as a sum of three components, and their excitations
treated independently:
E = Ee + Ev + Er (5.62)
Electronic excitations involve the largest changes in energy, but are also the
most difficult to deduce from theory. Here only the vibrational and rotational
energies of the molecule will be considered.
If the molecule is treated as consisting of two point masses, the nuclei, the
energy in Figs. 5.11 and 5.12 for the bonding state, may be regarded as the
potential energy of these point masses. Expanding this potential energy about
the minimum,

$$
V(r) = V_0 + \frac{1}{2} \left. \frac{d^2 V}{dr^2} \right|_{r = r_0} (r - r_0)^2 + \dots \tag{5.63}
$$

where the constant term only defines the zero of the energy of the system.
Neglecting the higher order terms in the expansion, the Hamiltonian for vibrational
and rotational motion is

$$
H_{vr} \approx \frac{1}{2M}\, p_r^2 + \frac{1}{2I}\, J^2 + \frac{1}{2}\, k\, (r - r_0)^2 \tag{5.64}
$$

where the first term is the kinetic energy of the vibrational motion (M is the
reduced mass), the second term is that of the rotational motion, J being the
rotational angular momentum (I is the moment of inertia about the centre of
mass), and k is d²V/dr² evaluated at r = r0. The energy eigenvalues of this Hamiltonian are easily
obtained from Eqs. (3.120) and (3.156), leading to the total energy E,

$$
E = E_e + \left(n + \frac{1}{2}\right) \hbar \left(\frac{k}{M}\right)^{1/2} + \frac{\hbar^2}{2I}\, J(J + 1), \qquad n = 0, 1, 2, \dots, \quad J = 0, 1, 2, \dots \tag{5.65}
$$
When the molecule undergoes a transition, there is a change in the energy
of the state. In an emission process, the frequency of the photon is given by

$$
h\nu = E_e - E_e' + (n - n')\, \hbar \left(\frac{k}{M}\right)^{1/2} + \frac{\hbar^2}{2I}\, J(J + 1) - \frac{\hbar^2}{2I}\, J'(J' + 1) \tag{5.66}
$$
It is found, both from theory and experiments, that the separation between
electronic energy levels is of the order of 5 eV while that between vibrational
energy levels is about 1 eV and that between rotational levels is about 10⁻⁵ to 10⁻³
eV. Therefore, for weak excitations, only changes in rotational states are observed
whereas changes in vibrational and electronic states require stronger excitations
to be observed. Here, we will confine our discussion to changes in the rotational
and vibrational states (symmetric molecules such as H₂ require a special
treatment).
Selection Rules
The electric dipole transitions (which are the most prominent transitions) for
vibrational and rotational states, satisfy the selection rules
∆n = 0, ± 1
∆J = ± 1 (5.67)
For transitions with ∆n = 0, the emission frequency is given by

$$
h\nu = \frac{\hbar^2}{2I}\, J(J + 1) - \frac{\hbar^2}{2I}\, (J - 1)J = \frac{\hbar^2}{I}\, J, \qquad J = 1, 2, \dots \quad \text{for } \Delta n = 0 \tag{5.68}
$$
This gives a band of spectral lines (Fig. 5.13), known as the rotational band,
with equally spaced frequencies

$$
\nu = \left(\frac{h}{4\pi^2 I}\right) J \tag{5.69}
$$

The spacing is of the order of 10¹² s⁻¹, which falls in the very far infrared
region. The spacing allows us to evaluate I and hence the equilibrium distance
[r0 = (I/M)^(1/2)]. For HCl, the spacing is ∆ν ≈ 6.2 × 10¹¹ s⁻¹, which gives
I ≈ 2.7 × 10⁻⁴⁷ kg m² and therefore r0 ≈ 1.29 Å.
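
The HCl numbers quoted above can be reproduced from Eq. (5.69). The sketch below assumes atomic masses of 1.008 u for H and 34.97 u for ³⁵Cl, which are not given in the text; the function names are illustrative only.

```python
import math

H_PLANCK = 6.62607015e-34     # J s
ATOMIC_MASS = 1.66053907e-27  # kg

def moment_of_inertia(spacing_hz):
    """I from the line spacing h / (4 pi^2 I) of Eq. (5.69)."""
    return H_PLANCK / (4 * math.pi ** 2 * spacing_hz)

def reduced_mass_kg(m1_u, m2_u):
    """Reduced mass of two atoms with masses given in atomic mass units."""
    return m1_u * m2_u / (m1_u + m2_u) * ATOMIC_MASS

inertia = moment_of_inertia(6.2e11)      # spacing quoted for HCl
mass = reduced_mass_kg(1.008, 34.97)     # assumed masses of H and 35Cl
bond_length = math.sqrt(inertia / mass)  # r0 = (I/M)^(1/2)

print(f"I = {inertia:.2e} kg m^2")
print(f"r0 = {bond_length * 1e10:.2f} angstrom")
```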
For transitions with ∆n = 1,

$$
h\nu = \hbar\, (k/M)^{1/2} \pm \frac{\hbar^2}{I}\, J, \qquad J = 1, 2, \dots \quad \text{for } \Delta n = 1 \tag{5.70}
$$

where the plus sign is for J′ = J – 1 and the minus sign is for J′ = J + 1. This
again gives us a band of spectral lines which have the same spacing as the lines
in the rotational band, but with the centre at ν0 = (1/2π)(k/M)^(1/2) (which has a
value of about 8.67 × 10¹³ s⁻¹ for HCl) and with the central frequency missing.
This is known as the vibrational-rotational band (Fig. 5.13).
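
The central frequency of the vibrational-rotational band fixes the force constant k through ν0 = (1/2π)(k/M)^(1/2). The sketch below evaluates k for HCl and lists the first few lines of the band, assuming atomic masses of 1.008 u and 34.97 u (not given in the text) and the rotational spacing of 6.2 × 10¹¹ s⁻¹ quoted earlier; the function names are illustrative only.

```python
import math

ATOMIC_MASS = 1.66053907e-27  # kg

def reduced_mass_kg(m1_u, m2_u):
    """Reduced mass of two atoms with masses given in atomic mass units."""
    return m1_u * m2_u / (m1_u + m2_u) * ATOMIC_MASS

def force_constant(nu0_hz, mass_kg):
    """k from nu0 = (1 / 2 pi) sqrt(k / M)."""
    return (2 * math.pi * nu0_hz) ** 2 * mass_kg

def band_lines(nu0_hz, nu1_hz, j_max):
    """Frequencies nu0 -/+ nu1 * J for J = 1..j_max; nu0 itself is absent."""
    lower = [nu0_hz - nu1_hz * j for j in range(j_max, 0, -1)]
    upper = [nu0_hz + nu1_hz * j for j in range(1, j_max + 1)]
    return lower + upper

NU0 = 8.67e13  # Hz, band centre quoted for HCl
NU1 = 6.2e11   # Hz, rotational spacing quoted for HCl

k_hcl = force_constant(NU0, reduced_mass_kg(1.008, 34.97))
print(f"k = {k_hcl:.0f} N/m")
for nu in band_lines(NU0, NU1, 3):
    print(f"{nu:.4e} Hz")
```

The list of lines shows the gap of twice the usual spacing at the centre of the band, where the line at ν0 is missing.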
Symmetric Molecules
Symmetrical molecules do not have an electric dipole moment and the associated
dipole transitions, and hence do not exhibit the pure rotational or vibrational-
rotational bands just described. The changes in their states are due to higher-
order effects, so that the radiation emitted is much weaker. These higher-order
transitions obey the selection rules:
∆J = 0, ± 1, ± 2 (5.71)
Since the nuclei of a symmetrical molecule are identical, the total nuclear
wave function must satisfy the requirements of exchange symmetry, i.e. the
total wave function must be symmetric for an integral nuclear spin I, and
antisymmetric for a half-integral nuclear spin I, under the interchange of the
nuclei. The exchange symmetry of the spatial part of the wave function is
determined by the rotational states (i.e. the Y_lm(θ, φ) functions) which for the
exchange of the nuclei (i.e. θ → π – θ, and φ → π + φ) are even for even l and
odd for odd l. Here l plays the role of J. Of the spin states of nuclei with spin I,
there are (I + 1)(2I + 1) states which are even and I(2I + 1) states which are
odd, under the exchange of spin. States with an even spin state are called
ortho-modifications while those with an odd spin state are called para-modifications.
For nuclei with an integral I, the ortho states are associated with even l values
and the para states with odd l values, while for nuclei with a half-integral I, the
ortho states are associated with odd l values and the para states with even
l values. Since nuclear spin has only a weak interaction, it does not change in a
normal transition.

[Figure: rotational levels J = 0, 1, 2, 3 for the vibrational states n = 0 and n = 1, with the resulting line spectra: (a) lines at ν = ν1 J, (b) lines at ν = ν0 ± ν1 J.]

Fig. 5.13 Rotational (a) and vibrational-rotational (b) spectral lines
(spacing of rotational levels is greatly enlarged).
The above discussion implies that the transitions for symmetric molecules
take place only with ∆J even, which in view of Eq. (5.71) implies ∆J = 0, ± 2.
Since the transition frequencies increase with J, and the transitions alternate
between ortho-transitions and para-transitions as J increases, the spectral bands
contain lines which alternately arise from ortho-modifications and
para-modifications. As a result, the intensities of the lines also alternate (for I = 0, one
has only the ortho-modification and the alternate lines will be missing, e.g. O₂).
The intensities at ordinary temperatures are proportional to the number of states
in the two modifications and hence the ratio of the two intensities is

$$
\frac{N_{\text{para}}}{N_{\text{ortho}}} = \frac{I}{I + 1} \tag{5.72}
$$

This relation can be used to determine the nuclear spin.
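
The state counts behind Eq. (5.72) can be tabulated for a few nuclear spins. The sketch below uses exact fractions so that half-integral spins are handled without rounding; the function name is illustrative only.

```python
from fractions import Fraction

def spin_state_counts(nuclear_spin):
    """Return (ortho, para): numbers of even and odd two-nucleus spin states."""
    i = Fraction(nuclear_spin)
    ortho = (i + 1) * (2 * i + 1)
    para = i * (2 * i + 1)
    return ortho, para

for spin in (Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3, 2)):
    ortho, para = spin_state_counts(spin)
    # The two counts must add up to all (2I + 1)^2 product states.
    assert ortho + para == (2 * spin + 1) ** 2
    print(f"I = {spin}: ortho = {ortho}, para = {para}, ratio = {para / ortho}")
```

For I = 1/2 this gives 3 ortho states and 1 para state, and for I = 0 there are no para states at all, which is the case in which alternate lines are missing.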


For the hydrogen molecule, each nucleus has I = 1/2. The ortho-modification has
total nuclear spin 1 (with 3 states) and the para-modification has total nuclear
spin 0 (with 1 state). At room temperature
the two modifications will occur in the ratio of 3 : 1. However, at low temperatures
most of the hydrogen molecules will go into the para state which has a slightly
lower energy, so that its spectral band will have alternate lines missing. When
para-hydrogen is heated, it retains its modification for a long time (several
weeks). As the modification changes, the spectral bands, as well as some physical
characteristics such as the heat capacity, show interesting variations.

5.8 EXAMPLES
Here, a few examples will be discussed to illustrate and extend some of the
ideas introduced in this chapter.

Example 1
Hund’s rule can be used to deduce the ground state of the elements. In particular,
consider the period from Na to Ar.
Sodium has one 3s electron in the valence shell, and hence its ground state
is ²S₁/₂. Magnesium has (3s)² in the valence shell. Since this corresponds to a
closed subshell, its ground state is ¹S₀. For Al, there is one electron in the
3p subshell so that its ground state is ²P₁/₂ (the smallest J value allowed is 1/2).
For Si, the valence shell has (3p)² so that the ground state has S = 1. The largest
allowed orbital angular momentum has L = 1 (since the space part is
antisymmetric, the largest ML corresponds to the two electrons having ml = 1
and ml = 0, so that the largest value of L is 1). Thus, the ground state is ³P₀
(the smallest value of J is zero). For P, the valence shell is (3p)³ so that the ground
state has S = 3/2. The only allowed value of L is L = 0 (the antisymmetric spatial
wave function corresponds to electrons having ml = 1, ml = 0, and ml = –1).
Therefore, the ground state is ⁴S₃/₂.
For sulphur the shell is more than half-filled. Since a closed shell has J = L
= S = 0, it is easier to consider the unfilled shell as hole states (two holes for
sulphur). As in the case of holes in the Dirac sea, these holes may be regarded
as having positive charge. The spin of the two-hole state for sulphur is S = 1, the
orbital angular momentum is L = 1 (as for the two electron state), and J = 2 (the
holes have positive charge so that the constant C in Eq. (5.26) is negative and the
ground state has the largest allowed J value). Therefore, the ground state is denoted
by 3P2. For Cl, there is only one hole, which gives 2P3/2 for its ground state (largest
J value). Finally, for Ar, the subshell is closed, giving its ground state as 1S0.
As an example of two unfilled subshells, consider molybdenum whose unfilled
shells are (4d)5 (5s). The largest spin has S = 3, and the only allowed value of L
is 0 (ml = 2, 1, 0, –1, –2 for the five d-shell electrons) so that the ground state
is 7S3.
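
The single-subshell cases in this example (Na to Ar) can be reproduced with a short script. The following is a minimal sketch, assuming one unfilled subshell of equivalent electrons; the function name `hund_ground_term` and the filling order are illustrative choices, not part of the text.

```python
from fractions import Fraction

L_LETTERS = "SPDFGHIK"  # spectroscopic letters for L = 0, 1, 2, ...

def hund_ground_term(l, n):
    """Ground term of n equivalent electrons in a subshell of orbital quantum number l."""
    capacity = 2 * (2 * l + 1)
    if not 0 <= n <= capacity:
        raise ValueError("n must lie between 0 and 2(2l + 1)")
    # Fill ml = l, l-1, ..., -l with spin up first, then again with spin down.
    ml_order = list(range(l, -l - 1, -1))
    n_up = min(n, 2 * l + 1)
    n_down = n - n_up
    S = Fraction(n_up - n_down, 2)
    L = abs(sum(ml_order[:n_up]) + sum(ml_order[:n_down]))
    # Up to half-filled: smallest J; more than half-filled: largest J.
    J = abs(L - S) if n <= 2 * l + 1 else L + S
    return f"{2 * S + 1}{L_LETTERS[L]}{J}"

# Ground terms for the 3p subshell, Al to Ar.
p_terms = [hund_ground_term(1, n) for n in range(1, 7)]
```

The list `p_terms` follows the sequence derived above for Al, Si, P, S, Cl and Ar. A case with two unfilled subshells, such as molybdenum, is outside the scope of this sketch.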

Example 2
In Sec. 5.4, it was shown that the number of J states for a two-electron system
is the same in LS and in j-j schemes, if one of the electrons has l = 0 or 1. This
result is now extended to l > 1.
Let 1 < l2 < l1 for electrons 1 and 2. The allowed values of L are L = l1 +
l2,..., l1 – l2, while the allowed values of S are S = 0, 1. The corresponding J
values in the LS coupling scheme are:
S = 0 : J = l1 + l2, ..., l1 – l2
S = 1 : J = l1 + l2 + 1, ..., l1 – l2 + 1 (5.73)
J = l1 + l2, ..., l1 – l2
J = l1 + l2 – 1, ..., l1 – l2 – 1
In the j-j coupling scheme, the allowed values of j1 and j2 are l1 ± 1/2 and
l2 ± 1/2 respectively. Therefore, the allowed J values are:
j1 = l1 + 1/2, j2 = l2 + 1/2 : J = l1 + l2 + 1, ..., l1 – l2
j1 = l1 + 1/2, j2 = l2 – 1/2 : J = l1 + l2, ..., l1 – l2 + 1
j1 = l1 – 1/2, j2 = l2 + 1/2 : J = l1 + l2, ..., l1 – l2 – 1
j1 = l1 – 1/2, j2 = l2 – 1/2 : J = l1 + l2 – 1, ..., l1 – l2 (5.74)
It is observed that each J is repeated the same number of times in Eq. (5.73),
i.e. the LS coupling scheme, and in Eq. (5.74) i.e. the j-j coupling scheme. For
two inequivalent electrons but with l1 = l2 > 1, the allowed J values are those
given in Eqs. (5.73) and (5.74) except that the state with J = l1 – l2 – 1 is not
allowed.
For equivalent electrons, l1 = l2 > 1, the allowed J values in the LS coupling
scheme are:
S = 0 : J = l1+ l2, l1 + l2 – 2, ..., 0
S = 1 : J = l1 + l2, l1 + l2 – 2, ...,2 (5.75)
J = l1 + l2 – 1, l1 + l2 – 3, ..., 1
J = l1 + l2 – 2, l1 + l2 – 4, ..., 0
In the j-j coupling scheme, the allowed j1 and j2 values are l1 ± 1/2 and
l2 ± 1/2 respectively. Therefore, the J values are
j1 = l1 + 1/2, j2 = l2 + 1/2 : J = l1 + l2, l1 + l2 – 2, ..., 0
j1 = l1 + 1/2, j2 = l2 – 1/2 or j1 = l1 – 1/2, j2 = l2 + 1/2 : J = l1 + l2, l1 + l2 – 1, ..., 1
j1 = l1 – 1/2, j2 = l2 – 1/2 : J = l1 + l2 – 2, l1 + l2 – 4, ..., 0 (5.76)
It is again observed that each J is repeated the same number of times in the
LS coupling scheme and in the j-j coupling scheme.
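
The counting argument for inequivalent electrons can be checked by brute force. The sketch below is only illustrative: the helper names are assumptions, and it ignores the Pauli restrictions that apply to equivalent electrons.

```python
from collections import Counter
from fractions import Fraction

HALF = Fraction(1, 2)

def coupled(j_a, j_b):
    """All values |j_a - j_b|, ..., j_a + j_b in unit steps."""
    low, high = abs(j_a - j_b), j_a + j_b
    return [low + k for k in range(int(high - low) + 1)]

def j_values_ls(l1, l2):
    """Multiplicity of each J in the LS scheme (S = 0 and S = 1)."""
    counts = Counter()
    for L in coupled(l1, l2):
        for S in (0, 1):
            counts.update(coupled(L, S))
    return counts

def j_values_jj(l1, l2):
    """Multiplicity of each J in the j-j scheme (j = l +/- 1/2 for each electron)."""
    counts = Counter()
    # A set is used so that l = 0 yields the single value j = 1/2.
    for j1 in {abs(l1 - HALF), l1 + HALF}:
        for j2 in {abs(l2 - HALF), l2 + HALF}:
            counts.update(coupled(j1, j2))
    return counts

same_for_f_and_d = j_values_ls(3, 2) == j_values_jj(3, 2)
```

For l1 = 3 and l2 = 2 both schemes give the same multiplicity for every J, in line with Eqs. (5.73) and (5.74).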

Example 3
The spin-orbit interaction splits the levels of the LS coupling scheme into multiplets.
The multiplet structure of the first few observed lines in mercury is as follows:
The triplet levels are split into (6s) (np) 3P2,1,0, (6s) (nd) 3D3,2,1, etc.
whereas (6s) (ns) 3S1 has only one level. The allowed transitions are:
(6s) (6p) 1P1 → (6s) (6s) 1S0, λ = 1849.6 Å
(6s) (6p) 3P1 → (6s) (6s) 1S0, λ = 2536.5 Å
(6s) (7s) 1S0 → (6s) (6p) 1P1, λ = 10,139.7 Å
(6s) (7s) 3S1 → (6s) (6p) 3P0, λ = 4046.6 Å
(6s) (7s) 3S1 → (6s) (6p) 3P1, λ = 4358.4 Å
(6s) (7s) 3S1 → (6s) (6p) 3P2, λ = 5460.7 Å (5.77)
the other transitions between these multiplets being forbidden by the selection
rules, e.g. (6s) (6p) 3P0 → (6s) (6s) 1S0 is not allowed.

Example 4
The spin-orbit interaction breaks the degeneracy of a given LS level into levels
with different J values. It may be observed that the average of the L.S interaction,
summed over all the states of a given LS level, is zero, i.e.

$$\sum_{M_S} \sum_{M_L} \langle \mathbf{L} \cdot \mathbf{S} \rangle = 0 \tag{5.78}$$

(this follows from the fact that with a given orientation of S, for every term with
a given L, there is another term with –L). Now, the summation over the states
can equally well be carried over MJ and J, which implies that

$$\sum_{J} \sum_{M_J} \langle \mathbf{L} \cdot \mathbf{S} \rangle = 0 \tag{5.79}$$

For a given J, the expectation value is the same for all MJ values, so that this
relation is equivalent to

$$\sum_{J} (2J + 1) \langle \mathbf{L} \cdot \mathbf{S} \rangle = 0 \tag{5.80}$$

Since the spin-orbit interaction is proportional to L.S, Eq. (5.80) gives

$$\sum_{J} (2J + 1) (\Delta E)_{\text{spin-orbit}} = 0 \tag{5.81}$$

Therefore, the weighted average ELS can be used

$$E_{LS} = \frac{\sum_J (2J + 1) E_J}{\sum_J (2J + 1)} \tag{5.82}$$

to study the atomic energy levels without spin-orbit interaction.
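
The vanishing of the weighted sum can be verified directly from the eigenvalue of L.S in a state of given J. The sketch below assumes the standard result that this eigenvalue is [J(J + 1) – L(L + 1) – S(S + 1)]/2 in units of ħ^2; the function names are illustrative.

```python
from fractions import Fraction

def ls_expectation(L, S, J):
    """Eigenvalue of L.S in units of hbar^2 for given L, S, J."""
    return (J * (J + 1) - L * (L + 1) - S * (S + 1)) / 2

def weighted_ls_sum(L, S):
    """Sum over J of (2J + 1) <L.S>, as in Eq. (5.80)."""
    L, S = Fraction(L), Fraction(S)
    J = abs(L - S)
    total = Fraction(0)
    while J <= L + S:
        total += (2 * J + 1) * ls_expectation(L, S, J)
        J += 1
    return total

triplet_p_sum = weighted_ls_sum(1, 1)
```

Exact rational arithmetic is used so that the sum comes out as exactly zero rather than a small rounding residue.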

Example 5
The calculation of the energy levels of a many-electron atom is in general quite
difficult. For the helium atom, a perturbative estimation of the ground state
energy can be made.
The Hamiltonian for the helium atom is
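$$H = \frac{1}{2m_e}(p_1^2 + p_2^2) - \frac{Ze^2}{4\pi\varepsilon_0}\left(\frac{1}{r_1} + \frac{1}{r_2}\right) + \frac{e^2}{4\pi\varepsilon_0}\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} \tag{5.83}$$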
Taking the unperturbed Hamiltonian as

$$H_0 = \frac{1}{2m_e}(p_1^2 + p_2^2) - \frac{Z'e^2}{4\pi\varepsilon_0}\left(\frac{1}{r_1} + \frac{1}{r_2}\right) \tag{5.84}$$
with Z′ representing the screened charge of the nucleus, and the perturbation as

$$\lambda V = -\frac{(Z - Z')e^2}{4\pi\varepsilon_0}\left(\frac{1}{r_1} + \frac{1}{r_2}\right) + \frac{e^2}{4\pi\varepsilon_0}\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} \tag{5.85}$$
A good perturbative estimate of the energy can be obtained if the
perturbation λV is small. It is plausible to ‘optimize’ the smallness of the
perturbation by requiring that the expectation value of λV is zero,

$$\langle \lambda V \rangle = 0 \tag{5.86}$$
which will determine Z'.
The unperturbed ground-state wave function is
$$\psi_0(\mathbf{r}_1, \mathbf{r}_2) = \frac{{Z'}^3}{\pi a_1^3} \exp\left[-\frac{Z'}{a_1}(r_1 + r_2)\right] \tag{5.87}$$
and the ground state energy is
$$E_0 = -\frac{e^2 {Z'}^2}{4\pi\varepsilon_0 a_1} \tag{5.88}$$
The calculation of the expectation value of λV is straightforward, and gives
$$\langle \lambda V \rangle = -\frac{(Z - Z')Z'e^2}{2\pi\varepsilon_0 a_1} + \frac{5e^2 Z'}{32\pi\varepsilon_0 a_1} \tag{5.89}$$
From the condition in Eq. (5.86),
$$Z' = Z - \frac{5}{16} \tag{5.90}$$
which gives an estimation of the screening of the nuclear charge. With this
value for Z', the ground state energy is
$$E = -\frac{e^2}{4\pi\varepsilon_0 a_1}\left(Z - \frac{5}{16}\right)^2 \tag{5.91}$$
On subtracting this expression from the energy of the ionized state, the
ionization energy is
$$I = \frac{e^2}{4\pi\varepsilon_0 a_1}\left[\left(Z - \frac{5}{16}\right)^2 - \frac{1}{2}Z^2\right] \tag{5.92}$$
For the helium atom Z = 2, so that
I ≈ 23.1 eV (5.93)
which is in very good agreement with the experimental value of 24.6 eV. For the
singly ionized lithium, Li+, the ionization energy from Eq. (5.92) with Z = 3,
comes out to be 74.1 eV which may be compared with the experimental value
of 75.6 eV.
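
The numbers quoted for He and Li+ follow from the screened-charge estimate with a few lines of arithmetic. This is a minimal sketch; it assumes that e^2/(4πε0 a1) is 27.2114 eV (twice the hydrogen ground-state binding energy), and the function names are illustrative.

```python
HARTREE_EV = 27.2114  # e^2 / (4 pi eps0 a1) in eV, a1 being the Bohr radius (assumed value)

def screened_charge(Z):
    """Effective charge Z' = Z - 5/16 from the condition <lambda V> = 0."""
    return Z - 5.0 / 16.0

def ground_state_energy(Z):
    """Two-electron ground-state energy in eV with the screened charge."""
    return -HARTREE_EV * screened_charge(Z) ** 2

def ionization_energy(Z):
    """Energy of the one-electron ion minus the two-electron ground-state energy, in eV."""
    ionized = -0.5 * HARTREE_EV * Z ** 2
    return ionized - ground_state_energy(Z)

helium_ionization = ionization_energy(2)      # roughly 23.1 eV
lithium_ion_ionization = ionization_energy(3)  # roughly 74.1 eV
```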

Example 6
The highest-energy characteristic x-ray lines are obtained from U92. The energies
of K-lines are estimated from
$$E \approx 13.6\, Z^2\ \text{eV} \approx 115\ \text{keV} \tag{5.94}$$
with a corresponding wavelength of about 0.11 Å.

Example 7
A knowledge of the molecular dissociation energy enables the estimation of its
repulsive energy.
For a KCl molecule, the dissociation energy is 4.42 eV (E = – 8.76 eV), so
that from Eq. (5.59),

$$-8.76 = -3.80 - \frac{14.4}{r} + \frac{b}{r^n} \tag{5.95}$$
The equilibrium condition implies that at the equilibrium separation r0
$$\frac{14.4}{r_0^2} = \frac{bn}{r_0^{n+1}} \tag{5.96}$$
Substituting this in Eq. (5.95)
$$-4.96 = -\frac{14.4}{r_0}\left(1 - \frac{1}{n}\right) \tag{5.97}$$
From the information that r0 ≈ 2.79 Å, n ≈ 25. This is an overestimation and
suggests that the terms that have been neglected (such as the van der Waals
attraction) are important in the determination of n. The equilibrium value of r0
gives the result that the net repulsion is about 0.2 eV at r = 2.79 Å.
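
The arithmetic behind n and the net repulsion can be laid out as follows. This sketch assumes energies in eV and distances in Å, with 14.4 eV Å for the Coulomb constant as in the equations above; the function names are illustrative.

```python
def born_exponent(binding_ev, coulomb_ev_angstrom, r0_angstrom):
    """Solve binding = (k / r0) (1 - 1/n) for the repulsion exponent n."""
    coulomb = coulomb_ev_angstrom / r0_angstrom
    return 1.0 / (1.0 - binding_ev / coulomb)

def repulsive_energy(coulomb_ev_angstrom, r0_angstrom, n):
    """Repulsive term b / r0^n, which equals k / (n r0) at equilibrium."""
    return coulomb_ev_angstrom / (n * r0_angstrom)

n_kcl = born_exponent(4.96, 14.4, 2.79)
repulsion_kcl = repulsive_energy(14.4, 2.79, n_kcl)
```

With these inputs `n_kcl` comes out between 25 and 26 and `repulsion_kcl` close to 0.2 eV. The result is sensitive to r0, since n depends on the small difference between 1 and the ratio of the binding energy to the Coulomb energy.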

PROBLEMS

1. Show that the expectation value $\langle (\mathbf{r}_1 - \mathbf{r}_2)^2 \rangle$ is greater for ψ– than for ψ+,
where $\psi_\pm = \frac{1}{\sqrt{2}}[\psi_i(\mathbf{r}_1)\psi_j(\mathbf{r}_2) \pm \psi_i(\mathbf{r}_2)\psi_j(\mathbf{r}_1)]$, ψi and ψj being
orthogonal to each other. This would suggest that the two particles are
closer together in the symmetric states.

2. From the relation $\sum_m |Y_{lm}(\theta, \phi)|^2 = \frac{2l + 1}{4\pi}$, show that the charge density
of a closed shell is isotropic.
3. Show that the sum of the degeneracies for the (ns) (n' l) system is 4 (2l + 1)
and for the (np) (n' l) system it is 12 (2l + 1) (assume that the electrons
are inequivalent). What are the sums of the degeneracies for equivalent
electrons?
4. Show the energy levels of C in a diagram similar to Fig. 5.6, and indicate
the first few allowed transitions.
5. Discuss the energy levels in the j-j coupling scheme for two valence
electrons (nd) (n' d). What happens if n = n′?
6. What are the ground-state terms for elements from K to Zn?
7. The wavelengths corresponding to transitions (6s) (6d) 3D2 → (6s) (6p)
(3P1, 3P2) in mercury are 3125.66 Å and 3654.83 Å respectively. What is
the value of CLS in Eq. (5.25) for the L = 1 and S = 1 state?
8. Discuss the energy level diagram of the valence electron in the sodium
atom. If the 2P1/2 to 2S1/2 transition corresponds to a wavelength of 5895.923
Å, what is the minimum energy of the bombarding electrons required to
excite this Na line? (Assume that the Na atoms are in the ground state.)
9. Using experimental information (Table 5.3) about the energy levels of Ag,
determine the minimum potential required across the x-ray tube, to excite
the K lines and the L lines. What are the wavelengths of the Kα lines?
What are the frequencies of K and L absorption edges?
10. If the Kα1 radiation from silver is incident on a material, what is the largest
Z value of the material for which the K electrons can be ejected (use
Moseley’s law)? What is the kinetic energy of the ejected electron for
Cu?
11. Given that the K-absorption edge for lead is at 0.140 Å, and the minimum
voltage required for producing K lines in lead is 88.6 kV, determine the
ratio h/e.
12. For Cu, determine the kinetic energy of the Auger electron for the transition
in which two vacancies are created in the LI shell in filling up a K-shell
vacancy (some simplifying assumptions may be required.)
13. Assuming that Na+ and Cl– behave like hard balls of radii 1.0 Å and 1.8 Å
respectively (as far as repulsive forces are concerned), estimate the
dissociation energy for a NaCl molecule. The ionization potential of Na is
5.1 eV and the electron affinity for Cl is 3.8 eV.
14. For the HCl molecule, lines are found at ν/c equal to 2944, 2926, 2908,
2866, 2844, 2821 cm–1. Determine the force constant for the vibrational
motion, and the distance of separation for the ions.
15. The rule of equal spacing is not strictly valid for the vibrational-rotational
band. Calculate the change in the energy if the moment of inertia in the
two vibrational states is different, say I0 and I1. Estimate (I1 – I0)/I0 for
HCl for the states corresponding to the spectrum observed in Problem 14.
6
Interaction with External Fields

Structures of the Chapter


6.1 The Hamiltonian
6.2 Atoms in a magnetic field
6.3 Interaction with radiation
6.4 Spontaneous transitions
6.5 Lasers and masers
6.6 Applications of lasers
6.7 Some experimental methods
6.8 Examples
Problems

In Chapter 5, the structure and the energy levels of atoms and molecules were
discussed. Here, their interaction with external electromagnetic fields will be
considered. While the time independent fields allow the investigation and
modification of the energy levels to suit our convenience, it is the time dependent
fields which lead to transitions between the states. These effects are of great
importance not only in deducing atomic and molecular properties, but also in
devising useful practical applications such as the lasers and masers.

6.1 THE HAMILTONIAN


The Hamiltonian for the ith electron in the presence of external magnetic and
electric fields is given by

$$H(i) = \frac{1}{2m}\left[-i\hbar\nabla_i - q\mathbf{A}(\mathbf{r}_i, t)\right]^2 - \frac{q}{m}\,\mathbf{s}_i \cdot \mathbf{B} + V(\mathbf{r}_i, t) \tag{6.1}$$
where V/q and A are the electrostatic and electromagnetic potentials. For the
special case of the external magnetic and electric fields being constant,
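$$\mathbf{A}(\mathbf{r}_i, t) = -\frac{1}{2}\,\mathbf{r}_i \times \mathbf{B}, \qquad V(\mathbf{r}_i, t) = -q\,\mathbf{r}_i \cdot \mathbf{E} + V_{int} \tag{6.2}$$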
where B and E are the constant magnetic and electric fields, respectively, and
Vint is the potential due to the nucleus and the other electrons. Substitution of
these expressions in Eq. (6.1), after some simplification, leads to

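$$H(i) = -\frac{\hbar^2}{2m}\nabla_i^2 - \frac{q}{2m}(\mathbf{l}_i + 2\mathbf{s}_i) \cdot \mathbf{B} + \frac{q^2}{8m}(\mathbf{r}_i \times \mathbf{B})^2 - q\,\mathbf{r}_i \cdot \mathbf{E} + V_{int} \tag{6.3}$$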
where, except for the extremely strong magnetic fields (such as B ~ 10^9 G), the
quadratic term in B can be neglected. Therefore, the Hamiltonian for the atom
is given by
H = H0 + H1 + H2 + H3 + H′ (6.4)
where H0, H1 and H3, defined in Eq. (5.14), describe the atom in the absence of
the external fields, and with q = – e, e > 0,
$$H' = \frac{e}{2m}\sum_i (\mathbf{l}_i + 2\mathbf{s}_i) \cdot \mathbf{B} + e\sum_i \mathbf{r}_i \cdot \mathbf{E} \tag{6.5}$$

We first consider the case of E = 0, which is known as the Zeeman effect for
weak B field and as the Paschen-Back effect for strong B field. The interaction
with a constant external electric field leads to what is known as the Stark effect
and is discussed as an example in Sec. 6.8. The interaction with radiation is
important in transitions and is discussed in Sec. 6.3.
6.2 ATOMS IN A MAGNETIC FIELD


For atoms in a constant magnetic field, the atomic energies are perturbed by the
interaction
$$H' = \frac{e}{2m}(\mathbf{L} + 2\mathbf{S}) \cdot \mathbf{B} \tag{6.6}$$
where L and S are the total orbital and spin angular momenta respectively.
Without loss of generality, the magnetic field can be taken to be in the z-direction.
It is convenient to consider the effect of H′ given in Eq. (6.6) separately for a
weak magnetic field (Zeeman effect) and a strong magnetic field (Paschen-
Back effect). Most of our considerations will be for atoms with LS coupling in
their unperturbed states, though these considerations can easily be extended to
j-j coupling as well.
In obtaining the expression for the energy levels in the presence of an external
magnetic field, the following important result will be required:
Theorem. If J is a sum of two angular momentum operators L and S which
operate in different spaces,
J = L + S, [L, S] = 0 (6.7)
then

$$\int \psi^*_{J, M_J}\, \mathbf{S}\, \psi_{J, M'_J}\, d\tau = a \int \psi^*_{J, M_J}\, \mathbf{J}\, \psi_{J, M'_J}\, d\tau \tag{6.8}$$
where a is a constant, independent of MJ and MJ′.
This theorem (see Example 1 in Sec. 6.8 for the proof) essentially implies
that S is proportional to J within the subspace of states with a given value of J.
A similar result is also valid for L.

Zeeman Effect
For the weak magnetic field, H′ is regarded as a small perturbation. In the LS
coupling scheme, the states are characterized by the quantum numbers L, S, J
and MJ, so that the perturbation in energy is obtained by using Eq. (6.8) as
$$\Delta E = g\,\frac{e}{2m}\,\mathbf{B} \cdot \langle L, S, J, M_J | \mathbf{J} | L, S, J, M_J \rangle \tag{6.9}$$
where the notation indicates taking an average with respect to the states with
given L, S, J, and MJ ; and g is defined by the relation
L + 2S = g J (6.10)
in the subspace of states with a given J. The constant of proportionality g, is
determined by taking the scalar product of Eq. (6.10) with J, and calculating
the expectation value between the states with given L, S, J and MJ:
$$\langle L, S, J, M_J | (\mathbf{L} + 2\mathbf{S}) \cdot \mathbf{J} | L, S, J, M_J \rangle = g\, \langle L, S, J, M_J | \mathbf{J}^2 | L, S, J, M_J \rangle \tag{6.11}$$
which gives (using $2\mathbf{L} \cdot \mathbf{J} = \mathbf{L}^2 + \mathbf{J}^2 - \mathbf{S}^2$ and $2\mathbf{S} \cdot \mathbf{J} = \mathbf{S}^2 + \mathbf{J}^2 - \mathbf{L}^2$)
$$g = \frac{J(J + 1) + L(L + 1) - S(S + 1)}{2J(J + 1)} + \frac{J(J + 1) + S(S + 1) - L(L + 1)}{J(J + 1)} \tag{6.12}$$
This on simplification leads to
$$g = \frac{3}{2} + \frac{S(S + 1) - L(L + 1)}{2J(J + 1)} \tag{6.13}$$
The quantity g is called the Landé g-factor. The shift in the energy, due to
the magnetic field taken along the z-direction, is given by
$$\Delta E = g\,\frac{e\hbar}{2m}\,B\,M_J \tag{6.14}$$
which implies that the degenerate states with a given J split into 2J + 1 equidistant
levels. This is known as Zeeman effect, and is illustrated in Figs. (6.1) and
(6.2).
The Landé g-factor is often derived from what is called the vector model.
In this model, the L + 2S vector is supposed to precess rapidly around J so that
for the purpose of taking averages, only the component of L + 2S along J is
considered,
$$\langle \mathbf{L} + 2\mathbf{S} \rangle = \frac{(\mathbf{L} + 2\mathbf{S}) \cdot \mathbf{J}}{J^2}\,\mathbf{J} \tag{6.15}$$
which again leads to the expression in Eq. (6.11) with g given by Eq. (6.13).
The weak-field approximation is valid if the energy shift in Eq. (6.14) is
small compared to the fine-structure splitting. The energy shift for ordinary
fields is rather small, e.g. for B ≈ 10^4 G (i.e. 1 Wb/m^2), the energy splitting is of
the order of 0.58 × 10^–4 eV for g = 1.
The selection rules for transitions between the MJ multiplets of two levels,
are:
∆MJ = 0, ± 1 (6.16)
and the shift in the frequency of the radiation emitted is given by
$$\Delta\omega = \frac{eB}{2m}(gM_J - g'M'_J), \qquad M'_J - M_J = 0, \pm 1 \tag{6.17}$$
It is observed that the frequency shifts of the spectral lines have a simple
relation if the levels do not have fine structure, i.e. S = 0 (singlet states). In this
case, since ∆S = 0 for electric dipole transitions, one has J = L for which
g = g′ = 1, (S = S′ = 0) (6.18)
The shifts of the spectral lines in this case are
$$\Delta\omega = 0, \pm\frac{eB}{2m} \tag{6.19}$$
This is known as normal Zeeman effect, and results in each line splitting
into three lines symmetrically placed about the unshifted line, one of which is
the unshifted line (see Fig. 6.1). It may be noted that the shifts for ordinary
magnetic fields are quite small, ∆ω ~ 8 × 10^10 rad/s for B ~ 10^4 G, compared to
ω ~ 3 × 10^15 rad/s for visible light.
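
The orders of magnitude quoted for a field of 10^4 G (1 T) can be checked numerically. The sketch below uses SI values of e, m and ħ that are supplied here as assumptions (they are not given in the text), and the function names are illustrative.

```python
E_CHARGE = 1.602176634e-19   # C (assumed SI value)
E_MASS = 9.1093837015e-31    # kg (assumed SI value)
HBAR = 1.054571817e-34       # J s (assumed SI value)

def zeeman_energy_ev(B_tesla, g=1.0, m_j=1.0):
    """Energy shift g (e hbar / 2m) B M_J of Eq. (6.14), converted to eV."""
    shift_joule = g * E_CHARGE * HBAR * B_tesla * m_j / (2.0 * E_MASS)
    return shift_joule / E_CHARGE

def normal_zeeman_shift(B_tesla):
    """Angular-frequency shift eB / 2m of Eq. (6.19), in rad/s."""
    return E_CHARGE * B_tesla / (2.0 * E_MASS)

energy_at_1T = zeeman_energy_ev(1.0)        # about 0.58e-4 eV
frequency_at_1T = normal_zeeman_shift(1.0)  # about 8.8e10 rad/s
```

Compared with ω ~ 3 × 10^15 rad/s for visible light, the fractional shift is of the order of 3 × 10^–5, which is why a high-resolution spectrometer is needed to resolve the components.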

Fig. 6.1 (a) The energy levels of 1P1 (MJ = 1, 0, –1) and 1D2 (MJ = 2, 1, 0, –1, –2)
states as a function of the magnetic field, and (b) the splitting of energy levels
into three components, at ω0 – ∆ω, ω0 and ω0 + ∆ω, illustrating normal Zeeman effect.
If the states have fine structure arising from spin-orbit interaction, the spectral
lines break into more than three components, and the frequency shifts are given
by rational fractions of the normal Zeeman shift,
$$\Delta\omega = \frac{p}{q}\,\Delta\omega_0, \qquad \Delta\omega_0 = \frac{eB}{2m} \tag{6.20}$$
where p and q are integers. This case is known as anomalous Zeeman effect. As
a specific example, consider the splitting of the alkali doublet lines which
correspond to the transitions 2P1/2 → 2S1/2 and 2P3/2 → 2S1/2. The Landé
g-factors for these states are obtained from Eq. (6.13) as
2S1/2 : g = 2
2P1/2 : g = 2/3
2P3/2 : g = 4/3 (6.21)
Substituting the values in Eq. (6.17) gives
∆ω = (± 2/3, ± 4/3) ∆ω0 for 2P1/2 → 2S1/2 (6.22)
and ∆ω = (± 1/3, ± 1, ± 5/3) ∆ω0 for 2P3/2 → 2S1/2 (6.23)
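
The g-factors and the fractional shifts listed above can be generated mechanically from Eqs. (6.13), (6.16) and (6.17). The following is a minimal sketch with illustrative function names; it assumes J > 0 for both levels, since Eq. (6.13) is undefined for J = 0.

```python
from fractions import Fraction

def lande_g(L, S, J):
    """Lande g-factor of Eq. (6.13); requires J > 0."""
    L, S, J = Fraction(L), Fraction(S), Fraction(J)
    return Fraction(3, 2) + (S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1))

def m_values(J):
    """Allowed M_J values -J, -J + 1, ..., J."""
    J = Fraction(J)
    return [-J + k for k in range(int(2 * J) + 1)]

def zeeman_shifts(upper, lower):
    """Line shifts in units of eB/2m for levels given as (L, S, J) tuples."""
    g_up, g_low = lande_g(*upper), lande_g(*lower)
    shifts = set()
    for m_up in m_values(upper[2]):
        for m_low in m_values(lower[2]):
            if abs(m_up - m_low) <= 1:  # selection rule of Eq. (6.16)
                shifts.add(g_up * m_up - g_low * m_low)
    return sorted(shifts)

HALF = Fraction(1, 2)
d1_shifts = zeeman_shifts((1, HALF, HALF), (0, HALF, HALF))      # 2P1/2 -> 2S1/2
d2_shifts = zeeman_shifts((1, HALF, 3 * HALF), (0, HALF, HALF))  # 2P3/2 -> 2S1/2
```

The two lists contain four and six components respectively, matching Eqs. (6.22) and (6.23). For singlet levels the same function returns only –1, 0 and +1, which is the normal Zeeman pattern.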
The energy levels as a function of the field B, the allowed transitions, and
the splitting of the spectral lines, are shown in Fig. 6.2. It is worth noticing that
some of the fine structure lines cross each other for (eħB/m) ~ ∆E. At these
values of B, there is a mixing of states which, under some circumstances, causes
a sharp change in the intensity of radiation emitted. This phenomenon is known
as Hanle effect (particularly, the case of crossover at B = 0), and has been used
to determine the constants involved in the fine structure multiplets. Of course,
for eħB/m ~ ∆E, the perturbative analysis is not strictly valid (H′ ~ H2), and a
more complicated, nonperturbative analysis has to be carried out.

Fig. 6.2 (a) The energy levels of 2S1/2, 2P1/2 and 2P3/2 states in the presence of a
magnetic field, in units of the fine structure splitting ∆E, plotted as (E – E0)/∆E
against eħB/(m∆E); the strong-field levels are labelled by (ML, MS). (b) Splitting of
the spectral lines for transitions from 2P1/2 and 2P3/2 states to 2S1/2
states at eħB/m = 0.5 (∆E).

Paschen-Back Effect
The case in which the magnetic field is so strong that the splitting of the energy
levels due to the magnetic field is larger than the fine structure separation is
known as the Paschen-Back effect. In this case, H′ in Eq. (6.6) is treated as the
main term and H2 representing the spin-orbit interaction as a small perturbation,
which makes the calculations relatively simple.
In the absence of spin-orbit interaction, the unperturbed states may be
specified by the quantum numbers L, S, ML and MS. Taking the magnetic field
along the z-direction, the energy shift due to the magnetic field is given by
Eq. (6.6) as:
$$\Delta E' = \frac{e\hbar}{2m}(M_L + 2M_S)\,B \tag{6.24}$$
and the line-splitting is an integral multiple of ∆ω0. The selection rules for
electric dipole transitions in this case are:
∆ML = 0, ± 1, ∆MS = 0 (6.25)
so that we essentially get back the normal Zeeman shifts
$$\Delta\omega = 0, \pm\frac{eB}{2m} \tag{6.26}$$
This is expected since the only role played by spin here is to change the
energy levels by an amount eħMSB/m which, in view of the selection rules in
Eq. (6.25), does not affect the frequency associated with the transitions.
The effect of spin-orbit interaction can be included perturbatively by using
the theorem stated earlier but applied to li whose sum is L, and also to si whose
sum is S. The expectation value of the spin-orbit interaction can then be written
as
〈L, S, ML, MS | H2 | L, S, ML, MS〉
= a 〈L, S, ML, MS, | L.S | L, S, ML, MS〉 (6.27)
where a is independent of ML and MS. Since LxSx + LySy [which can be written as
(1/2)(L+S– + L–S+)] changes the values of ML and MS, the only term which
contributes in Eq. (6.27) is the LzSz term. Therefore, the total energy shift is
given by
$$\Delta E = \frac{e\hbar}{2m}(M_L + 2M_S)\,B + a\hbar^2 M_L M_S \tag{6.28}$$
Consider the effect of this term on the 2P levels shown in Fig. 6.2(a). The
allowed values for ML are 1, 0, – 1 and for 2MS, 1, – 1, so that ML + 2MS can take
on the values 2, 1, 0, – 1, – 2. Thus, the 2P levels are split into five equidistant
levels by the first term in Eq. (6.28) [the lines in Fig. 6.2(a) for B → ∞]. This
equidistance is removed by the second term which, for these levels, has a value
of aħ^2/2, 0, – aħ^2/2, 0, aħ^2/2. The shifts in the transition frequencies are now given
by
$$\Delta\omega = (\Delta\omega_0 + a\hbar M_S)\,\Delta M_L \tag{6.29}$$
For the 2P → 2S transitions, the frequency shifts for MS = 1/2 are slightly
larger in magnitude (a is generally positive) than those for MS = – 1/2. Thus, we
get five lines, two doublets i.e., for ∆ML = ± 1, MS = ± 1/2, and a singlet which
is the unshifted line corresponding to ∆ML = 0.
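
The strong-field level pattern of the 2P term can be tabulated by enumerating (ML, MS). The sketch below is illustrative only: the magnetic term is expressed in units of eħB/2m and the spin-orbit term in units of aħ^2, and the function names are assumptions.

```python
from fractions import Fraction

def paschen_back_levels(L, S):
    """Map (M_L, M_S) to (M_L + 2 M_S, M_L M_S), the two terms of Eq. (6.28)."""
    L, S = Fraction(L), Fraction(S)
    levels = {}
    for i in range(int(2 * L) + 1):
        for k in range(int(2 * S) + 1):
            m_l, m_s = -L + i, -S + k
            levels[(m_l, m_s)] = (m_l + 2 * m_s, m_l * m_s)
    return levels

def spin_orbit_by_field_level(L, S):
    """Group the spin-orbit corrections by the value of M_L + 2 M_S."""
    grouped = {}
    for magnetic, spin_orbit in paschen_back_levels(L, S).values():
        grouped.setdefault(magnetic, set()).add(spin_orbit)
    return grouped

doublet_p = spin_orbit_by_field_level(1, Fraction(1, 2))
```

For the 2P term the two states with ML + 2MS = 0 share the same spin-orbit correction, so five distinct levels remain.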

j-j Coupling
In the analysis so far, it has been assumed that LS coupling is valid. The theory
can easily be modified to apply to heavy atoms where j-j coupling is dominant.
The effect of the magnetic field on two electrons with j-j coupling is briefly
discussed here.
In j-j coupling, the states are characterized by j1, j2, J and MJ. The energy
shift due to the interaction of the two electrons with a weak external magnetic
field is given by
$$\Delta E = \frac{e}{2m}\,\langle j_1, j_2, J, M_J | \mathbf{l}_1 + 2\mathbf{s}_1 + \mathbf{l}_2 + 2\mathbf{s}_2 | j_1, j_2, J, M_J \rangle \cdot \mathbf{B} \tag{6.30}$$
Using the results of the theorem in Eq. (6.8), we can write
$$\langle j_1 | \mathbf{l}_1 | j_1 \rangle = a_1 \langle j_1 | \mathbf{j}_1 | j_1 \rangle, \qquad \langle j_1 | \mathbf{s}_1 | j_1 \rangle = b_1 \langle j_1 | \mathbf{j}_1 | j_1 \rangle \tag{6.31}$$
and similar relations for l2 and s2. The constants ai and bi are determined by
following steps similar to those leading to Eq. (6.13), giving
$$a_i = \frac{1}{2} + \frac{l_i(l_i + 1) - s_i(s_i + 1)}{2j_i(j_i + 1)}, \qquad b_i = \frac{1}{2} + \frac{s_i(s_i + 1) - l_i(l_i + 1)}{2j_i(j_i + 1)} \tag{6.32}$$
where i = 1, 2. Then one gets
$$\Delta E = \frac{e}{2m}\,\langle j_1, j_2, J, M_J | (a_1 + 2b_1)\,\mathbf{j}_1 + (a_2 + 2b_2)\,\mathbf{j}_2 | j_1, j_2, J, M_J \rangle \cdot \mathbf{B} \tag{6.33}$$
Again, applying the theorem in Eq. (6.8) to ji whose sum is J, gives
〈J | j1, 2 | J〉 = A1, 2 〈J | J | J〉 (6.34)
with
$$A_1 = \frac{1}{2} + \frac{j_1(j_1 + 1) - j_2(j_2 + 1)}{2J(J + 1)}, \qquad A_2 = \frac{1}{2} + \frac{j_2(j_2 + 1) - j_1(j_1 + 1)}{2J(J + 1)} \tag{6.35}$$
Finally, one gets
$$\Delta E = \frac{e\hbar B}{2m}\,M_J\,[A_1(a_1 + 2b_1) + A_2(a_2 + 2b_2)] \tag{6.36}$$
which again results in (2J + 1) equidistant energy levels.
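
The bracket in Eq. (6.36) plays the role of an effective g-factor for the j-j coupled level. A minimal sketch of its evaluation follows, assuming spin 1/2 for each electron and J > 0; the function names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def single_electron_g(l, j, s=HALF):
    """a_i + 2 b_i, with a_i and b_i as in Eq. (6.32)."""
    l, j, s = Fraction(l), Fraction(j), Fraction(s)
    a = HALF + (l * (l + 1) - s * (s + 1)) / (2 * j * (j + 1))
    b = HALF + (s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))
    return a + 2 * b

def jj_g_factor(l1, j1, l2, j2, J):
    """A_1 (a_1 + 2 b_1) + A_2 (a_2 + 2 b_2), with A_1 and A_2 as in Eq. (6.35)."""
    j1, j2, J = Fraction(j1), Fraction(j2), Fraction(J)
    A1 = HALF + (j1 * (j1 + 1) - j2 * (j2 + 1)) / (2 * J * (J + 1))
    A2 = HALF + (j2 * (j2 + 1) - j1 * (j1 + 1)) / (2 * J * (J + 1))
    return A1 * single_electron_g(l1, j1) + A2 * single_electron_g(l2, j2)

# (s1/2)(p3/2) configuration coupled to J = 2.
g_sp_j2 = jj_g_factor(0, HALF, 1, 3 * HALF, 2)
```

For a single electron, `single_electron_g` reduces to the Landé factor of Eq. (6.13) with L = l, S = 1/2 and J = j. For the stretched state J = 2 of an (s)(p) pair, the result 3/2 coincides with the LS value for 3P2, as expected since that state is the same in both schemes.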
In the strong field case, the unperturbed states are characterized by the
quantum numbers j1, mj1 and j2, mj2 so that the energy shifts can be obtained
from the relations in Eq. (6.31) as
$$\Delta E = \frac{e\hbar B}{2m}\,[(a_1 + 2b_1)\,m_{j_1} + (a_2 + 2b_2)\,m_{j_2}] \tag{6.37}$$
To this, the contribution of the spin-orbit interaction can be added, which
also can be estimated by using arguments similar to those used in the discussion
for the LS coupling. It gives a contribution proportional to mj1 mj2.
In all discussions so far, only the effect of the linear term in B in Eq. (6.3)
has been considered. The quadratic term becomes important for atoms for which
the magnetic dipole moment is zero, e.g., He, Ne, etc. which have L = S = 0.
The magnetic properties of these materials such as magnetic susceptibility, are
determined by the quadratic term. The quadratic term is also important in
astrophysics where enormously large magnetic fields are encountered (in pulsars
and neutron stars) and in some solid state problems. The energy shift due to the
quadratic term is known as the quadratic Zeeman effect.

6.3 INTERACTION WITH RADIATION


The properties of atoms and molecules and their interactions are observed when
there are changes in their states. Their interaction with radiation is the most
important mechanism through which these changes take place. Quantum
mechanics provides a very satisfactory description of the interaction of matter
with radiation. Here, an elementary treatment of emission and absorption of
radiation is presented.
Consider radiation of angular frequency ω, incident on a particle with charge q.
The wavelength of radiation normally encountered is of the order of 10^3 Å
which is quite large compared with atomic sizes (~ 1 Å). Therefore, the electric
field can be regarded as being constant over atomic distances (this is called the
electric dipole approximation). Such an electric field is provided by the scalar
potential – Ez cos ωt, where it is assumed that the electric field is in the
z-direction and has an amplitude E. The interaction of the charged particle with
this scalar potential leads to the potential energy term
V = – q Ez cos ωt (6.38)
The effect of this interaction on a bound state can be treated perturbatively.
Let the particle be in a bound eigenstate φ0 of the Hamiltonian H0, before
the radiation is incident on it. After the radiation is introduced at t = 0, the
particle can undergo a transition to any of the other eigenstates φn where
H0 φn = Enφn (6.39)
which satisfy the orthonormality conditions

$$\int \phi_m^*\, \phi_n\, d\tau = \begin{cases} 0 & \text{for } m \neq n \\ 1 & \text{for } m = n \end{cases} \tag{6.40}$$
The state of the particle can then be represented by

$$\phi = \sum_n a_n(t) \exp(-iE_n t/\hbar)\, \phi_n \tag{6.41}$$
where $|a_n(t)|^2$ represents the probability of finding the particle in state φn at
time t, with the boundary conditions
a0(0) = 1, an (0) = 0 for n ≠ 0 (6.42)
Substituting the expression in the Schrödinger equation,
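$$(H_0 - qEz \cos\omega t)\,\phi = i\hbar\,\frac{\partial\phi}{\partial t} \tag{6.43}$$
gives
$$i\hbar \sum_n \frac{\partial a_n(t)}{\partial t} \exp(-i\omega_n t)\, \phi_n = -qEz\,(\cos\omega t)\,\phi, \qquad \omega_n = E_n/\hbar \tag{6.44}$$
Multiplying both sides by φm* and integrating, and using the orthonormality
conditions gives
$$i\hbar\,\frac{\partial a_m(t)}{\partial t} = -qE \exp(i\omega_m t)\,(\cos\omega t) \int \phi_m^*\, z\, \phi\, d\tau \tag{6.45}$$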
If the incident radiation is not very strong, we can approximate φ in Eq.
(6.45) by its unperturbed expression, i.e., φ ≈ exp (– i ω0t) φ0, and get as a first
order approximation

$$a_m(t) = \frac{iqE}{\hbar}\, z_{m0} \int_0^t \exp[i(\omega_m - \omega_0)t'] \cos(\omega t')\, dt', \qquad m \neq 0 \tag{6.46}$$
where we have used the boundary conditions in Eq. (6.42), and
$$z_{m0} = \int \phi_m^*\, z\, \phi_0\, d\tau \tag{6.47}$$

Carrying out the integration over t leads to
$$a_m(t) = \frac{qE}{2\hbar}\, z_{m0} \left[ \frac{\exp[i(\omega_m - \omega_0 + \omega)t] - 1}{\omega_m - \omega_0 + \omega} + \frac{\exp[i(\omega_m - \omega_0 - \omega)t] - 1}{\omega_m - \omega_0 - \omega} \right] \tag{6.48}$$
For sufficiently large t [i.e., t >> 1/| ωm – ω0 |], the magnitude of the coefficient
is appreciable only for ω ≈ ± (ω0 – ωm). The first case, namely
$$\hbar\omega \approx E_0 - E_m \tag{6.49}$$
corresponds to emission of radiation of energy ħω, while the second case,
$$\hbar\omega \approx E_m - E_0 \tag{6.50}$$
corresponds to absorption of radiation of energy ħω. In both the cases, the
probability of finding the particle in state φm at t, is
$$|a_m(t)|^2 \approx \frac{q^2 E^2}{\hbar^2}\, |z_{m0}|^2\, \frac{\sin^2[(\omega_{m0} - \omega)t/2]}{(\omega_{m0} - \omega)^2} \tag{6.51}$$
where ωm0 = | ωm – ω0 | . Since | zm0 | = | z0m |, it also follows that if the particle is
originally in state m, the probability of finding the particle in state φ0 at t, | a0(t) |2,
is given by the same expression as in Eq. (6.51). Thus, the external field E induces
or stimulates transitions 0 → m and transitions m → 0 with the same probability.
This important conclusion may be stated as an equality
Pn → m (E) = Pm → n (E) (6.52)
for induced transition probabilities. Another important fact to be noted is that
transitions are allowed between states which do not conserve energy, i.e.,
| Em – E0 | ≠ ħω. However, for t >> 1/ωm0, the probability |am (t) |^2 is significant
only for | ωm0 – ω | ~ 1/t (for t → ∞ it is proportional to t^2), and rapidly goes to
small values as | ωm0 – ω | becomes larger than 1/t. Thus energy conservation is
applicable to the extent of the uncertainty
(∆E) (t) ~ ħ (6.53)
which is a manifestation of the uncertainty relation discussed in Eq. (3.64). For
t → ∞, the uncertainty (ωm0 – ω) tends to zero which essentially restores the
energy conservation relation | Em – E0 | = ħω.
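
The behaviour of the first-order probability as a function of detuning can be explored numerically. The sketch below is illustrative: `coupling` stands for qE|zm0|/ħ and is set to 1 by default, the detuning is ωm0 – ω, and the integration range and step count are arbitrary choices made here.

```python
import math

def transition_probability(detuning, t, coupling=1.0):
    """First-order probability coupling^2 sin^2(detuning t / 2) / detuning^2."""
    if detuning == 0.0:
        # Resonance limit: sin(x t / 2) / x tends to t / 2.
        return (coupling * t / 2.0) ** 2
    return (coupling * math.sin(detuning * t / 2.0) / detuning) ** 2

def integrated_lineshape(t, half_width=2000.0, steps=400_000):
    """Midpoint-rule estimate of the integral of sin^2(x t / 2) / x^2 over x."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += transition_probability(x, t) * h
    return total

area_at_unit_time = integrated_lineshape(1.0)  # close to pi / 2
```

The peak height grows as t^2 while the width shrinks as 1/t, so the area under the curve grows linearly with t. The first zero of the probability lies at a detuning of 2π/t, which is the quantitative content of the statement that the probability is significant only for | ωm0 – ω | ~ 1/t.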
In the analysis so far, it has been assumed that the incident radiation has
only one frequency. In reality, it has a frequency distribution. The changes
required by this situation can be introduced by replacing the energy flux
$\frac{1}{2}\varepsilon_0 c E^2(\omega)$ by
$$\frac{1}{2}\varepsilon_0 c E^2(\omega) \rightarrow \int \frac{dI}{d\omega}\, d\omega \tag{6.54}$$

where dI/dω is the flux density per unit frequency. This gives

$$|a_m(t)|^2 = \frac{2q^2}{\varepsilon_0 c \hbar^2} \int |z_{m0}|^2\, \frac{\sin^2[(\omega_{m0} - \omega)t/2]}{(\omega_{m0} - \omega)^2}\, \frac{dI}{d\omega}\, d\omega \tag{6.55}$$

For large t, the integrand being sharply peaked at ω = ωm0, dI/dω can be
evaluated at ω = ωm0, and the remaining integration can be carried out to give

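$$|a_m(t)|^2 = \frac{q^2 \pi t}{\varepsilon_0 c \hbar^2}\, |z_{m0}|^2 \left. \frac{dI}{d\omega} \right|_{\omega = \omega_{m0}}, \qquad \left[ \int_{-\infty}^{\infty} \frac{\sin^2 ax}{x^2}\, dx = \pi a \right] \tag{6.56}$$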

The transition probability per unit time is | am (t) |2/t:


$$W_{0 \to m} = \frac{q^2 \pi}{\varepsilon_0 \hbar^2}\, |(\mathbf{n} \cdot \mathbf{r})_{m0}|^2\, u(\omega_{m0}) \tag{6.57}$$
where cu (ω) = dI/dω and z has been replaced by the more general quantity n .
r (n is a unit vector in the direction of the field). This expression is valid for
absorption and emission induced or stimulated by the external field, and describes
what are known as electric dipole transitions.
The important point to be noted in the transition probability, is that the
transition takes place only for the frequency ω = | Em – E0 |/ħ; Em > E0 for absorption
and Em < E0 for induced emission. The transition probability increases as the
flux density dI/dω increases. Finally, it should be mentioned that the
approximation of the electromagnetic field being constant in space is not valid
if (n . r)m0 is zero (known as forbidden transitions). In this case, a more general
approach, which takes into account the space-dependence of the electric field,
and also the associated magnetic field, does allow the transitions but with reduced
rates. They are known as electric quadrupole transitions, magnetic dipole
transitions, etc.

6.4 SPONTANEOUS TRANSITIONS


If atomic transitions were induced only by external fields [according to
Eq. (6.57)], an excited, isolated atom would remain in the excited state for an
indefinitely long period. However, excited atoms are found to undergo transition
to a lower-lying state even in the absence of external fields. Einstein (1915)
showed on the basis of thermodynamic arguments, that in addition to the induced
or stimulated emissions, spontaneous emission of radiation by an excited state
must also be present and deduced the relations between the different transitions.
Let Pmn be the probability for stimulated transition from state n to state m. It
is reasonable to assume (see Sec. 6.3) that stimulated transitions are proportional
to the density of radiation u (ωnm). The probabilities for stimulated transitions
between states m and n, can then be written as
Pmn = Bmn u (ωnm) (6.58)
Pnm = Bnm u (ωnm) (6.59)
In addition, let Anm be the probability for spontaneous transitions from the
higher state m to the lower state n (Em > En). The number of transitions Nji from
state i to state j is given by the product of the number of atoms in state i and the
total probability for the transition. Therefore Nnm and Nmn are given by
Nnm = Nm [Bnmu (ωnm) + Anm]
Nmn = Nn Bmn u (ωnm) (6.60)
The constants A and B are known as the Einstein coefficients. When the
system is in thermal equilibrium
Nnm = Nmn (6.61)
and the Boltzmann distribution ratio is
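$$\frac{N_n}{N_m} = \exp[-(E_n - E_m)/kT] = \exp(\hbar\omega_{nm}/kT) \tag{6.62}$$
Using these conditions in Eq. (6.60),
$$B_{nm}\, u(\omega_{nm}) + A_{nm} = \exp(\hbar\omega_{nm}/kT)\, B_{mn}\, u(\omega_{nm}) \tag{6.63}$$
or
$$u(\omega_{nm}) = \frac{A_{nm}}{B_{mn} \exp(\hbar\omega_{nm}/kT) - B_{nm}} \tag{6.64}$$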
This expression can be compared with Planck’s law for blackbody radiation
[see Eq. (2.13)]
$$u(\omega_{nm}) = \frac{\hbar\omega^3/\pi^2 c^3}{\exp(\hbar\omega/kT) - 1} \tag{6.65}$$
One therefore deduces that
Bmn = Bnm (6.66)
in conformity with the earlier conclusion in Eq. (6.52), and
$$A_{nm} = \frac{\hbar\omega^3}{\pi^2 c^3}\, B_{nm} \tag{6.67}$$
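The relations (6.64) to (6.67) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes CODATA values for the constants, sets B_mn = B_nm to an arbitrary value of 1, and uses illustrative frequencies (an optical line at 5000 Å and a 2.387 × 10¹⁰ Hz microwave line) at an assumed temperature of 300 K. The function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
C = 2.99792458e8         # speed of light, m/s


def planck_u(omega, temperature):
    """Planck energy density per unit angular frequency, Eq. (6.65)."""
    x = HBAR * omega / (K_B * temperature)
    return (HBAR * omega**3 / (math.pi**2 * C**3)) / math.expm1(x)


def u_from_coefficients(omega, temperature, b_coeff=1.0):
    """Eq. (6.64) with B_mn = B_nm = b_coeff and A_nm taken from Eq. (6.67)."""
    a_coeff = HBAR * omega**3 / (math.pi**2 * C**3) * b_coeff
    x = HBAR * omega / (K_B * temperature)
    return a_coeff / (b_coeff * math.exp(x) - b_coeff)


def spontaneous_to_stimulated(omega, temperature):
    """Ratio A_nm / (B_nm u) = exp(hbar*omega/kT) - 1 in thermal equilibrium."""
    return math.expm1(HBAR * omega / (K_B * temperature))


omega_optical = 2 * math.pi * C / 5000e-10   # assumed optical line, 5000 angstrom
omega_microwave = 2 * math.pi * 2.387e10     # assumed microwave line
ratio_optical = spontaneous_to_stimulated(omega_optical, 300.0)
ratio_microwave = spontaneous_to_stimulated(omega_microwave, 300.0)
print(f"optical:   A/(B u) = {ratio_optical:.2e}")
print(f"microwave: A/(B u) = {ratio_microwave:.2e}")
```

Under these assumptions the ratio is enormous in the optical range and much smaller than one in the microwave range, which suggests why spontaneous emission dominates for visible light in equilibrium while stimulated processes dominate for microwaves.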
The quantity Bnm can be obtained by comparing Eqs. (6.58) and (6.59) with
Eq. (6.57) (Wn → m = Pmn). It is to be noted that the u(ωnm) deduced corresponds
to isotropic blackbody radiation, so that the expression in Eq. (6.57) should be
averaged over different directions, which essentially leads to the replacement:
$$|(\mathbf{n}\cdot\mathbf{r})_{nm}|^2 \rightarrow \frac{1}{3}\,|(\mathbf{r})_{nm}|^2 \tag{6.68}$$
Equating Wm → n (with the above replacement) and Pnm then yields
$$B_{nm} = \frac{\pi q^2}{3\varepsilon_0 \hbar^2}\,|(\mathbf{r})_{nm}|^2 \tag{6.69}$$
so that
$$A_{nm} = \frac{2 q^2 \omega^3}{3\varepsilon_0 h c^3}\,|(\mathbf{r})_{nm}|^2 \tag{6.70}$$
This is the expression for the probability of spontaneous electric dipole
transitions.
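As an illustration of Eq. (6.70), the sketch below estimates the spontaneous transition probability for the 2p → 1s transition of hydrogen. Two inputs are assumptions taken from standard tables rather than from this section: the Lyman-alpha wavelength of 1215.67 Å and the squared matrix element |(r)nm|² = (2¹⁵/3¹⁰) a0², where a0 is the Bohr radius. The function name `einstein_a` is hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
EPS0 = 8.8541878128e-12       # vacuum permittivity, F/m
H_PLANCK = 6.62607015e-34     # Planck constant, J s
C = 2.99792458e8              # speed of light, m/s
A_BOHR = 5.29177210903e-11    # Bohr radius, m


def einstein_a(omega, r_squared, charge=E_CHARGE):
    """Spontaneous electric dipole transition probability, Eq. (6.70)."""
    return 2 * charge**2 * omega**3 * r_squared / (3 * EPS0 * H_PLANCK * C**3)


wavelength = 1215.67e-10                      # assumed Lyman-alpha wavelength, m
omega = 2 * math.pi * C / wavelength
r_squared = (2**15 / 3**10) * A_BOHR**2       # assumed |(r)_nm|^2 for 2p -> 1s
a_2p_1s = einstein_a(omega, r_squared)
lifetime = 1.0 / a_2p_1s
print(f"A = {a_2p_1s:.3e} per second, lifetime = {lifetime:.2e} s")
```

With these inputs the estimate comes out at a few times 10⁸ s⁻¹, i.e., a lifetime of the order of 10⁻⁹ s, the same order as typical electric dipole lifetimes.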

Selection Rules
It is observed that the induced and the spontaneous transition probabilities
[Eqs. (6.57) and (6.70)] depend on the same matrix element, (r)nm. Therefore
these electric dipole transitions are allowed only if this matrix element is nonzero.
This imposes certain conditions on the allowed transitions. In particular, it is to
be noted that since r is odd under parity transformation (i.e., r → – r) the
product of ψ*n and ψm also should be odd. If these are single-particle, angular
momentum states [see Eq. (3.153)] with orbital angular momentum quantum
numbers ln and lm, the parity of the product of these states is (−1)^(ln + lm), so that
(lm + ln) and therefore (ln – lm) are odd. In addition, the angular dependence of r
is of the form Y1m (θ, φ), from which it can be shown that | ln – lm | = 1 and
| jm – jn | = 1, 0, jm and jn being the total angular momentum quantum numbers.
Thus the allowed electric dipole transitions satisfy selection rules:
∆l = ± 1, ∆j = ± 1, 0, ∆s = 0 (6.71)
where the ∆s = 0 result follows from the fact that spin is unchanged in the
transitions. More detailed arguments also show that
∆mj = ± 1, 0; j = 0 ↛ j = 0 (6.72)

Lifetimes and Linewidths


The existence of a finite probability for the spontaneous transition from an
excited state to a lower state means that the excited state has a finite lifetime.
The finite lifetime gives rise to an uncertainty in the energy of the state,
∆E ~ ħ/τ (6.73)
This uncertainty is reflected in the emitted radiation having a spread in the
distribution of its frequency and leads to the observed width of the spectral
lines called the natural linewidth,
∆ω = 1/τ (6.74)
The average lifetime τ and therefore the linewidth are related to the transition
probability.
If A is the transition probability, the number of particles (– dN) which undergo
transition in time dt is [see Eq. (1.80)]
dN = – AN (t) dt (6.75)
which on integration gives
$$N(t) = N(0)\, e^{-At} \tag{6.76}$$
Now, the average lifetime of the particles is
$$\tau = \sum_i t_i\, \frac{N(t_i) - N(t_i + \Delta t)}{N(0)} = A \int_0^\infty t\, e^{-At}\, dt = \frac{1}{A} \tag{6.77}$$
so that the linewidth in Eq. (6.74) is equal to the transition probability A. For
most atomic systems which admit electric dipole transitions, τ ~ 10⁻⁸ s. This
gives rise to a spread of ∆ω ~ 10⁸ s⁻¹. For λ ~ 5000 Å, the corresponding spread
in the wavelength is
∆λ ~ 10⁻⁴ Å (6.78)
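The estimate in Eq. (6.78) can be reproduced with a short calculation. This is a sketch under stated assumptions: the integral of Eq. (6.77) is evaluated with a simple midpoint rule truncated at 50 lifetimes, and the wavelength spread is obtained from ω = 2πc/λ, which gives |∆λ| = λ² ∆ω/(2πc). The function names are hypothetical.

```python
import math

C = 2.99792458e8  # speed of light, m/s


def mean_lifetime(rate, t_max_factor=50.0, steps=200000):
    """Midpoint-rule estimate of A * integral of t exp(-A t) dt, Eq. (6.77)."""
    dt = t_max_factor / rate / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += t * math.exp(-rate * t) * dt
    return rate * total


def wavelength_spread(wavelength, delta_omega):
    """|d lambda| = lambda^2 d omega / (2 pi c), from omega = 2 pi c / lambda."""
    return wavelength**2 * delta_omega / (2 * math.pi * C)


rate = 1.0e8                          # transition probability A, per second
tau = mean_lifetime(rate)             # should reproduce 1/A
delta_lambda = wavelength_spread(5000e-10, 1.0 / tau)
print(f"tau = {tau:.3e} s, natural width = {delta_lambda * 1e10:.2e} angstrom")
```

The numerical lifetime agrees with 1/A, and the resulting natural width is of the order of 10⁻⁴ Å, as quoted.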
For some excited states which are stable against electric dipole transitions,
e.g., the 2 ²S₁/₂ state in the hydrogen atom, known as metastable states, the
lifetime is usually about 10⁵ times larger, i.e., τ ~ 10⁻³ s. Metastable states play
a very important role in lasers and masers.
There are other effects which also contribute to the observed linewidth.
One of them is due to Doppler effect. Since the atoms are moving around (thermal
motion), the observed radiation is Doppler shifted from the frequency ω0
expected for atoms at rest. If the velocity of the particle makes an angle
α with the line of observation, the observed frequency is
$$\omega \approx \omega_0 \left(1 - \frac{v}{c}\cos\alpha\right) \tag{6.79}$$
For v ~ 6000 m/s (corresponding to atomic hydrogen at about 1400 K), and
λ ~ 5000 Å, the Doppler shift is
∆ω ~ 7.5 × 10¹⁰ rad/s,
∆λ ~ 0.1 Å (6.80)
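The numbers in Eq. (6.80) follow from Eq. (6.79) with cos α = 1. The sketch below assumes that the quoted speed is the root-mean-square thermal speed, v = (3kT/m)^(1/2), which is an interpretation rather than something stated above, and uses CODATA constants; the function names are hypothetical.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
C = 2.99792458e8          # speed of light, m/s
M_H = 1.6735575e-27       # mass of a hydrogen atom, kg


def thermal_speed(temperature, mass):
    """Root-mean-square thermal speed (assumed meaning of v in the text)."""
    return math.sqrt(3.0 * K_B * temperature / mass)


def doppler_shift(wavelength, speed):
    """Maximum shifts (cos alpha = 1) implied by Eq. (6.79)."""
    delta_omega = 2.0 * math.pi * speed / wavelength   # omega_0 * v / c
    delta_lambda = wavelength * speed / C
    return delta_omega, delta_lambda


v_rms = thermal_speed(1400.0, M_H)
d_omega, d_lambda = doppler_shift(5000e-10, 6000.0)
print(f"v_rms = {v_rms:.0f} m/s")
print(f"d_omega = {d_omega:.2e} rad/s, d_lambda = {d_lambda * 1e10:.2f} angstrom")
```

The Doppler width obtained this way is roughly a thousand times the natural width of Eq. (6.78).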
Another phenomenon that contributes to the linewidth is atomic collisions
which effectively change the lifetime of the excited states. The observed
linewidth ∆ω is the sum of the linewidths arising from the different effects.

6.5 LASERS AND MASERS


It was noted in Secs. 6.3 and 6.4 that, for an atom (or a molecule) in the presence
of radiation, the stimulated probabilities for absorbing a photon of resonant
energy (ħω = Em – En) or emitting a photon of the same energy are equal. This
is in addition to the probability for spontaneous emission of radiation. Neglecting
the spontaneous transitions, the net rate of absorption per unit time is
$$\frac{dE}{dt} = (N_n - N_m)\, B_{nm}\, u(\omega_{nm}), \qquad E_m > E_n \tag{6.81}$$
At thermal equilibrium one has
$$\frac{N_n}{N_m} = \exp(\hbar\omega_{nm}/kT) \tag{6.82}$$
so that Nn > Nm, and radiation will be absorbed. However, if the initial condition
is such that Nn < Nm, known as population inversion, then dE/dt < 0 and there is
a net emission of radiation of frequency ω = (Em – En)/ħ. This leads to an
amplification of the incident wave and forms the basis of lasers and masers
(those words stand for light/microwave amplification by stimulated emission
of radiation). It is of importance to note that the emitted wave has the same
frequency, phase, and polarization as the incident wave, so that the amplification
is in the wave amplitude, not just the intensity.
Though population inversion is not an equilibrium condition, it is sometimes
convenient to think of it as corresponding to a negative temperature in Eq. (6.82),
for which Nn < Nm.
Similarly, since the radiation is increasing in intensity, the situation can be
described by the usual formula
$$I(x) = I(0)\, e^{-\alpha x} \tag{6.83}$$
for the propagation through an absorbing medium but with a negative coefficient
of absorption.
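The two bookkeeping devices just described, a negative temperature in Eq. (6.82) and a negative absorption coefficient in Eq. (6.83), can be made concrete with a small sketch. The population ratios, the ruby wavelength of 6943 Å used for ω, and the value of α are illustrative assumptions, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
C = 2.99792458e8         # speed of light, m/s


def effective_temperature(n_lower, n_upper, omega):
    """Temperature T solving Eq. (6.82), N_n/N_m = exp(hbar*omega/kT).

    Undefined for equal populations (the logarithm vanishes).
    """
    return HBAR * omega / (K_B * math.log(n_lower / n_upper))


def intensity(i0, alpha, x):
    """Eq. (6.83); a negative alpha describes amplification."""
    return i0 * math.exp(-alpha * x)


omega = 2.0 * math.pi * C / 6943e-10          # assumed transition frequency
t_normal = effective_temperature(2.0, 1.0, omega)     # N_n > N_m
t_inverted = effective_temperature(1.0, 2.0, omega)   # N_n < N_m
gain = intensity(1.0, -0.5, 2.0)              # assumed alpha = -0.5 per unit length
print(f"T (normal) = {t_normal:.0f} K, T (inverted) = {t_inverted:.0f} K")
print(f"I(2)/I(0) with alpha = -0.5: {gain:.3f}")
```

An inverted population yields a negative T of the same magnitude as the positive T for the reversed ratio, and a negative α makes the intensity grow with distance.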
The mechanism of population inversion characterizes the different lasers
and masers. A few of them are discussed here.
The ammonia maser (Townes, 1954): This is based on the separation of the
components of an ammonia beam. The NH3 molecule has a pyramidal structure
with N at the apex. However, the N atom can be on either side of the base and
can tunnel from one side to the other through the base. As a result, the two
degenerate ground states split into two closely-spaced energy levels, which are
described essentially by the even and odd linear combinations of the two states
and have an energy separation given by
∆E/h ≈ 2.387 × 10¹⁰ s⁻¹ (6.84)
which is in the microwave range (λ ≈ 1.25 cm). At room temperature, these two
levels are almost equally populated. However, the two states have different
electric dipole moments. Therefore, an ammonia beam can be separated into
two beams by subjecting it to a suitable electric field. The beam with higher
energy is taken into a resonating chamber (whose role will be discussed shortly)
where the spontaneously emitted photons will stimulate emissions of similar
photons by other molecules, and the chain process rapidly builds up the amplitude.
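The statements that the ammonia line lies at λ ≈ 1.25 cm and that the two levels are almost equally populated at room temperature can be checked directly from Eq. (6.84) and the Boltzmann factor. The sketch assumes a room temperature of 300 K; the variable names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23          # Boltzmann constant, J/K
C = 2.99792458e8            # speed of light, m/s

nu_ammonia = 2.387e10       # inversion frequency from Eq. (6.84), Hz
room_temperature = 300.0    # assumed value, K

wavelength = C / nu_ammonia
# Boltzmann ratio of upper-level to lower-level populations.
upper_to_lower = math.exp(-H_PLANCK * nu_ammonia / (K_B * room_temperature))
print(f"wavelength = {wavelength * 100:.3f} cm")
print(f"N_upper / N_lower = {upper_to_lower:.4f}")
```

The population ratio differs from unity by less than half a percent, which is why a physical separation of the two components is needed to obtain a useful inversion.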
Ruby laser (Maiman, 1960): A ruby consists of crystalline aluminium oxide
(Al₂O₃) in which some of the aluminium atoms are replaced by chromium
atoms. The energy levels of Cr³⁺ ions are shown schematically in Fig. 6.3 (a).
The Cr ions in a ruby rod which is about 1 cm in diameter and 5 cm in length,
are excited from level E1 to a group of levels E3 by the absorption of light from
a xenon flash tube adjacent to the ruby rod (duration of flash is less than 10⁻³ s).
Since there are many levels near E3, most of the Cr ions will go into the excited
state. The process of imparting energy to the working substance of a laser is
known as pumping—in the present case it is optical pumping since the input
energy is in the optical range.
The excited ions quickly undergo nonradiative transitions with a transfer of
energy to the lattice thermal motion, to the level E2. Now, the E2 level is a
metastable state with a lifetime of about 3 × 10⁻³ s (usual atomic lifetimes are of
the order of 10⁻⁸ s), so that the population of the E2 level becomes greater than
that of the E1 level, and population inversion is obtained.

Fig. 6.3 A schematic representation of transitions in (a) a ruby laser where
most ions from E3 go to E2 though a few go to E1, (b) helium-neon
laser, and (c) tunable dye laser.
Some photons are produced by spontaneous transition from E2 to E1, and
have a wavelength of 6943 Å (ruby red). The ends of the ruby rod are thoroughly
polished and coated with layers of silver so as to act as reflecting mirrors, one
end reflecting nearly 100% and the other between 90% and 100% of the incident
radiation. Therefore, photons that are not moving parallel to the ruby rod escape
from the side, but those moving parallel to it are reflected back and forth. These
stimulate the emission of similar other photons and the chain reaction quickly
develops a beam of photons all moving parallel to the rod, which is
monochromatic (well-defined frequency) and is coherent (well-defined phase
and polarization). When the beam develops sufficient intensity, it emerges
through the partially silvered end. The ruby laser is a solid-state laser and operates
in pulses (several pulses per minute). The large amount of heat released in the
crystal is removed by cooling with liquid air.
Helium-neon laser: An example of a continuously operating laser is the
helium-neon laser (Fig. 6.4). In this laser (Javan, 1960), an electric discharge is
created by a dc current in a tube containing a mixture of helium and neon in the
ratio of 5 : 1. The discharge raises some of the helium atoms into the 2s level
[see Fig. 6.3 (b)] which is a metastable state (i.e., it has a long lifetime). The
energy of this level (20.61 eV) is almost the same as the energy of the 5s level
(20.66 eV) in neon. Hence, the energy of the helium atoms is easily transferred
to the neon atoms when they collide. This preferential transfer of the neon atoms
to the 5s state results in a population inversion between the 5s and the 3p states.
The spontaneous transitions from the 5s state to the 3p state, produce photons
of wavelength 6328 Å, which then trigger stimulated transitions. Photons
travelling parallel to the tube are reflected back and forth between the mirrors
placed at the ends, and rapidly build up into an intense beam which escapes
through the end with the lower reflectivity. The energy taken out by the laser
beam is continuously replaced by the dc supply, so that it is a continuously
operating laser. The usual efficiency of conversion of energy into the laser beam
energy is quite small, about 10⁻³ %.
The hydrogen maser: Since the nucleus of the hydrogen atom has I = 1/2,
the ground state of the atom splits into two levels with total angular momentum
quantum numbers F = 0 and F = 1, which have a small energy difference. Of
the two levels, the one with F = 0 has a slightly lower energy. The hydrogen
maser is based on stimulated transition from the F = 1 state to the F = 0 state.
Fig. 6.4 Helium-neon laser. The windows are at Brewster angle
to avoid losses by partial reflection.
For obtaining a population inversion in this maser, a beam of hydrogen
atoms is passed through a region of suitable magnetic field. This allows the
selection of a beam that is richer in the F = 1 atoms. The beam is taken into a
resonance cavity where the F = 1 atoms undergo stimulated transitions to the
F = 0 state. This maser is remarkable for the stability of the frequency of its
radiation, νH = 1.420 405 7518 × 10⁹ s⁻¹, and can be used as a time standard with
nearly the same accuracy as the caesium clock.
Dye lasers: A major constraint in the lasers discussed so far is that the
frequency of the outcoming radiation is essentially fixed, though a small variation
can be achieved by varying temperature, etc. This constraint was removed by
the development of dye lasers which use solutions of organic dyes as the active
medium for stimulated emission, e.g., rhodamine 6G in methanol solution.
The output wavelength of these lasers is continuously tunable over a large range
of frequencies, which makes them very versatile. The main features of a dye
laser are described here in terms of a schematic representation of the energy
levels in Fig. 6.3(c).
The large degrees of freedom in an organic dye molecule give rise to
relatively broad energy bands with closely spaced vibrational and rotational
levels. At ordinary temperatures, most of the molecules occupy energy levels
close to the ground state level S0 in the lowest energy band S characterized by
the property that the molecule is in a singlet state, i.e., total spin S = 0. If a
solution of the dye is exposed to an intense radiation from a laser, usually a
nitrogen laser, or a flash lamp, the molecules undergo transition to one of the
excited singlet states S*. Nonradiative transitions quickly bring them down (in
about 10⁻¹¹ to 10⁻¹² s) to the bottom of the S* band, i.e., S0*. Since there are very
few molecules in the upper part of S, population inversion is obtained between
the lower part of the energy band S* and the upper part of the S band. This gives
rise to stimulated transitions to almost any part of the S band. These transitions
can be tuned to any frequency within this range by using a suitable diffraction
grating and a partially reflecting mirror, placed on the opposite sides of the
active medium.
Alternatively, in some cases the molecules may undergo nonradiative
transitions from S* to the metastable triplet states T* with total spin S = 1. Then
stimulated transition can occur between the lowest triplet state T0*, and S states.
Since T0* is metastable, usually the transitions from T0* states to the S states
take place after a time delay. For nitrogen-laser-pumped dye lasers, it is the
transitions from the S0* to the S band that produce the dominant laser action.
Over the years, a large number of materials that can produce laser action
have been developed. A special mention should be made of semiconductor lasers
which are sturdy, compact and inexpensive, and therefore suitable for practical
applications. It is reasonable to expect that many more laser materials will be
developed in the coming years.
Resonance cavity: The laser material is usually kept between mirrors (plane
or concave) so that the photons are reflected back and forth many times to build
up an intense photon beam (the effective path length is increased by a factor
equal to the number of reflections). One of the mirrors is partially transparent,
between 90% and 100% reflecting, which allows the beam to be taken out. If the
tube windows (Fig. 6.4) are at the Brewster angle, the emerging beam will be
plane polarized.
Apart from increasing the intensity of the beam, the mirrors help in producing
a monochromatic beam in that they serve as walls of a resonance cavity which
sustains only those wavelengths λ which satisfy the relation
pλ = 2t, p = 1, 2, ... (6.85)
where t is the distance between the mirrors. The separation between two
successive modes is
$$\lambda_p - \lambda_{p+1} \approx \frac{\lambda^2}{2t} \approx 0.002\ \text{Å} \quad \text{for } \lambda = 6328\ \text{Å},\ t = 1\ \text{m} \tag{6.86}$$
which is quite a bit smaller than the spread in wavelength due to Doppler effect
[Eq. (6.80)]. Thus, there are several frequencies, each with a very narrow width,
supported by the resonance cavity, within the Doppler width of the central
frequency. Some special technique can be used to select one of these frequencies,
such as reducing t which will increase λp – λp + 1, or lowering the temperature
which will decrease the Doppler width.
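The mode spacing of Eq. (6.86) can be compared with the exact difference between neighbouring solutions of Eq. (6.85). In the sketch below the Doppler width of 0.1 Å is simply borrowed from Eq. (6.80) as an illustrative figure; it is an assumption and not a value for neon. The function names are hypothetical.

```python
def mode_spacing(wavelength, mirror_separation):
    """Approximate spacing of cavity modes, Eq. (6.86)."""
    return wavelength**2 / (2.0 * mirror_separation)


def exact_mode_spacing(wavelength, mirror_separation):
    """Difference between neighbouring wavelengths satisfying p * lambda = 2t."""
    p = round(2.0 * mirror_separation / wavelength)   # nearest mode number
    return 2.0 * mirror_separation / p - 2.0 * mirror_separation / (p + 1)


wavelength = 6328e-10        # m
separation = 1.0             # mirror distance t, m
doppler_width = 0.1e-10      # assumed illustrative Doppler width, m

spacing = mode_spacing(wavelength, separation)
spacing_exact = exact_mode_spacing(wavelength, separation)
modes_in_width = doppler_width / spacing
print(f"spacing = {spacing * 1e10:.4f} angstrom (exact {spacing_exact * 1e10:.4f})")
print(f"modes within the assumed Doppler width: about {modes_in_width:.0f}")
```

Halving t doubles the spacing, which is the first of the two mode-selection options mentioned above.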

6.6 APPLICATIONS OF LASERS


Since the laser beam is made up of stimulated emission, it is monochromatic,
has a high temporal coherence, i.e., the phases at different times are related, and
a high spatial coherence, i.e., the phases at different positions are related. It is
parallel and has a very high intensity since the beam has a small cross section.
The temporal coherence is determined by the frequency width of the beam,
∆t ~ 1/∆ω (6.87)
and can be as large as 10⁻⁶ s or larger in some lasers. Spatial coherence implies
that the beam is essentially described by a single plane wave with a width equal
to the cross-section of the beam. This gives rise to a directionality constrained
only by the width. If a beam with a cross-sectional area (∆x)2 is travelling in the
z-direction, uncertainty relation implies
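∆px ≈ ħ/∆x (6.88)
so that the angular spread is
$$\alpha = \frac{\Delta p_x}{p_z} \approx \left(\frac{\hbar}{\Delta x}\right)\left(\frac{\lambda}{h}\right) \approx 10^{-4}\ \text{rad} \quad \text{for } \lambda \approx 5000\ \text{Å},\ \Delta x \approx 10^{-3}\ \text{m} \tag{6.89}$$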
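Since ħ/h = 1/2π, Eq. (6.89) reduces to α ≈ λ/(2π∆x). The sketch below evaluates this and, as an added illustration that is not in the text, the resulting beam broadening after an assumed path of 1 km; the function name is hypothetical.

```python
import math


def angular_spread(wavelength, beam_width):
    """Diffraction-limited spread from Eq. (6.89): alpha = lambda / (2 pi dx)."""
    return wavelength / (2.0 * math.pi * beam_width)


alpha = angular_spread(5000e-10, 1.0e-3)   # radians
path_length = 1000.0                       # assumed propagation distance, m
broadening = alpha * path_length           # small-angle estimate, m
print(f"alpha = {alpha:.2e} rad, broadening over 1 km = {broadening * 100:.1f} cm")
```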
It is the properties of narrow frequency range, coherence, directionality
and high intensity which make the laser beam extremely useful. Some of the
uses and applications of lasers are discussed below.
1. Tunable dye lasers allow the excitation and analysis of atomic and
molecular energy levels with a high accuracy. Indeed, they have
revolutionized the field of optical spectroscopy. In particular, Lamb shift
has been observed optically, two-photon transitions have been observed
in atomic and molecular systems, Rydberg states have been analysed,
and significant tests of unified theories of electromagnetic and weak
interactions have been made.
2. The coherence and the high intensity of laser beams enable us to measure
small changes in the frequencies of radiation resulting from Raman
scattering by measuring beats produced by the interference between the
scattered and the original beams.
3. The narrow frequency width and coherence of lasers make them very
useful in precision measurements. Interferometers with laser beams allow
measurement of distances to a very high accuracy, as also surface
variations, refractive index, etc. For measuring the velocity of fluids, a
laser beam is scattered by the fluid. This Doppler-shifted beam then
interferes with the original beam producing beats. The beat frequency
enables us to measure the velocity of the moving medium.
4. The well-defined directionality of a laser beam makes it valuable in
communications, surveying and tracking systems.
5. Some lasers are capable of producing narrow beams of extremely high
intensity. Such beams find use in precision cutting and boring, soldering
and welding. They are used in tumor destruction and in eye surgery for
‘welding’ detached retinas. There is also the possibility that they can induce
controlled thermonuclear fusion.
6. Lasers have important applications in nonlinear optics and in holography.
These are discussed in some detail.

Nonlinear Optics
For ordinary light sources, the electric field is so small that the induced
polarization P is approximately proportional to the electric field E, and the
various properties of the medium such as polarizability α, refractive index, etc.
are independent of the field intensity (here, for simplicity the vector nature of
P and E is neglected). Thus, we have what is known as linear optics for which
the superposition principle holds i.e., P1 = α E1, P2 = α E2 implies P1 + P2 = α
(E1 + E2). However, with laser fields of high intensity, there is no longer a linear
relation between P and E, and the description is in terms of nonlinear optics.
For high E values, a series expansion for P can be written, giving
$$P = \alpha_1 E(t) + \alpha_2 E(t)^2 + \dots = \alpha_1 E_0 \cos\omega t + \alpha_2 E_0^2 \cos^2\omega t + \dots \tag{6.90}$$
Since cos² ωt = [1 + cos (2ωt)]/2, the second term builds a field component
with a frequency of 2ω. This is called frequency doubling. For example, when a
dielectric medium is irradiated with a powerful ruby laser beam with λ = 6943
Å, an ultraviolet component with λ ≈ 3472 Å is observed to emerge from the
medium. In general, higher harmonics with frequencies 3ω, 4ω, etc. also may
be present.
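The appearance of a 2ω component in Eq. (6.90) can be seen numerically by sampling one period of the driving field and projecting the polarization onto cosines of successive harmonics. This is a minimal sketch: the coefficients α1 = 1 and α2 = 0.2 and the amplitude E0 = 1 are arbitrary illustrative values, the series is truncated after the quadratic term, and the function names are hypothetical.

```python
import math


def polarization(e_field, alpha1=1.0, alpha2=0.2):
    """Eq. (6.90) truncated after the quadratic term (assumed coefficients)."""
    return alpha1 * e_field + alpha2 * e_field**2


def fourier_amplitude(signal, harmonic):
    """Cosine amplitude of the given harmonic for one sampled period."""
    n = len(signal)
    s = sum(value * math.cos(2.0 * math.pi * harmonic * k / n)
            for k, value in enumerate(signal))
    return s / n if harmonic == 0 else 2.0 * s / n


n_samples = 256
e0 = 1.0
field = [e0 * math.cos(2.0 * math.pi * k / n_samples) for k in range(n_samples)]
response = [polarization(e) for e in field]
amplitudes = {h: fourier_amplitude(response, h) for h in range(4)}
for h, amp in amplitudes.items():
    print(f"harmonic {h}: amplitude {amp:+.4f}")
```

The quadratic term contributes equally to a constant (zero-frequency) part and to the second harmonic, each of size α2 E0²/2, while no third harmonic appears at this order.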
If two beams with different frequencies, ω1 and ω2, at least one of them
being a laser beam, are incident on the medium, the non-linear term will have
terms with frequencies 2ω1, 2ω2, ω1 + ω2 and ω1 – ω2. The emerging beam
therefore will contain components with these frequencies. Thus, the effect of a
low frequency beam (e.g., ω2 in the infra-red region) may be observed in the
optical region by choosing ω1 in the optical range.
In many substances, the refractive index of the substance increases as the
intensity increases. Thus, the effective refractive index of the material is larger
near the centre of the propagating laser beam so that the rays bend towards the
beam axis. This is known as self-focussing and is again a consequence of non-
linear optics. This property is utilized in fibre-optics communication.

Holography
An extremely interesting application of lasers is to holography, i.e., the
production of the whole or complete, 3-dimensional picture of an object. It is
based on the reconstruction of the electromagnetic fields reflected by the object.
Preparation of the photographic plate: Consider the electromagnetic field
of a laser beam reflected by an object. For simplicity, the reflected beam is
assumed to be a plane wave. This wave with amplitude R is allowed to interfere
with a reference laser beam of amplitude A, at an angle θ, on a photographic
plate [Fig. 6.5 (a)]. The exposed plate registers the interference fringes and is
processed to give a hologram. The intensity registered is [Fig. 6.5 (a)]
$$I = \left| A \exp[i2\pi(x - ct)/\lambda] + R \exp[i2\pi(\mathbf{n}\cdot\mathbf{r} - ct)/\lambda] \right|^2_{x=0} = A^2 + R^2 + 2AR\cos[2\pi y(\sin\theta)/\lambda] \tag{6.91}$$
where it is assumed that A and R are real. The separation between the interference
fringes is
$$d = \frac{\lambda}{\sin\theta} \tag{6.92}$$
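Equations (6.91) and (6.92) can be verified by adding the two complex amplitudes on the plate directly. The sketch assumes that the object beam direction n lies in the x-y plane at angle θ to the x-axis, so that n . r = y sin θ at x = 0, and evaluates the fields at t = 0. The helium-neon wavelength, the angle of 30° and the amplitudes are illustrative choices, and the function names are hypothetical.

```python
import cmath
import math


def fringe_spacing(wavelength, theta):
    """Separation between interference fringes, Eq. (6.92)."""
    return wavelength / math.sin(theta)


def intensity_from_fields(y, a, r, wavelength, theta):
    """|reference + object|^2 on the plate x = 0 at t = 0."""
    phase = 2.0 * math.pi * y * math.sin(theta) / wavelength
    return abs(a + r * cmath.exp(1j * phase)) ** 2


def intensity_formula(y, a, r, wavelength, theta):
    """Closed form of Eq. (6.91)."""
    phase = 2.0 * math.pi * y * math.sin(theta) / wavelength
    return a * a + r * r + 2.0 * a * r * math.cos(phase)


wavelength = 6328e-10            # assumed helium-neon wavelength, m
theta = math.radians(30.0)       # assumed angle between the beams
d = fringe_spacing(wavelength, theta)
y0 = 0.3e-6                      # arbitrary point on the plate, m
print(f"fringe spacing d = {d * 1e6:.4f} micrometre")
print(intensity_from_fields(y0, 1.0, 0.5, wavelength, theta),
      intensity_formula(y0, 1.0, 0.5, wavelength, theta))
```

A fringe spacing of about a micrometre indicates why holograms require fine-grained recording material and a mechanically stable set-up.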
Fig. 6.5 (a) Interference of the plane wave object beam and the reference
beam producing a hologram. (b) The hologram producing three
components one of which is the same as the object beam.

Reconstruction of the wavefront: A similar reference beam is allowed to
fall on the hologram which acts as a diffraction grating (grating separation
d = λ/sin θ) and produces diffraction images. The angular separation between
the central maximum and the first maximum on the two sides, is [see Fig. 6.5(b)]
$$\sin\theta' = \frac{\lambda}{d} = \sin\theta \tag{6.93}$$
or θ′ = θ. Therefore, of the two maxima, one has the same
directionality as the beam reflected from the object had. The wavefront
corresponding to this maximum is the same as that of the reflected beam and
hence produces a three-dimensional image. Analytically, the modulated
amplitude is
$$B = I A \exp[i2\pi(x - ct)/\lambda]\,\big|_{x=0} = A(A^2 + R^2)\exp(-i\omega t) + A^2 R \exp\{-i[\omega t + 2\pi y(\sin\theta)/\lambda]\} + A^2 R \exp\{-i[\omega t - 2\pi y(\sin\theta)/\lambda]\} \tag{6.94}$$
where I given in Eq. (6.91) has been used. The first component corresponds to
the central maximum, the second component corresponds to the first maximum
along the direction of the original beam (it is to be noted that this wave moves
downward as t increases), and the third component represents the other first
maximum (the wave moves upward as t increases).
If the object beam is a diverging beam (as is normally the case), it is easily
seen that the maximum intensity points 2 and 3 both move up with respect to 1
(see Fig. 6.7). As a consequence, the lower beam will also be diverging and will
appear to start from the object, whereas the upper beam will converge to a real
image of the object. This is seen by constructing wavefronts with circles of
radius b with centre at point 1, radius b ∓ λ with centre at point 2, radius b ± λ
with centre at point 3. The upper signs correspond to the lower beam and the
lower signs correspond to the upper beam. It may also be noted that in viewing
a hologram, the reference beam may have a frequency different from that of the
beam used for recording the image. If the wavelength of the second reference
beam is longer, the image observed will be magnified.
Holograms are very useful in studying the conditions at different levels by
focussing the microscope at different planes of the reconstructed image, e.g., in
the investigation of sizes and distribution of particles, mechanical strains, etc.

Laser Cooling
In 1985, S. Chu and co-workers (among them A. Ashkin and J. E.
Bjorkholm) at Bell Laboratories, Holmdel, NJ, reported success in cooling a
dilute vapour of about 10⁵ neutral sodium atoms in a volume of 0.2 cm³ to a
temperature of about 0.2 mK.

Fig. 6.6 Schematic drawing of the vacuum chamber, intersecting laser beams and
atomic beam used for the Doppler cooling experiment. The laser beams enter the
UHV windows vertically and horizontally.

The development of methods to cool dilute vapours of trapped atoms has
made it possible to construct atomic clocks useful for precise timekeeping, e.g.,
in connection with navigation in space and the exploration of the solar system.
Another application using laser cooling is the development of atomic
interferometers in which the de Broglie wavelength of slow atoms is used for
interferometric measurements with ultrahigh precision, e.g., the acceleration of
gravity. It has become possible to develop instruments for atom optics to achieve
atomic lithography. The atomic beams may be used to form nanometer structures
on surfaces, e.g., for electronic components. The recent observation of a Bose-
Einstein condensation in a dilute atomic gas is also achieved using laser cooling
and trapping.
(Source: http://www.nobelprize.org/nobel_prizes/physics/laureates/1997/back.html)

6.7 SOME EXPERIMENTAL METHODS


In this section, some experiments which are important for the study of atomic
and molecular properties are described.
Fig. 6.7 (a) Interference of the object beam and the reference laser beam,
on the photographic plate, producing a hologram, (b) the hologram
producing three components, one of which has the same
wavefront as the object beam.

Magnetic Resonance Experiments


These experiments depend on the interaction of an atomic or a nuclear system
that has a nonzero magnetic moment, with a static magnetic field and a radiation
field. The static field splits the levels into different components, and the radiation
field with a particular frequency, called the resonance frequency, induces
transitions between these levels. Analysis of these transitions leads to information
about the magnetic moment of the system. The resonance experiments are
described as electron paramagnetic resonance (abbreviated as epr) when applied
to electronic magnetic moments, and as nuclear magnetic resonance (abbreviated
as nmr) when applied to nuclear magnetic moments.
When an atom with nonzero electronic magnetic moment is placed in a
magnetic field, each level with total angular momentum quantum number J
splits into 2J + 1 Zeeman levels [see Eq. (6.14)] with the energy difference
between the energy levels being
$$\Delta E = g\,\frac{e\hbar}{2m}\,B \tag{6.95}$$
If now a radiation field of frequency ω is incident, then transitions between
the different states take place with the absorption or emission of a photon, if
$$\omega = g\,\frac{e}{2m}\,B \tag{6.96}$$
They can be discussed along the same lines as in Sec. 6.3, except that the
interaction energy is due to the interaction of the magnetic moment with the
time-dependent magnetic field associated with the radiation. These transitions,
known as magnetic dipole transitions, satisfy the selection rule
∆MJ = ± 1, 0 (6.97)
so that, within the Zeeman multiplets, the allowed transitions are only between
adjacent levels. The frequency of the inducing radiation for B = 10⁴ G (i.e.,
1 Wb/m²), is of the order of
$$\omega \sim \frac{eB}{m} \approx 10^{11}\ \text{rad/s} \tag{6.98}$$
which corresponds to λ ≈ 2 cm, and is in the microwave frequency range. It
may be recollected that radiation induces emission and absorption with equal
probability [Eq. (6.52)]. Since at thermal equilibrium, there are more atoms in
the lower state [see Eq. (6.82)], there is a net absorption of energy by the atoms.
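The order-of-magnitude estimate of Eq. (6.98) can be refined with the resonance condition of Eq. (6.96). The sketch below assumes g = 2 (a free electron spin), which is an illustrative choice not made in the text, together with CODATA constants; the function name is hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
M_E = 9.1093837015e-31        # electron mass, kg
C = 2.99792458e8              # speed of light, m/s


def epr_angular_frequency(g_factor, b_field):
    """Resonance condition of Eq. (6.96): omega = g e B / (2 m)."""
    return g_factor * E_CHARGE * b_field / (2.0 * M_E)


b_field = 1.0                                  # 10^4 G expressed in tesla
omega_epr = epr_angular_frequency(2.0, b_field)
wavelength_epr = 2.0 * math.pi * C / omega_epr
print(f"omega = {omega_epr:.3e} rad/s, wavelength = {wavelength_epr * 100:.2f} cm")
```

For g = 2 the result is about 1.8 × 10¹¹ rad/s, i.e., a wavelength near 1 cm, consistent with the microwave range quoted above.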
In practice, the paramagnetic substance is placed inside a resonance cavity
suspended between the poles of an electromagnet. Radiation of a given frequency
is transmitted by a waveguide, made to interact with the substance, and is
collected by a receiver and recorded. In the course of the experiment, the
magnetic field produced by the electromagnet is gradually increased. As the
value of B passes through the critical value satisfying Eq. (6.96), an intense
absorption is observed. Since the experiments are done mostly for crystalline
or liquid paramagnetic substances in which the energy levels of the atoms are
perturbed by the internal fields, the resonance relation in Eq. (6.96) varies slightly
from atom to atom. As a result, the absorption curve for the intensity of the
transmitted radiation, as a function of B, has a finite width.
The nucleus of an atom has a magnetic moment given in Eq. (4.65),
$$\mu_N = g_N\,\frac{e}{m_p}\,I \tag{6.99}$$
which is about 1000 times smaller than the magnetic moment of the electrons.
Under the influence of the external field, the nuclear level splits into (2I + 1)
Zeeman levels and resonance absorption is observed for the frequency ω,
$$\omega = g_N\,\frac{e}{m_p}\,B \tag{6.100}$$
For B = 10⁴ G, this corresponds to a frequency of about 10⁸ rad/s which is in
the radio-frequency range. Nuclear magnetic resonance is especially useful for
the study of atoms and molecules which may have zero electronic magnetic
dipole moment. These atoms generally have a nonzero nuclear magnetic moment
and can be studied by the nuclear magnetic resonance techniques.
The epr and nmr techniques can be used for identifying the presence of
certain elements, for determining the environment of the electron or the nucleus
(by noting the shift in the resonance frequency due to the environment), and
also for accurate measurement of magnetic fields.

Atomic and Molecular Beam Experiments


While magnetic resonance experiments are almost universal in their applications,
their accuracy is limited by the fact that they are based on the differential
population of nearby levels at thermal equilibrium and on the measurements of
changes in the radiation intensity. If the material is available in the form of
atomic or molecular beams, more accurate beam experiments can be performed.
Atomic and molecular-beam experiments are refinements of the Stern-
Gerlach experiment (Sec. 4.2) due to Rabi, incorporating the observation of
magnetic resonance. For simplicity, consider a beam of particles with the nuclear
angular momentum characterized by I = 1/2 and an associated magnetic moment.
In a typical set-up, the beam traverses three regions with magnetic fields B1, B2
and B3 produced by magnets 1, 2 and 3 respectively (see Fig. 6.8). The first
field B1 is inhomogeneous with the gradient as shown, and splits the beam into
two components with MI = ± 1/2 one of which, say with MI = – 1/2 is eliminated
at the wall, while the second component with MI = 1/2 moves along the trajectory
shown. The second field B2 is homogeneous and introduces an energy difference
$$\Delta E = g_N\,\frac{e\hbar}{m_p}\,B_2 \tag{6.101}$$
between the energy levels. In this region, there is also a radiation field of radio
frequency ω (ω ~ 10⁸ rad/s). If ω satisfies the resonance condition
$$\omega = g_N\,\frac{e}{m_p}\,B_2 \tag{6.102}$$
some of the particles will undergo resonant transition to the MI = – 1/2 state. The
third field B3 also is inhomogeneous but has a gradient opposite to that of B1,
which will remove the particles with MI = – 1/2, at the wall, while those with
MI = 1/2 pass along the trajectory shown and register in the detector.
¶B1
¶z

B1 w

Beam
Detector
B2
B3

¶B3
¶z
Fig. 6.8 Schematic diagram of the atomic/molecular beam resonance experiment.
The dashed lines indicate the components removed.
In the actual experiment, the frequency ω is held fixed and the field B2 is
varied. When the resonance condition in Eq. (6.102) is satisfied, some of the
particles undergo transition to the MI = – 1/2 state and are removed at the wall,
which reduces the recorded beam intensity. The value of the field B2 at which
the minimum beam intensity is recorded can be used to calculate the value of gN
and hence the magnetic moment of the particles. For example, the reduction in
intensity is observed for ³¹P at B = 10^4 G and ω = 1.08 × 10^8 rad/s, which gives
a value of gN = 1.13.
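The arithmetic behind the ³¹P value can be checked with a short sketch. It assumes the resonance condition in the form ω = gN (e/m_p) B2 of Eq. (6.102); the function name and the rounded physical constants are illustrative choices, not part of the text.

```python
# Sketch: extract g_N from the beam-resonance condition of Eq. (6.102),
# omega = g_N * (e / m_p) * B2.  Constants are standard SI values.
E_CHARGE = 1.602176634e-19   # elementary charge, C
M_PROTON = 1.67262192e-27    # proton mass, kg


def g_factor(omega, b_field):
    """Return g_N for angular frequency omega (rad/s) and field B2 (tesla)."""
    return omega * M_PROTON / (E_CHARGE * b_field)


# 31P example from the text: B = 10^4 G = 1 T, omega = 1.08e8 rad/s.
g_31p = g_factor(omega=1.08e8, b_field=1.0)
print(f"g_N(31P) = {g_31p:.2f}")
```

The same function can be applied to any (ω, B2) pair read off at the intensity minimum.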
The application of nuclear magnetic resonance best known to the general
public is magnetic resonance imaging (MRI) for medical diagnosis, along with
magnetic resonance microscopy in research settings. However, it is also widely
used in chemical studies, notably in NMR spectroscopy such as proton NMR,
carbon-13 NMR, deuterium NMR and phosphorus-31 NMR. Biochemical
information can also be obtained from living tissue (e.g. human brain tumors)
(see Fig. 6.9 ) with the technique known as in vivo magnetic resonance
spectroscopy or chemical shift NMR Microscopy.

Fig. 6.9 Medical MRI.

Raman Effect
It was observed in 1928, by Raman and Krishnan, and simultaneously by
Landsberg and Mandelshtam, that the spectrum of light scattered by gases, liquids
and crystals, contains apart from the unshifted original frequency ω, new lines
whose frequencies are given by
ω′ = ω ± ω1 (6.103)
This is known as the Raman effect, or more descriptively, as combination
scattering of light.
The process of scattering of radiation may be regarded as being made up of
absorption of the incoming photon and emission of the outgoing photon. If, as
a result, the final state of the atom or molecule is the same as the initial state, the
frequency of the photon is unchanged, giving the unshifted line. This process is
known as Rayleigh scattering. On the other hand, if the final state of the atom
or molecule is different, the process is an inelastic scattering of the photon and
the frequency ω′ of the final photon is given by the energy conservation relation
$$\hbar\omega' + E' = E + \hbar\omega \tag{6.104}$$
or
$$\omega' = \omega + \frac{E - E'}{\hbar} \tag{6.105}$$

The shifted frequency is less than the original frequency if E < E′ and the
corresponding lines are called the Stokes lines. It is more than the original
frequency if E > E′ and the associated lines are called the anti-Stokes lines. At
ordinary temperatures, there are more particles in the lower energy states, so
that there are more transitions with E1 → E2 than those with E2 → E1, E2 > E1.
Therefore, anti-Stokes lines are generally fainter (in some cases not even
observable) than the Stokes lines. The anti-Stokes lines increase in intensity as
the temperature is raised since this will increase the relative population of the
higher energy states.
Since Raman effect is a two-step process, the selection rules can be deduced
from those for the two separate steps. In particular, the selection rules for the
transition between the rotational states of molecules, are ∆J = ± 1 for emission
or absorption of photons, and hence Raman effect is observed for transitions
with
∆J = ± 2, 0 (6.106)
For purely rotational transitions, only the ∆J = ± 2 transitions need be
considered (∆J = 0 does not involve changes in energy). The change in the
energy in the case of diatomic molecules is given by
$$\Delta E = \pm\frac{\hbar^2}{2I}\,[J(J+1) - (J-2)(J-1)] = \pm\frac{\hbar^2}{I}\,(2J-1), \qquad J \ge 2 \tag{6.107}$$
where J refers to the higher state. For J = 2, ∆ω = ± 3ħ/I, for J = 3, ∆ω = ± 5ħ/I, for
J = 4, ∆ω = ± 7ħ/I, etc. These lines are illustrated in Fig. 6.8. It is instructive to
compare them with the equi-spaced rotational levels in absorption spectra [see
Fig. 5.13(a)].
For transitions which involve changes in the vibrational states, ∆J can be
0 or 2. These involve larger changes in energy and hence anti-Stokes lines are
generally very faint. The frequency shift for a change in the vibrational state
but with ∆J = 0, corresponds to the missing central line in the vibrational-
rotational spectrum. The spacing of the ∆J = 2 lines about the ∆J = 0 is given by
Eq. (6.107). Raman spectra, involving changes in the vibrational states, provide
useful information about the structure of the molecules.
Raman spectra are characteristic of the molecules (and atoms) and are
extremely useful in the analysis of the complicated mixtures of molecules,
especially of organic molecules. They are also important in the determination
of the rotational and vibrational levels, and in the analysis of the structures of
the molecules.
[Figure: rotational levels J = 0 to 4 and the resulting Raman lines at ν, ν ± 3ν1, ν ± 5ν1 and ν ± 7ν1.]

Fig. 6.9 The Raman spectrum for transitions within the rotational levels;
ν1 is the spacing of the rotational levels (see Fig. 5.13).
Measurement of Lifetimes
The lifetimes of excited atoms and molecules are of the order of 10^-8 s,
which makes them difficult to measure. A few techniques for measuring short
lifetimes are discussed here.
The lifetimes can be obtained by measuring the intensity of radiation from
a collection of excited atoms, as a function of time. A voltage pulse is used to
excite the atoms by electron bombardment. The pulse starts the multi-channel
analyser in which channel n is active during the time nδ to (n + 1)δ, δ being a
time interval short compared with the lifetime τ of the atoms. The channel
records the pulses produced by the photoelectrons generated by the radiation
emitted. The pulse intensity is proportional to the number of excited atoms, and
hence its time dependence allows us to calculate the lifetime [from Eq. (6.76)].
One of the difficulties is that the population in the decaying state may be
continuously replenished by the particles in a higher excited state decaying to
the lower excited state under consideration.
In another method for measuring lifetimes of excited ions, called the beam-
foil technique, fast moving ions (accelerated by a potential difference) are excited
by passing them through a thin foil. The intensity of radiation emitted as a
function of the distance these excited ions travel, gives us information about
the number of excited states as a function of time, and hence allows us to calculate
the lifetime τ [from Eq. (6.76)].
An indirect method of calculating the natural lifetime is to measure the
linewidth of the level, and use the relation τ = 1/∆ω (essentially the uncertainty
relation) to deduce the lifetime of the state. In this method, the Doppler linewidth
and the collision linewidth (collisions affect the lifetime of a state), must be
taken into account in isolating the natural linewidth from the total observed
linewidth (∆ω used in the uncertainty relation is the natural line width).

6.8 EXAMPLES
The discussion in this chapter is now supplemented with some technical details
and examples.

Example 1
Here, the proof of the important theorem stated in Sec. 6.2 is outlined. To prove
the equality in Eq. (6.8), the z-axis is taken along the direction under
consideration. Then it has to be proved that
$$\int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J'}\, d\tau = a \int \psi^*_{J,M_J}\, J_z\, \psi_{J,M_J'}\, d\tau \tag{6.108}$$
for [L, S] = 0, J = L + S.
Consider first the case MJ ≠ MJ′. Since [Sz, Jz] = 0,
$$\int \psi^*_{J,M_J}\,(S_z J_z - J_z S_z)\,\psi_{J,M_J'}\, d\tau = 0 \tag{6.109}$$
This leads to (after integrating the second term by parts),
$$(M_J' - M_J)\,\hbar \int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J'}\, d\tau = 0 \tag{6.110}$$
or
$$\int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J'}\, d\tau = 0, \qquad \text{for } M_J \ne M_J' \tag{6.111}$$
Since the right hand side of Eq. (6.108) is
$$a M_J' \hbar \int \psi^*_{J,M_J}\, \psi_{J,M_J'}\, d\tau = 0, \qquad M_J \ne M_J' \tag{6.112}$$
the equation is satisfied for MJ ≠ MJ′.
To prove Eq. (6.108) for MJ = MJ′, it is observed that
$$\int \psi^*_{J,M_J+1}\, S_z\, \psi_{J,M_J+1}\, d\tau - \int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J}\, d\tau = a\hbar \tag{6.113}$$
where a is a constant independent of MJ. While this result is plausible in the
sense that every increment of MJ (or Jz) may be expected to cause an increase in
the average value of Sz, which depends only on the increment of MJ and not on
MJ itself, it is quite difficult to prove it (see Ref. 22, p. 236). This result then
leads to
$$\int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J}\, d\tau = a M_J \hbar + b \tag{6.114}$$
It is then noted that if all the angular momenta J, L and S take opposite
values, the average value of Sz should also change its sign:
$$\int \psi^*_{J,-M_J}\, S_z\, \psi_{J,-M_J}\, d\tau = -\int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J}\, d\tau \tag{6.115}$$
Substituting Eq. (6.114) in this relation gives b = 0. Since ħMJ is the
eigenvalue of Jz, the required relation is obtained as
$$\int \psi^*_{J,M_J}\, S_z\, \psi_{J,M_J'}\, d\tau = a \int \psi^*_{J,M_J}\, J_z\, \psi_{J,M_J'}\, d\tau \tag{6.116}$$
in which both the sides are zero for MJ ≠ MJ′. This proves the equality in
Eq. (6.8).

Example 2
The interaction of an atom with an external constant electric field E in the
z-direction is obtained from Eq. (6.5):
$$H' = e \sum_i z_i\, |\mathbf{E}| \tag{6.117}$$
This interaction has the interesting new feature that it becomes indefinitely
large and negative as zi → – ∞, so that the electrons in an atom can tunnel
through the potential barrier and ultimately escape to infinity (z → – ∞). Thus,
there are no longer any true bound states, and each level (including the ground
state) acquires a linewidth due to the fact that it has a finite lifetime (∆ω ~ 1/τ).
Also, the first-order energy shift given by Eq. (3.125) is zero for nondegenerate
states,
$$\int \psi_n^*\, z_i\, \psi_n\, d\tau = 0 \tag{6.118}$$
since zi is odd and |ψn|² is even (nondegenerate states are even or odd
under parity). For degenerate states, which one encounters in the hydrogen
atom, the problem is more complicated.
Consider the 2p and 2s states of the hydrogen atom. From Eq. (3.127), it is
easy to show that the energies of the ml = ± 1 states are unperturbed. For the
ml = 0 states, the energy shifts are given by Eq. (3.128), with 1 standing for the
l = 1, ml = 0 state and 2 standing for the l = 0, ml = 0 state. For this case, V11 = V22
= 0, so that x = ± 1 and the energy shifts are
$$\Delta E = \pm\, e\,|\mathbf{E}| \int \psi_2^{(1)*}\, z\, \psi_2^{(2)}\, d\tau \tag{6.119}$$
Thus, for the hydrogen atom, in addition to acquiring linewidths, the spectral
lines split into several components (in this discussion spin-orbit interaction has
been neglected). For example, the n = 2 → n = 1 line splits into three components.
This is known as the Stark effect. As the strength of the electric field becomes large
(≳ 10^7 V/m), higher order corrections have to be included, and one has what is
known as the quadratic Stark effect to distinguish it from the first order effect
which is called the linear Stark effect.
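The integral in Eq. (6.119) can be evaluated in closed form. The sketch below assumes the standard normalized hydrogen n = 2 wave functions, ψ(2s) ∝ (2 – r/a1) exp(– r/2a1) and ψ(2p, ml = 0) ∝ (r/a1) cos θ exp(– r/2a1), each with normalization 1/(32πa1³)^{1/2}; these are taken to agree with the functions used in the text. Lengths are in units of a1, and the field value of 10^7 V/m is only an illustrative choice.

```python
from math import factorial, pi

A1 = 5.29177210903e-11  # Bohr radius a1 in metres


def radial_moment(n):
    """Integral of r**n * exp(-r) over 0..infinity, which equals n!."""
    return factorial(n)


# Integrand in units of a1: [1/(32 pi)] * r(2 - r) exp(-r) * r cos^2(theta) * r^2
norm_sq = 1.0 / (32.0 * pi)
angular = 4.0 * pi / 3.0                             # integral of cos^2(theta) over solid angle
radial = 2 * radial_moment(4) - radial_moment(5)     # integral of r^4 (2 - r) exp(-r)
z_matrix_element = norm_sq * angular * radial        # in units of a1

# Linear Stark shift |Delta E| = e |E| |z_12|; in eV this is |E| (V/m) times |z_12| (m).
field = 1.0e7  # V/m, illustrative
shift_ev = field * abs(z_matrix_element) * A1
print(f"z matrix element = {z_matrix_element:.3f} a1, shift = {shift_ev:.2e} eV")
```

Under these assumptions the matrix element is –3a1, so the two ml = 0 levels are displaced by ± 3e|E|a1 while the ml = ± 1 levels stay put, which is consistent with the three-component splitting described above.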

Example 3
The Zeeman splitting for the hydrogen 2p → 1s transitions is given by
Eqs. (6.22) and (6.23).
For B = 10^4 G (1 Wb/m²), ∆ω0 = 8.78 × 10^10 rad/s
so that
∆λ ≈ (± 2/3, ± 4/3) (0.0069) Å for 2 ²P1/2 → 1 ²S1/2 (6.120)
∆λ ≈ (± 1/3, ± 1, ± 5/3) (0.0069) Å for 2 ²P3/2 → 1 ²S1/2

The shifts are observed to be very small.

Example 4
Here, the lifetime of the 2p state of the hydrogen atom is calculated using
Eq. (6.70).
Without any loss of generality, we assume that the atom is originally in the
l = 1, ml = 0 state. Using the wave functions given in Sec. 4.1,
$$|(\mathbf{r})_{1s,2p}| = \left| \int \frac{\exp(-r/a_1)}{(\pi a_1^3)^{1/2}}\; z\; \frac{r\cos\theta}{(32\pi a_1^5)^{1/2}}\, \exp(-r/2a_1)\; r^2\, dr\; 2\pi\, d\cos\theta \right| = \frac{2^{15/2}}{3^5}\, a_1 \approx 0.745\, a_1 \tag{6.121}$$
Substituting this in Eq. (6.70) and using the relation in Eq. (6.77), the lifetime
τ of the 2p state (∆E = ħω = 10.2 eV) is
τ ≈ 1.6 × 10^-9 s (6.122)
which is in agreement with the experimental observation.
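The numbers in this example can be reproduced with a short sketch. Eq. (6.70) is not restated here, so the code assumes the standard electric-dipole spontaneous emission rate A = 4αω³|r|²/(3c²), with α the fine-structure constant, and takes the lifetime as 1/A; the function name and the 2p → 1s energy of 10.2 eV are likewise assumptions of the sketch.

```python
ALPHA = 7.2973525693e-3      # fine-structure constant
C = 2.99792458e8             # speed of light, m/s
HBAR = 1.054571817e-34       # reduced Planck constant, J s
EV = 1.602176634e-19         # joules per electron volt
A1 = 5.29177210903e-11       # Bohr radius a1, m


def dipole_lifetime(energy_ev, r_matrix_element):
    """Lifetime 1/A for a dipole transition, A = 4 alpha w^3 |r|^2 / (3 c^2)."""
    omega = energy_ev * EV / HBAR
    rate = 4.0 * ALPHA * omega**3 * r_matrix_element**2 / (3.0 * C**2)
    return 1.0 / rate


# Matrix element of Eq. (6.121) in units of a1.
z_over_a1 = 2**7.5 / 3**5
tau_2p = dipole_lifetime(10.2, z_over_a1 * A1)
print(f"|z| = {z_over_a1:.3f} a1, tau = {tau_2p:.2e} s")
```

With these inputs the sketch returns a matrix element close to 0.745 a1 and a lifetime close to the value quoted in Eq. (6.122).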

Example 5
The ratio of spontaneous transitions to stimulated transitions for particles in
thermal equilibrium can be obtained from Eq. (6.63). Denoting the probability
for spontaneous transitions by P1 and that for stimulated transitions by P2,
Eq. (6.63) reads
P2 + P1 = exp (ħωnm/kT) P2 (6.123)
or P1/P2 = exp (ħωnm/kT) – 1 (6.124)
For very low temperatures, the transitions are predominantly spontaneous
but become predominantly stimulated for high temperatures. This is to be
expected since the radiation density increases with temperature. For example,
at room temperature, the transitions between the 2p and 1s states of the hydrogen
atom (ħωnm ~ 10 eV, kT ~ 0.026 eV) are predominantly spontaneous. However,
in some of the hot stars, the surface temperatures are as high as 30000 K, so that
there stimulated transitions are also important.
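Equation (6.124) is easy to evaluate for the two cases mentioned. The following sketch uses a transition energy of 10 eV as in the text; the function name is an illustrative choice.

```python
from math import expm1

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def spontaneous_to_stimulated(energy_ev, temperature_k):
    """Ratio P1/P2 = exp(hbar*omega/kT) - 1 of Eq. (6.124)."""
    return expm1(energy_ev / (K_B_EV * temperature_k))


ratio_room = spontaneous_to_stimulated(10.0, 300.0)     # room temperature
ratio_star = spontaneous_to_stimulated(10.0, 30000.0)   # hot stellar surface
print(f"300 K: {ratio_room:.3e}   30000 K: {ratio_star:.1f}")
```

At 300 K the ratio is astronomically large, while at 30000 K it falls to a few tens, so stimulated transitions are no longer negligible.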

Example 6
Lasers provide an intense, collimated beam. To estimate the power, consider
the original ruby laser which had a diameter of 1 cm and a length of 5 cm. If the
ruby has about 10^19 Cr atoms/cc, and all of them are excited, the total energy
available is
$$E = 10^{19} \left( \frac{\pi}{4} \times 5 \right) \times (h\nu) = 11.25\ \text{J} \tag{6.125}$$
If the pulse lasts for about 10^-7 s, the power during this period is about
10^8 W.
The angular spread of the beam is
α ~ 1.10 × 10^-5 rad (6.126)
In travelling a distance of 100 m, the beam will spread by about
∆x ~ 0.11 cm (6.127)
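The estimates of this example can be retraced numerically. The sketch assumes the ruby laser wavelength of 694.3 nm, which is not stated in the example, and takes the angular spread of Eq. (6.126) as given; the function and variable names are illustrative.

```python
from math import pi

H = 6.62607015e-34     # Planck constant, J s
C = 2.99792458e8       # speed of light, m/s


def stored_energy(density_per_cc, diameter_cm, length_cm, wavelength_m):
    """Energy in joules if every atom in the rod emits one photon."""
    volume_cc = pi / 4.0 * diameter_cm**2 * length_cm
    photon_energy = H * C / wavelength_m
    return density_per_cc * volume_cc * photon_energy


energy = stored_energy(1e19, 1.0, 5.0, 694.3e-9)   # assumed ruby line at 694.3 nm
power = energy / 1e-7                              # pulse duration of 10^-7 s
spread_cm = 1.10e-5 * 100.0 * 100.0                # alpha times 100 m, expressed in cm
print(f"E = {energy:.2f} J, P = {power:.2e} W, spread = {spread_cm:.2f} cm")
```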

Example 7
If an excited state is replenished by decays from a higher excited state, the
decay rate is not given by the simple exponential function in Eq. (6.76).
Consider three states with energies E0 < E1 < E2 and decay probabilities A10,
A20 and A21. Then the changes in N1 (t) in time dt are
dN1 (t) = – A10N1 (t) dt + A21N2 (t) dt (6.128)
dN2 (t) = – A2 N2 (t) dt (6.129)
where A2 = A20 + A21. The solutions to these equations are
$$N_2(t) = \exp(-A_2 t)\, N_2(0)$$
$$N_1(t) = \exp(-A_{10} t)\, N_1(0) + \frac{A_{21}}{A_{10} - A_2} \left( \exp[-A_2 t] - \exp[-A_{10} t] \right) N_2(0) \tag{6.130}$$
It is observed that if A21 N2 (0) > A10 N1 (0), N1 (t) will increase for small
t but will eventually start decreasing.
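The behaviour described in this example can be illustrated with a short sketch of the solution in Eq. (6.130). The decay rates and initial populations below are arbitrary illustrative numbers (in units where A10 = 1), chosen so that A21 N2(0) > A10 N1(0); the formula requires A10 ≠ A2.

```python
from math import exp


def populations(t, n1_0, n2_0, a10, a20, a21):
    """Return (N1(t), N2(t)) from Eq. (6.130); requires a10 != a20 + a21."""
    a2 = a20 + a21
    n2 = n2_0 * exp(-a2 * t)
    feed = a21 / (a10 - a2) * (exp(-a2 * t) - exp(-a10 * t)) * n2_0
    n1 = n1_0 * exp(-a10 * t) + feed
    return n1, n2


# Illustrative parameters: A21*N2(0) = 2 exceeds A10*N1(0) = 1.
params = dict(n1_0=1.0, n2_0=1.0, a10=1.0, a20=0.5, a21=2.0)
for t in (0.0, 0.1, 0.5, 2.0, 10.0):
    n1, n2 = populations(t, **params)
    print(f"t = {t:5.1f}  N1 = {n1:.4f}  N2 = {n2:.4f}")
```

The printed values show N1 rising above its initial value before the eventual decay sets in, which is why a single exponential fit to such data misestimates the lifetime.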

PROBLEMS
1. Obtain the energy levels of the (1s)² ¹S0, (1s) (2p) ¹P1, and (1s) (3d) ¹D2
states of the helium atom in the presence of a magnetic field of strength
1 Wb/m². What are the shifts in Å for the allowed transitions, if λP→S =
584.4 Å, λD→P = 6678 Å for the unperturbed states under consideration?
2. Describe the Zeeman patterns of the ²D3/2 and ²P3/2 states. Calculate the
frequency shifts for the transitions ²D3/2 → ²P3/2 with ∆MJ = 0, ∆MJ = 1,
and ∆MJ = – 1. If the sodium line with λ = 8195 Å corresponds to a
²D3/2 → ²P3/2 transition, what is the maximum shift in its wavelength when
a magnetic field of 1 Wb/m² is introduced?
3. Obtain the shifts in the frequency for ²D3/2 → ²P1/2 transitions in the presence
of a weak magnetic field.
4. What is the magnetic moment of sulphur in the ground state ³P2? In the
presence of a magnetic field of 2 Wb/m², what is the resonance frequency?
5. Indicate the energy levels of ²P and ²D states and the allowed transitions
between them in the presence of a strong magnetic field.
6. For ¹⁴N, magnetic resonance is observed at B = 1 Wb/m² and for a
frequency of ω = 1.933 × 10^7 rad/s. What is the value of the Landé
g-factor for ¹⁴N (I = 1 for ¹⁴N)?
7. Interstellar space is occupied by atomic hydrogen. The I = 1 and I = 0
states are split by hyperfine interaction (the I = 1 state is higher than the
I = 0 state), the energy separation being given by the 21 cm line. If
interstellar space is characterized by a temperature of T = 3 K, what is the
ratio at this temperature of spontaneous transitions to stimulated transitions
between these states?
8. There are three states with energies E1 < E2 < E3, whose populations are in
thermal equilibrium. If an external radiation induces transitions between
the states with energies E1 and E3 till they have equal populations, show
that, in general, there is population inversion between the E1, E2 states or
the E2, E3 states. What is the condition for the exception?
9. A helium-neon gas laser of 2.0 mW power, operates on 220 V-2.0 A power
supply. What is the working efficiency of the laser?
10. HCl molecules are traversed by the Hg 2536.5 Å radiation. If the moment
of inertia of HCl is 2.7 × 10^-47 kg·m², what are the wavelengths of the first
and second rotational Raman lines?
7
Quantum Statistics

Structures of the Chapter


7.1 Distinguishable arrangements
7.2 Statistical distributions
7.3 Applications of Maxwell-Boltzmann distribution
7.4 Applications of Bose-Einstein distribution
7.5 Applications of Fermi-Dirac distribution
7.6 Superconductivity
7.7 Examples
Problems

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 209
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_7
It is clear from earlier discussions that a complete description of even a
one-electron atom is quite complicated. The complexity of the problems increases
rapidly as the number of particles increases, so as to make a detailed solution of
a many particle system almost impossibly difficult to obtain. However, as the
number of particles becomes very large, say of the order of 1023 as in the case of
macroscopic bodies, the very largeness of the degrees of freedom leads to the
result that the average properties (macroscopic and in some cases, microscopic)
of the system correspond to, statistically, the most probable behaviour of the
system. This feature of many-particle systems forms the basis of the quantum
statistical description of their properties.
The statistical energy distributions of a collection of particles are discussed
first and then some important applications based on the most probable
distributions are considered.

7.1 DISTINGUISHABLE ARRANGEMENTS


The basic assumption for obtaining the most probable statistical distribution is
that every physically distinct arrangement of particles in the various available
states is equally likely to occur. This implies that the most probable distribution
is the one which has the largest number of distinguishable arrangements
associated with it. Therefore, the procedure for determining the most probable
statistical distribution involves two steps: (i) obtaining the number of
distinguishable arrangements which give rise to the same distribution, and
(ii) maximizing this number of arrangements with respect to different
distributions. In this section, expressions for the number of distinguishable
arrangements are obtained for a given distribution.
Consider a collection of N particles, which interact weakly with each other
and with the wall. Since these particles interact only weakly, each particle will
have a set of states with well-defined energies available to it. Let their possible
energies be grouped into cells of sizes ∆ε1, ∆ε2..., ∆εi ..., with average energies
ε1, ε2,..., εi, ..., respectively. These particles may belong to one of the following
three classes of particles.
1. There are Q identical but distinguishable particles. These are classical
particles whose trajectories may, in principle, be followed. There are fi
number of states and qi number of particles in the i-th energy cell.
2. There are R identical, indistinguishable bosons with integral spin. The
i-th energy cell has gi number of states and ri number of these bosons
(gi ≠ fi in general).
3. There are S identical, indistinguishable fermions with half-integral spin.
The ith energy cell has hi number of states and si number of these fermions
(hi ≠ fi or gi, in general ). It should be noted that no two of these fermions
can be in the same state, so that hi ≥ si.
Our aim is to determine the number of different, distinguishable
arrangements for each of these distributions. The problem is similar to that of
determining the number of distinguishable ways in which Q identical balls can
be placed in different boxes, so that there are q1 balls in the first box with
f1 shelves, q2 balls in the second box with f2 shelves, etc. (each shelf corresponds
to an energy level). In the case of classical particles, since they are
distinguishable, the balls can be thought of as having different colours. In the
case of bosons and fermions, the balls are identical in every way including their
colour. However, for fermions, there is the additional restriction that at most
one ball can be placed in each shelf.
Identical classical particles: We first determine the number of ways in which
Q particles can be grouped into distinguishable sets of q1, q2, .., qi,..., particles
and then the number of ways in which qi particles can be distributed among
fi states of the i-th cell.
The first particle can be chosen in Q ways, the second in (Q – 1)
ways, etc. However, since the different orders of choosing the same set of
q1 particles lead to the same result, there are
$$P_1(q_1) = \frac{Q\,(Q-1) \cdots (Q - q_1 + 1)}{q_1!} = \frac{Q!}{(Q - q_1)!\; q_1!} \tag{7.1}$$
ways of choosing q1 distinguishable particles from Q classical particles.
Similarly, from the remaining Q – q1 particles, q2 distinguishable particles can
be chosen in
$$P_2(q_2) = \frac{(Q - q_1)!}{(Q - q_1 - q_2)!\; q_2!} \tag{7.2}$$
ways. Proceeding in this way, the total number of ways of choosing
distinguishable sets of q1, q2, ..., qi, ... particles from Q distinguishable particles
is found to be
$$P_1(q_1)\, P_2(q_2) \cdots P_i(q_i) \cdots = \frac{Q!}{q_1!\; q_2! \cdots q_i! \cdots} \tag{7.3}$$
Now each of the qi particles can occupy any one of the fi states so that there
are fi^qi ways of distributing qi distinguishable particles in fi states.
Thus the total number of distinguishable arrangements for the distribution of
q1, q2, ..., qi, ... sets of distinguishable particles in f1, f2, ..., fi, ... states is
$$P(q_i) = Q! \prod_{i=1}^{\infty} \frac{f_i^{\,q_i}}{q_i!} \tag{7.4}$$
Identical bosons: Since the particles are indistinguishable, there is only one
way of grouping R particles into distinguishable sets of r1, r2, ..., ri, ... particles.
Therefore, the total number of distinguishable arrangements is given by just the
product of the number of ways in which ri particles are distributed among gi
states.
For determining the number of ways in which ri particles are distributed
among gi states, the states are regarded as being separated by partitions.
Since no partition is needed at the ends, gi – 1 partitions are needed.
Then the particles and the partitions are arranged in a row, e.g.
××||×|×|××× (7.5)
where each × represents a particle, the vertical line represents a partition, and
the arrangement shown represents 2, 0, 1, 1, 3 particles in five states (four
partitions). The number of such distinguishable arrangements in the i-th cell is
given by the number of different ways of arranging (ri + gi – 1) objects, of which
ri particles and gi – 1 partitions belong to two groups of indistinguishable objects,
and is
$$P_i(r_i) = \frac{(r_i + g_i - 1)!}{r_i!\,(g_i - 1)!} \tag{7.6}$$
Therefore, the total number of distinguishable arrangements for the
distribution of r1, r2, ..., ri, ... sets of bosons in g1, g2, ..., gi, ... states is
$$P(r_i) = \prod_{i=1}^{\infty} \frac{(r_i + g_i - 1)!}{r_i!\,(g_i - 1)!} \tag{7.7}$$

Identical fermions: Here again, there is only one way of grouping S particles
into distinguishable sets of s1, s2, ...si, ... particles. For obtaining the number of
ways of distributing si particles in hi states, it is noted that each state can be
occupied by at most one particle so that the states may be arranged in a row of
hi objects, indicating the occupation of each state, e.g.
00××00 (7.8)
where 0 indicates that the level is unoccupied, × indicates that the level is
occupied by one particle, and the particular arrangement represents 0, 0, 1, 1, 0,
0 particles in the six energy levels. Therefore, the number of such distinguishable
arrangements in the i-th cell is given by the number of different ways of arranging
hi objects of which si and hi – si belong to two groups of indistinguishable objects:
$$P_i(s_i) = \frac{h_i!}{s_i!\,(h_i - s_i)!} \tag{7.9}$$
Hence the total number of distinguishable arrangements for the distribution
of s1, s2, ..., si, ... sets of fermions in h1, h2, ..., hi, ... states is
$$P(s_i) = \prod_{i=1}^{\infty} \frac{h_i!}{s_i!\,(h_i - s_i)!} \tag{7.10}$$
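The three counting formulas, Eqs. (7.4), (7.7) and (7.10), can be written as a short sketch. The function names are illustrative, and each function takes the occupation numbers and the numbers of states per energy cell as equal-length lists.

```python
from math import comb, factorial, prod


def classical_arrangements(q, f):
    """Eq. (7.4): Q! * prod(f_i**q_i / q_i!) for distinguishable particles."""
    multinomial = factorial(sum(q)) // prod(factorial(qi) for qi in q)
    return multinomial * prod(fi**qi for qi, fi in zip(q, f))


def boson_arrangements(r, g):
    """Eq. (7.7): prod of (r_i + g_i - 1)! / (r_i! (g_i - 1)!)."""
    return prod(comb(ri + gi - 1, ri) for ri, gi in zip(r, g))


def fermion_arrangements(s, h):
    """Eq. (7.10): prod of h_i! / (s_i! (h_i - s_i)!); requires s_i <= h_i."""
    return prod(comb(hi, si) for si, hi in zip(s, h))


# One cell with 7 bosons in 5 states, as in the partition picture (7.5).
print(boson_arrangements([7], [5]))
# One cell with 2 fermions in 6 states, as in the occupation picture (7.8).
print(fermion_arrangements([2], [6]))
# Three classical particles: 2 in a cell of 3 states, 1 in a cell of 2 states.
print(classical_arrangements([2, 1], [3, 2]))
```

For the same occupation numbers and numbers of states, the classical count is the largest and the fermion count the smallest, since distinguishability adds arrangements and the exclusion principle removes them.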
7.2 STATISTICAL DISTRIBUTIONS


The number of distinguishable arrangements for a given distribution of a mixture
of the three classes of particles is
$$P(q_i, r_i, s_i) = (Q!) \prod_i \left[ \frac{f_i^{\,q_i}}{q_i!} \right] \left[ \frac{(r_i + g_i - 1)!}{r_i!\,(g_i - 1)!} \right] \left[ \frac{h_i!}{s_i!\,(h_i - s_i)!} \right] \tag{7.11}$$
The number of particles of each class should be conserved. Assuming that
the total energy of the system is E,
$$\sum_i q_i = Q, \qquad \sum_i r_i = R, \qquad \sum_i s_i = S \tag{7.12}$$
$$\sum_i \varepsilon_i\,(q_i + r_i + s_i) = E \tag{7.13}$$

The most probable distribution corresponds to the maximum of P (qi, ri, si),
subject to the conditions (7.12) and (7.13).
In practice, it is more convenient to maximize ln P (qi, ri, si). The calculations
are greatly simplified by using the following approximation (Stirling’s formula):
$$\ln n! = \ln 2 + \ln 3 + \cdots + \ln n = \int_1^{n + 1/2} \ln x\; dx + O(1) \approx n \ln n - n \tag{7.14}$$
where for large n only the first two leading terms have been retained. Then
keeping only the leading terms gives
$$\begin{aligned}
\ln P = {} & Q \ln Q - Q + \sum_i \left( q_i \ln f_i - q_i \ln q_i + q_i \right) \\
& + \sum_i \left[ (r_i + g_i) \ln (r_i + g_i) - (r_i + g_i) - r_i \ln r_i + r_i - g_i \ln g_i + g_i \right] \\
& + \sum_i \left[ h_i \ln h_i - h_i - s_i \ln s_i + s_i - (h_i - s_i) \ln (h_i - s_i) + (h_i - s_i) \right]
\end{aligned} \tag{7.15}$$

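The quality of the Stirling approximation in Eq. (7.14), on which Eq. (7.15) rests, can be checked numerically. The sketch below compares n ln n – n with the exact ln n!, obtained from the log-gamma function; the function names are illustrative.

```python
from math import lgamma, log


def ln_factorial_exact(n):
    """Exact ln(n!) via the log-gamma function, ln Gamma(n + 1)."""
    return lgamma(n + 1)


def ln_factorial_stirling(n):
    """Leading Stirling terms of Eq. (7.14): n ln n - n."""
    return n * log(n) - n


def relative_error(n):
    exact = ln_factorial_exact(n)
    return abs(exact - ln_factorial_stirling(n)) / exact


for n in (10, 100, 1000, 10**6):
    print(f"n = {n:>7}  relative error = {relative_error(n):.2e}")
```

The relative error falls steadily with n, so for occupation numbers typical of macroscopic systems the neglected terms are insignificant.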

At the maximum, this expression has to be stationary for small but arbitrary
changes in qi, ri and si subject to the constraints (7.12) and (7.13). Taking the
differential of ln P, one gets
$$\delta(\ln P) = \sum_{i=1}^{\infty} \delta q_i\, [\ln f_i - \ln q_i] + \sum_{i=1}^{\infty} \delta r_i\, [\ln (r_i + g_i) - \ln r_i] + \sum_{i=1}^{\infty} \delta s_i\, [\ln (h_i - s_i) - \ln s_i] = 0 \tag{7.16}$$

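subject to the conditions
$$\sum_{i=1}^{\infty} \delta q_i = 0, \qquad \sum_{i=1}^{\infty} \delta r_i = 0, \qquad \sum_{i=1}^{\infty} \delta s_i = 0 \tag{7.17}$$
$$\sum_{i=1}^{\infty} \varepsilon_i\, (\delta q_i + \delta r_i + \delta s_i) = 0 \tag{7.18}$$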

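Using the relations in Eq. (7.17) to eliminate δq1, δr1 and δs1 in Eqs. (7.16)
and (7.18) gives
$$\sum_{i=2}^{\infty} \delta q_i \ln \left[ \frac{f_i\, q_1}{f_1\, q_i} \right] + \sum_{i=2}^{\infty} \delta r_i \ln \left[ \frac{(r_i + g_i)\, r_1}{(r_1 + g_1)\, r_i} \right] + \sum_{i=2}^{\infty} \delta s_i \ln \left[ \frac{(h_i - s_i)\, s_1}{(h_1 - s_1)\, s_i} \right] = 0 \tag{7.19}$$
$$\sum_{i=2}^{\infty} (\varepsilon_i - \varepsilon_1)\, (\delta q_i + \delta r_i + \delta s_i) = 0 \tag{7.20}$$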

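From Eq. (7.20),
$$\delta q_2 = - \sum_{i=2}^{\infty} \frac{\varepsilon_i - \varepsilon_1}{\varepsilon_2 - \varepsilon_1}\, (\delta q_i + \delta r_i + \delta s_i) + \delta q_2 \tag{7.21}$$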
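Using this in Eq. (7.19) to eliminate δq2, then regarding δq3, δq4, ..., δr2, δr3,
..., δs2, δs3, ..., etc., as arbitrary variables and equating their coefficients to zero
gives
$$\begin{aligned}
\ln \frac{f_i\, q_1}{f_1\, q_i} - \frac{\varepsilon_i - \varepsilon_1}{\varepsilon_2 - \varepsilon_1} \ln \frac{f_2\, q_1}{f_1\, q_2} &= 0, \qquad i = 3, 4, \ldots, \\
\ln \frac{(r_i + g_i)\, r_1}{(r_1 + g_1)\, r_i} - \frac{\varepsilon_i - \varepsilon_1}{\varepsilon_2 - \varepsilon_1} \ln \frac{f_2\, q_1}{f_1\, q_2} &= 0, \qquad i = 2, 3, \ldots, \\
\ln \frac{(h_i - s_i)\, s_1}{(h_1 - s_1)\, s_i} - \frac{\varepsilon_i - \varepsilon_1}{\varepsilon_2 - \varepsilon_1} \ln \frac{f_2\, q_1}{f_1\, q_2} &= 0, \qquad i = 2, 3, \ldots
\end{aligned} \tag{7.22}$$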
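These relations allow us to solve for the equilibrium distributions
$$q_i = f_i\, \frac{q_1}{f_1}\, \exp [-\beta (\varepsilon_i - \varepsilon_1)] \tag{7.23}$$
$$r_i = \frac{g_i}{\dfrac{r_1 + g_1}{r_1} \exp [\beta (\varepsilon_i - \varepsilon_1)] - 1} \tag{7.24}$$
$$s_i = \frac{h_i}{\dfrac{h_1 - s_1}{s_1} \exp [\beta (\varepsilon_i - \varepsilon_1)] + 1} \tag{7.25}$$
where
$$\beta = \frac{1}{\varepsilon_2 - \varepsilon_1} \ln \frac{f_2\, q_1}{f_1\, q_2} \tag{7.26}$$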
These equations are identities for q1, q2, r1 and s1, so that they are valid for
all i. It is often the case that the number of bosons is not restricted. In this case,
δr1 is also an independent variable. Following the same steps as before gives,
instead of Eq. (7.24),
$$r_i = \frac{g_i}{\exp (\beta \varepsilon_i) - 1}, \qquad \sum_i r_i = \text{unrestricted} \tag{7.27}$$
In order to determine fi, gi and hi for the translational levels, it is assumed
that the system is in a cubic box of side l (the results are valid for other
shapes as well, e.g. a rectangular shape), for which the energy levels are [see Eq.
(3.173)]
$$E = \frac{\hbar^2 \pi^2}{2 m l^2}\, (n_x^2 + n_y^2 + n_z^2), \qquad n_x = 1, 2, \text{ etc.} \tag{7.28}$$
Since every set of positive, nonzero integers (nx, ny, nz) is associated with a
state, the number of states in the absence of internal degrees of freedom is
approximately equal to the volume in the first octant of the n-space. Therefore,
the expression for fi is
$$f_i = \frac{V}{4 \pi^2 \hbar^3}\, (2m)^{3/2}\, \varepsilon_i^{1/2}\, \Delta \varepsilon_i \tag{7.29}$$
and similar expressions for gi and hi. From this, the total number Q of the
distinguishable particles and their total energy can be obtained as
$$Q = \sum_i q_i = V \left( \frac{m}{2 \pi \hbar^2 \beta} \right)^{3/2} \frac{q_1}{f_1}\, \exp (\beta \varepsilon_1) \tag{7.30}$$
$$E = \sum_i q_i \varepsilon_i = V \left( \frac{m}{2 \pi \hbar^2 \beta} \right)^{3/2} \frac{q_1}{f_1} \left( \frac{3}{2 \beta} \right) \exp (\beta \varepsilon_1) \tag{7.31}$$
It then follows that the average energy is
$$\bar{\varepsilon} = \frac{3}{2 \beta} \tag{7.32}$$
Since $\bar{\varepsilon}$ is equal to (3/2) kT for the classical particles, where T is the absolute
temperature, it follows that
$$\beta = \frac{1}{kT} \tag{7.33}$$
In terms of T, the various distributions can be written as
$$q_i = f_i \exp (-\alpha_1 - \varepsilon_i / kT) \qquad \text{Maxwell-Boltzmann,} \tag{7.34}$$
$$r_i = \frac{g_i}{\exp (\alpha_2 + \varepsilon_i / kT) - 1} \qquad \text{Bose-Einstein,} \tag{7.35}$$
$$s_i = \frac{h_i}{\exp (\alpha_3 + \varepsilon_i / kT) + 1} \qquad \text{Fermi-Dirac} \tag{7.36}$$
These are known as the Maxwell-Boltzmann (for distinguishable particles),
Bose-Einstein (for bosons), and Fermi-Dirac (for fermions) distributions,
respectively. The constants α1, α2 and α3 are determined from the conditions in
Eq. (7.12) for the number of particles.
The following properties of the distributions may be noted:
1. For large εi/kT when exp (α2,3 + εi/kT) >> 1, all the three distributions
have the same form, i.e. that of the Maxwell-Boltzmann distribution.
2. For distinguishable particles, the quantity qi/fi, called the particle index,
satisfies the relation
$$\frac{q_i / f_i}{q_j / f_j} = \exp [-(\varepsilon_i - \varepsilon_j)/kT] \tag{7.37}$$
which implies that there is always a greater tendency for the particle to
occupy a lower energy state than a higher energy state. This tendency is
observed for bosons and fermions as well [see Eqs. (7.35) and (7.36)].
Equation (7.37) is valid for bosons and fermions also, if the 1 in the
denominator of Eqs. (7.35) and (7.36) can be neglected.
3. It is noted that when the total number of bosons is unrestricted, α2 = 0
and the distribution of the bosons is given by Eq. (7.27). In this case, it is
observed that the bosons have a tendency to bunch together at low energies
[see Fig. 7.1 (a)]. Also, the number of bosons increases as T increases, at
all energies.
4. For the Fermi-Dirac distribution, si/hi is the probability for a state to be
occupied and is seen to be less than one for all εi as is required by the
Pauli exclusion principle. In general, α3 is negative and it is convenient
to write
$$s_i = \frac{h_i}{\exp [(\varepsilon_i - \varepsilon_f)/kT] + 1} \tag{7.38}$$
For T → 0, si/hi = 1 for εi < εf and si/hi = 0 for εi > εf. This means that
fermions occupy the lowest energy states available, subject to the
exclusion principle. For finite but small T, si/hi ≈ 1 for (εi – εf)/kT << – 1,
and si/hi ≈ 0 for (εi – εf)/kT >> 1. The quantity εf is called the Fermi energy
(which depends on T), and it plays an important role in the behaviour of
fermions. The distribution is illustrated in Fig. 7.1 (b).
5. In principle, every system of particles which interact weakly with each
other, is described by either the Bose-Einstein or the Fermi-Dirac
distribution. However, if the particles are localized (at the lattice points
for example) and their wave functions do not overlap, they can be taken
as being distinguishable (distinguished by the region of localization). In
such cases, Maxwell-Boltzmann distribution can be applied to describe
the system.
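The properties listed above can be explored with a short sketch of the three particle indices qi/fi, ri/gi and si/hi from Eqs. (7.34), (7.35) and (7.38). The function names are illustrative; energies are in eV, temperatures in K and must be positive, and the Fermi energy of 3.7 eV is the value quoted for Fig. 7.1 (b).

```python
from math import exp, expm1

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def mb_index(eps, alpha, temperature):
    """q_i/f_i of Eq. (7.34)."""
    return exp(-alpha - eps / (K_B_EV * temperature))


def be_index(eps, alpha, temperature):
    """r_i/g_i of Eq. (7.35); needs alpha + eps/kT > 0."""
    return 1.0 / expm1(alpha + eps / (K_B_EV * temperature))


def fd_index(eps, eps_f, temperature):
    """s_i/h_i of Eq. (7.38), written in terms of the Fermi energy."""
    return 1.0 / (exp((eps - eps_f) / (K_B_EV * temperature)) + 1.0)


for eps in (1.0, 3.5, 3.7, 3.9, 5.0):
    print(f"eps = {eps:.1f} eV  s/h = {fd_index(eps, 3.7, 2000.0):.4f}")

# Property 1: for eps >> kT the Bose-Einstein index approaches the Maxwell-Boltzmann form.
print(be_index(2.0, 0.0, 1000.0) / mb_index(2.0, 0.0, 1000.0))
```

The Fermi-Dirac index passes through exactly 1/2 at εi = εf and changes from nearly 1 to nearly 0 over an energy range of a few kT around it.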
In what follows, some important physical properties of different systems
are deduced using the statistical distributions given in Sec. 7.2.

[Figure 7.1: (a) ri/gi plotted against ε (eV) from 0 to 5 eV for T = 2000 K (curve 1) and T = 1000 K (curve 2); (b) si/hi plotted against ε (eV) from 0 to 5 eV for T = 0 K (curve 1) and T = 2000 K (curve 2).]

Fig. 7.1 (a) Particle index for bosons with α2= 0, (b) particle
index for fermions with εf = 3.7 eV.

7.3 APPLICATIONS OF MAXWELL-BOLTZMANN DISTRIBUTION
In this section, the specific heats of gases and solids are discussed in terms of
the Maxwell-Boltzmann distribution for distinguishable particles.

Specific Heats of Gases


As mentioned in Sec. 5.7, the spacing of the electronic energy levels of molecules
is of the order of 5 eV, whereas that of the vibrational levels is of the order of
1 eV or less, and that of the rotational levels is of the order of $10^{-3}$ eV.
Hence it follows from Eq. (7.37) that at ordinary temperatures (kT ≈ 0.026 eV for
T = 300 K), while many rotational levels are excited, almost all the particles will
be in the lowest electronic state. Therefore, the energy of each particle may be
written as
$$E = \frac{1}{2m}p^2 + \left(n + \frac{1}{2}\right)\hbar\omega + \frac{\hbar^2}{2I}J(J+1) \tag{7.39}$$
and the average energy as
$$\bar{E} = \frac{\sum_i \exp(-E_i/kT)\, f_i E_i}{\sum_i \exp(-E_i/kT)\, f_i} \tag{7.40}$$

Since the total energy is a sum of the energies of different modes of
excitation, it can be shown that
$$\bar{E} = \bar{E}_{tr} + \bar{E}_{vib} + \bar{E}_{rot} \tag{7.41}$$
where the average of each term is only over the states corresponding to that
mode of excitation.
It was noted in Eq. (7.32) that $\bar{E}_{tr}$ is $\frac{3}{2}kT$. The average
vibrational energy is
$$\bar{E}_{vib} = \frac{\sum_{n=0}^{\infty} \exp[-(n+1/2)\hbar\omega/kT]\,(n+1/2)\hbar\omega}{\sum_{n=0}^{\infty} \exp[-(n+1/2)\hbar\omega/kT]} \tag{7.42}$$
This expression can be evaluated by using Eq. (2.11) and gives
$$\bar{E}_{vib} = \frac{1}{2}\hbar\omega + \frac{\hbar\omega}{\exp(\hbar\omega/kT) - 1} \tag{7.43}$$
where the first term is called the zero-point energy. The average rotational energy
is
$$\bar{E}_{rot} = \frac{\sum_{J=0}^{\infty} \exp[-aJ(J+1)/kT]\; aJ(J+1)\,(2J+1)}{\sum_{J=0}^{\infty} \exp[-aJ(J+1)/kT]\,(2J+1)} \tag{7.44}$$
where $a = \hbar^2/2I$ and (2J + 1) is the degeneracy of the rotational levels.


For applying the above results to symmetric diatomic molecules, some
changes have to be made in the expression for $\bar{E}_{rot}$. For example, only
even-J states are allowed by Pauli’s exclusion principle for the para-hydrogen
molecules. In this case, a closed expression is obtained for $\bar{E}_{rot}$ by
separating out the J = 0 and J = 2 terms and converting the remaining sum into an
integral. This is done by first replacing J by 2l so that the summation is over
l = 0, 1, 2, ... and then substituting x = l (2l + 1), for which ∆x = (4l + 1) ∆l.
This leads to
$$\bar{E}_{rot} = -\frac{1}{F}\,\frac{\partial F}{\partial (1/kT)} \tag{7.45}$$
$$F \approx 1 + 5\exp(-6a/kT) + \int_6^{\infty} \exp(-2ax/kT)\,dx \tag{7.46}$$
where F is the denominator in Eq. (7.44). The lower limit corresponds to l = 3/2.
Carrying out the integration, we obtain for para-hydrogen,
$$\bar{E} = \frac{3}{2}kT + \frac{1}{2}\hbar\omega + \frac{\hbar\omega}{\exp(\hbar\omega/kT) - 1} - \frac{1}{F}\,\frac{\partial F}{\partial (1/kT)},$$
$$F \approx 1 + 5\exp(-6a/kT) + \frac{kT}{2a}\exp(-12a/kT) \tag{7.47}$$
The values of ħω and a are obtained from the spectrum of the hydrogen
molecule, and have the values
$$\hbar\omega \approx 0.5454\ \text{eV}, \qquad a \approx 0.00755\ \text{eV} \tag{7.48}$$
The specific heat of para-hydrogen obtained from Eq. (7.47),
$$C_v = N_{Avo}\,\frac{\partial \bar{E}}{\partial T} \tag{7.49}$$
is plotted in Fig. 7.2 and is in very good agreement with the experimental
observations. It may be observed that the contribution to Cv from the rotational
energy becomes appreciable at T ≳ 75 K, which corresponds to kT ≳ a, while the
contribution from the vibrational energy corresponds to kT ≳ ħω. Ordinary hydrogen
is a mixture of ortho- and para-hydrogen, there being about 25% para-hydrogen at
room temperature (the nuclear spin is I = 1/2 for the hydrogen atom). The specific
heat of the mixture is a statistical average of the specific heats of the
components. Its behaviour is similar to that given in Fig. 7.2 except that the hump
around T ≈ 150 K is now absent.
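The behaviour described above can be checked numerically. The sketch below is one possible way to do it and is not the closed form of Eq. (7.47): it sums the rotational series of Eq. (7.44) exactly over even J instead of using the integral approximation of Eq. (7.46), and it obtains Cv from a central finite difference of the mean energy. The constant k ≈ 8.617 × 10^-5 eV/K, the cutoff `j_max` and the step size are assumptions made for the illustration.

```python
import math

K_B_EV = 8.617e-5     # Boltzmann constant in eV/K (assumed value)
HBAR_OMEGA = 0.5454   # vibrational quantum in eV, Eq. (7.48)
A_ROT = 0.00755       # rotational constant a in eV, Eq. (7.48)

def rot_energy(temp_k, j_max=60):
    """Mean rotational energy (eV), Eq. (7.44) restricted to even J."""
    kt = K_B_EV * temp_k
    num = den = 0.0
    for j in range(0, j_max + 1, 2):
        e_j = A_ROT * j * (j + 1)
        weight = (2 * j + 1) * math.exp(-e_j / kt)
        num += weight * e_j
        den += weight
    return num / den

def vib_energy(temp_k):
    """Mean vibrational energy (eV), Eq. (7.43)."""
    x = HBAR_OMEGA / (K_B_EV * temp_k)
    if x > 700.0:  # thermal part is negligible; avoid overflow
        return 0.5 * HBAR_OMEGA
    return 0.5 * HBAR_OMEGA + HBAR_OMEGA / math.expm1(x)

def mean_energy(temp_k):
    """Translational + vibrational + rotational energy per molecule (eV)."""
    return 1.5 * K_B_EV * temp_k + vib_energy(temp_k) + rot_energy(temp_k)

def cv_over_r(temp_k, rel_step=1e-4):
    """Cv/R from a central difference of the mean energy, Eq. (7.49)."""
    dt = temp_k * rel_step
    return (mean_energy(temp_k + dt) - mean_energy(temp_k - dt)) / (2 * dt * K_B_EV)

if __name__ == "__main__":
    for t in (20, 75, 170, 300, 1000, 3000):
        print(t, round(cv_over_r(t), 3))
```

At 20 K only the translational value 3/2 survives, near 170 K the rotational contribution overshoots its classical value of 1 (the hump), and at 1000 K the vibrational contribution is just starting to appear.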

Fig. 7.2 The specific heat Cv/R of para-hydrogen (solid line) and ordinary
hydrogen (dashed line) at constant volume, as a function of absolute temperature
T (K).

Specific Heat of Solids


The specific heat of solids provides an important application of Maxwell-
Boltzmann distribution.
It is assumed that the atoms of a solid are localized and perform simple
harmonic motion about the equilibrium position. In the classical analysis, the
number of states is taken to be proportional to $d^3p\,d^3r$, so that the average
energy for the Maxwell-Boltzmann distribution is given by
$$\bar{E}_{cl} = \frac{\int \exp\left[-\left(\frac{1}{2m}p^2 + \frac{1}{2}br^2\right)\Big/kT\right]\left(\frac{1}{2m}p^2 + \frac{1}{2}br^2\right) d^3p\, d^3r}{\int \exp\left[-\left(\frac{1}{2m}p^2 + \frac{1}{2}br^2\right)\Big/kT\right] d^3p\, d^3r} \tag{7.50}$$
The expression can be evaluated by using the result
$$\int_0^{\infty} \exp(-ax^2)\,x^{2n}\,dx = (-1)^n \frac{d^n}{da^n}\int_0^{\infty} \exp(-ax^2)\,dx = (-1)^n \frac{d^n}{da^n}\left(\frac{\pi^{1/2}}{2a^{1/2}}\right) \tag{7.51}$$
and comes out to be
$$\bar{E}_{cl} = 3kT \tag{7.52}$$
From this it follows that the specific heat is
$$C_{v,cl} = 3R \tag{7.53}$$
At about room temperatures and above, this result is in agreement with the
experimental observations (law of Dulong and Petit). However, the experimental
measurements (note that experimentally Cp is measured, though the difference
Cp – Cv is quite small for solids) show that, at low temperatures the specific heat
rapidly decreases as T decreases and goes to zero as T approaches 0 K.
Einstein (1907) was the first person to appreciate that the low-temperature
behaviour of the specific heat of solids is essentially governed by quantum
properties of the system. He suggested that the energies of the oscillating atoms
do not form a continuum. Instead, their allowed energies for oscillation in each
direction, are
$$\varepsilon_n = nh\nu, \qquad n = 0, 1, 2, \ldots \tag{7.54}$$
where $\nu = \frac{1}{2\pi}(b/m)^{1/2}$ is the natural frequency of the oscillator.
Therefore, the average energy of the atom for oscillation in each direction is
$$\bar{\varepsilon} = \frac{\sum_{n=0}^{\infty} nh\nu\, e^{-nh\nu/kT}}{\sum_{n=0}^{\infty} e^{-nh\nu/kT}} = \frac{h\nu}{e^{h\nu/kT} - 1} \tag{7.55}$$
which is the same expression encountered in Eq. (2.11) for Planck’s oscillator.
The specific heat of the system, including oscillations in all the three
directions, is
$$C_v = 3N\,\frac{\partial \bar{\varepsilon}}{\partial T} = 3R\left(\frac{h\nu}{kT}\right)^2 \frac{e^{h\nu/kT}}{(e^{h\nu/kT} - 1)^2} \tag{7.56}$$
For large T, this expression reduces to the classical expression of 3R but at
low temperatures it decreases rapidly and goes to zero as
$T^{-2}\exp(-h\nu/kT)$ for T → 0. Overall the expression describes the qualitative
behaviour of specific heat quite well. However, experiments show that Cv goes to
zero more gently, as $T^3$ near 0 K, and not as an exponential function. Still, the
result clearly indicates that quantum oscillations govern the low temperature
behaviour of the specific heat of solids.
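The two limits of Eq. (7.56) are easy to verify numerically. In the sketch below the combination hν/k is written as a single parameter `theta_e` (an "Einstein temperature"); that name, and the value R ≈ 8.314 J/mol/K, are assumptions introduced for the illustration.

```python
import math

R_GAS = 8.314  # gas constant in J/(mol K) (assumed value)

def einstein_cv(temp_k, theta_e):
    """Molar specific heat of Eq. (7.56), with theta_e = h*nu/k in kelvin."""
    x = theta_e / temp_k
    if x > 500.0:  # exponentially small; avoid overflow in exp
        return 0.0
    ex = math.exp(x)
    return 3.0 * R_GAS * x * x * ex / (ex - 1.0) ** 2

if __name__ == "__main__":
    theta = 300.0  # hypothetical value, for illustration only
    for t in (10.0, 50.0, 100.0, 300.0, 3000.0):
        print(t, einstein_cv(t, theta) / (3.0 * R_GAS))
```

For T much larger than `theta_e` the ratio Cv/3R approaches 1, while for T much smaller it is suppressed by the factor exp(-hν/kT), which is the exponential fall-off that disagrees with the observed T^3 law.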
An improved description of the specific heat of solids was given by Debye
(1912) who observed that the motion of neighbouring atoms is correlated, and
that the allowed frequencies of oscillation correspond to those of allowed
standing elastic waves in the medium. The number of the allowed modes for
the standing waves was calculated in Sec. 2.1 [see Eq. (2.8)], and is given by
$$dN_t(\nu) = \frac{8\pi V \nu^2\, d\nu}{v_t^3} \tag{7.57}$$
for the transverse modes (which correspond to the oscillations of atoms
perpendicular to the direction of propagation of the waves; there are two
independent directions of transverse oscillations), where $v_t$ is the velocity of
propagation for the transverse modes, and by
$$dN_l(\nu) = \frac{4\pi V \nu^2\, d\nu}{v_l^3} \tag{7.58}$$
for the longitudinal mode (which corresponds to the oscillation of atoms parallel
to the direction of propagation of the waves), where $v_l$ is the velocity of
propagation for the longitudinal modes, V being the volume. However, since
the medium of propagation consists of discrete atoms, Debye assumed that the
total number of frequency modes is equal to the total number of degrees of
freedom, i.e. $3N_0$, $N_0$ being the number of atoms. This imposes an upper limit
$\nu_m$ on the allowed frequencies,
$$3N_0 = 4\pi V\left(\frac{2}{v_t^3} + \frac{1}{v_l^3}\right)\int_0^{\nu_m} \nu^2\, d\nu = \frac{4\pi V}{3}\left(\frac{2}{v_t^3} + \frac{1}{v_l^3}\right)\nu_m^3 \tag{7.59}$$
Since each mode is associated with an average energy given by Eq. (7.55),
the total thermal energy is
$$E = 4\pi V\left(\frac{2}{v_t^3} + \frac{1}{v_l^3}\right)\int_0^{\nu_m} \frac{h\nu}{e^{h\nu/kT} - 1}\,\nu^2\, d\nu \tag{7.60}$$
which in terms of Eq. (7.59) can be written as
$$E = \frac{9N_0}{\nu_m^3}\int_0^{\nu_m} \frac{h\nu}{e^{h\nu/kT} - 1}\,\nu^2\, d\nu \tag{7.61}$$
Defining x = hν/kT and θ = hν_m/k, where θ is called the Debye temperature,
the energy is
$$E = 9N_0 kT\left(\frac{T}{\theta}\right)^3\int_0^{\theta/T} \frac{x^3}{e^x - 1}\,dx \tag{7.62}$$

The molar specific heat Cv is given by ∂E/∂T for N0 = NAvo.
In the limit T → ∞, θ/T → 0, the integral is $\frac{1}{3}(\theta/T)^3$, so that
$$C_v = 3R, \qquad \text{for } \frac{\theta}{T} \ll 1 \tag{7.63}$$
which is the classical limit [Eq. (7.53)]. For T → 0, θ/T → ∞, one can use
$$\int_0^{\infty} \frac{x^3}{e^x - 1}\,dx = \frac{\pi^4}{15} \tag{7.64}$$
to get
$$E = \frac{3}{5}\pi^4 N_0 kT\,(T/\theta)^3, \qquad \frac{\theta}{T} \gg 1 \tag{7.65}$$
and
$$C_v = \frac{12}{5}\pi^4 R\,(T/\theta)^3, \qquad \frac{\theta}{T} \gg 1 \tag{7.66}$$
The model predicts that the specific heat at low temperatures is proportional
to $T^3$, in agreement with the experimental observation.
The behaviour of Cv at other temperatures has to be evaluated numerically
from the expression
$$C_v = 9R\left(\frac{T}{\theta}\right)^3\int_0^{\theta/T} \frac{x^4 e^x\, dx}{(e^x - 1)^2} \tag{7.67}$$
obtained from Eq. (7.61), and gives a universal curve as a function of θ/T
(Fig. 7.3). The general agreement between theory and experiments is quite
good, θ being about 100 K for lead, 160 K for sodium, 220 K for silver, 340 K
for copper, 400 K for aluminium, 640 K for silicon, and about 1860 K for
carbon (diamond). Some of the observed differences at intermediate temperatures
can be explained by taking a more realistic spectrum for the allowed frequencies.
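The numerical evaluation of Eq. (7.67) mentioned above can be sketched as follows. The choice of composite Simpson's rule, the number of intervals `n` and the value R ≈ 8.314 J/mol/K are assumptions of this illustration, not part of the text.

```python
import math

R_GAS = 8.314  # gas constant in J/(mol K) (assumed value)

def _integrand(x):
    """x^4 e^x / (e^x - 1)^2, with the x -> 0 and large-x limits handled."""
    if x == 0.0 or x > 500.0:
        return 0.0
    return x ** 4 * math.exp(x) / math.expm1(x) ** 2

def debye_cv(temp_k, theta_d, n=2000):
    """Molar Debye specific heat, Eq. (7.67), by composite Simpson's rule."""
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    upper = theta_d / temp_k
    h = upper / n
    total = _integrand(0.0) + _integrand(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * _integrand(i * h)
    integral = total * h / 3.0
    return 9.0 * R_GAS * (temp_k / theta_d) ** 3 * integral

if __name__ == "__main__":
    theta_cu = 340.0  # Debye temperature quoted above for copper
    for t in (5.0, 20.0, 85.0, 340.0, 3400.0):
        print(t, debye_cv(t, theta_cu) / (3.0 * R_GAS))
```

At high temperature the result tends to 3R as in Eq. (7.63), and at low temperature it reproduces the T^3 law of Eq. (7.66).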

7.4. APPLICATIONS OF BOSE-EINSTEIN DISTRIBUTION


Bose-Einstein distribution describes the properties of bosons which may be
massless, such as photons, or massive, e.g. 4He. Some of their statistical
properties are considered here.

Photon Gas
In Sec. 2.1, Planck’s theory of blackbody radiation in terms of the allowed
standing waves and the associated harmonic oscillators was discussed. A more
modern and satisfactory description is in terms of the energy distribution of the
photons regarded as massless bosons.
Since the number of photons is unrestricted, their distribution is given by
Eq. (7.27),
$$r_i = \frac{g_i}{\exp(\beta\varepsilon_i) - 1} \tag{7.68}$$
where εi = hν. The number of energy levels is the same as the number of allowed
standing waves given by Eq. (2.8), except that the standing waves in Eq. (2.5)
are to be interpreted as the energy eigenstates of photons with energy eigenvalue
hν. Therefore, gi is
$$g_i = \frac{8\pi V}{c^3}\,\nu^2\, d\nu \tag{7.69}$$
V being the volume. The energy density per unit volume is
$$U(\nu)\, d\nu = \left(\frac{8\pi h}{c^3}\right)\frac{\nu^3\, d\nu}{e^{h\nu/kT} - 1} \tag{7.70}$$
which agrees with Planck’s expression in Eq. (2.12).
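As a consistency check on Eq. (7.70), the spectrum can be integrated over all frequencies and compared with the closed form that follows from the integral in Eq. (7.64), namely (8π^5 k^4 / 15 h^3 c^3) T^4. The sketch below assumes rounded SI values of h, k and c and uses Simpson's rule with a cutoff at hν/kT = 60; these choices are illustrative.

```python
import math

H_PLANCK = 6.626e-34  # J s (assumed value)
K_B = 1.381e-23       # J/K (assumed value)
C_LIGHT = 2.998e8     # m/s (assumed value)

def planck_u(nu, temp_k):
    """Energy density per unit frequency, Eq. (7.70), in J s / m^3."""
    if nu == 0.0:
        return 0.0
    x = H_PLANCK * nu / (K_B * temp_k)
    if x > 700.0:
        return 0.0
    return 8.0 * math.pi * H_PLANCK * nu ** 3 / (C_LIGHT ** 3 * math.expm1(x))

def total_energy_density(temp_k, n=4000, x_max=60.0):
    """Integrate Eq. (7.70) over frequency with composite Simpson's rule."""
    nu_max = x_max * K_B * temp_k / H_PLANCK
    h = nu_max / n
    total = planck_u(0.0, temp_k) + planck_u(nu_max, temp_k)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * planck_u(i * h, temp_k)
    return total * h / 3.0

def closed_form(temp_k):
    """(8 pi^5 k^4 / 15 h^3 c^3) T^4, using the integral of Eq. (7.64)."""
    return (8.0 * math.pi ** 5 * K_B ** 4
            / (15.0 * H_PLANCK ** 3 * C_LIGHT ** 3)) * temp_k ** 4

if __name__ == "__main__":
    print(total_energy_density(300.0), closed_form(300.0))
```

The two numbers agree closely, and doubling T multiplies the total energy density by 16.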
Fig. 7.3 The Debye specific heat Cv as a function of T/θ, θ being the Debye
temperature.

Phonon Gas
As in the case of electromagnetic waves and the photons, the elastic waves in a
solid have a quantum manifestation. The energy of these waves is in the form of
quanta called phonons, each of which carries a quantum of energy hν where ν is
one of the allowed frequencies. These phonons are bosons, they interact with
the atoms, they are absorbed and emitted, and their total energy is the thermal
energy of the solid.
The number of phonons is unrestricted, so that their frequency distribution
is given by Eq. (7.27),
$$r_i = \frac{g_i}{e^{\beta h\nu} - 1} \tag{7.71}$$
Phonons are transverse or longitudinal and the number of energy levels is
given by Eqs. (7.57) and (7.58), with the upper limit $\nu_m$ for the frequency
given by Eq. (7.59). Therefore, the total energy of the phonon gas is
$$E = 4\pi V\left(\frac{2}{v_t^3} + \frac{1}{v_l^3}\right)\int_0^{\nu_m} \frac{h\nu}{e^{h\nu/kT} - 1}\,\nu^2\, d\nu \tag{7.72}$$
which is the same as the relation in Eq. (7.60).

Bose-Einstein Condensation
A gas with a given number of bosons whose mass is nonzero, shows remarkable
quantum mechanical properties at low temperatures. In particular, it undergoes
a phase transition, known as Bose-Einstein condensation which is of interest
for two reasons. Firstly, it is an example which allows an exact mathematical
treatment. Secondly, the observed changes in the properties of 4He at T = 2.17 K
can be explained in terms of Bose-Einstein condensation.
The distribution of a Bose-Einstein gas is given by Eq. (7.35) as
$$r_i = \frac{g_i}{\exp[(\varepsilon_i + \alpha_2)/kT] - 1} \tag{7.73}$$
where the constant α2 is determined from the condition that the total number of
particles is N (here α2 is different from that in Eq. (7.35)):
$$N = \sum_i \left(\frac{g_i}{\exp[(\varepsilon_i + \alpha_2)/kT] - 1}\right) \tag{7.74}$$
The energies may be measured from the ground state energy which can be
taken to be zero. This implies that since ri is nonnegative, α2 ≥ 0.
The number of energy levels is obtained from Eq. (3.173) by taking
lx = ly = lz. It is given by the volume element in the first octant of the
n-space (nx = 1, 2, ..., ny = 1, 2, ..., nz = 1, 2, ...),
$$g_i = dn_x\, dn_y\, dn_z = \frac{1}{2}\pi n^2\, dn = \frac{2\pi V (2m)^{3/2}}{h^3}\,\varepsilon^{1/2}\, d\varepsilon \tag{7.75}$$
where V is the volume. Therefore, using Eq. (7.74),
$$\frac{N}{V} = \frac{2\pi (2m)^{3/2}}{h^3}\int_0^{\infty} \frac{\varepsilon^{1/2}\, d\varepsilon}{\exp[(\varepsilon + \alpha_2)/kT] - 1} \tag{7.76}$$

where the left-hand side is independent of temperature. It then follows that as
T decreases, so does α2, and the smallest value of T allowed by Eq. (7.76) is the
one for which α2 = 0 (it should be noted that α2 ≥ 0). This minimum value of T,
called Tc, is given by
$$\frac{N}{V} = \frac{2\pi (2m)^{3/2}}{h^3}\int_0^{\infty} \frac{\varepsilon^{1/2}\, d\varepsilon}{\exp(\varepsilon/kT_c) - 1} \tag{7.77}$$
Writing x = ε/kTc, this relation reduces to
$$\frac{N}{V} = \frac{2}{\pi^{1/2}}\left(\frac{2\pi mkT_c}{h^2}\right)^{3/2}\int_0^{\infty} \frac{x^{1/2}\, dx}{e^x - 1} = 2.612\left(\frac{2\pi mkT_c}{h^2}\right)^{3/2} \tag{7.78}$$
For T < Tc, Eq. (7.76) cannot be satisfied for α2 ≥ 0. The reason for this
difficulty is that the continuum expression in Eq. (7.75) for gi is valid provided
the population of no single level is significant. Now when T is sufficiently low,
the particles will tend to occupy the ground state with ε = 0 which is not taken
into account by the expression for gi in Eq. (7.75) (gi = 0 for ε = 0). This
difficulty can be overcome by taking the ground state into account separately and
using the continuum expression for gi in Eq. (7.75) for the states with ε > 0.
This leads to the more general expression:
$$N = \frac{1}{\exp(\alpha_2/kT) - 1} + \frac{2\pi V (2m)^{3/2}}{h^3}\int_{\delta}^{\infty} \frac{\varepsilon^{1/2}\, d\varepsilon}{\exp[(\varepsilon + \alpha_2)/kT] - 1} \tag{7.79}$$
where δ is a small positive quantity.
For T > Tc the first term is small, e.g. at high temperatures one has the
Maxwell-Boltzmann distribution for which [using Eq. (7.76) without the unit term
in the denominator]
$$\exp(\alpha_2/kT) = \left(\frac{2\pi mkT}{h^2}\right)^{3/2}\frac{V}{N} \gg 1$$
For T < Tc, α2 is small but nonzero, and the nonsingular integral in Eq. (7.79)
can be evaluated at α2 = 0. Using the variable x = ε/kT, Eqs. (7.79) and (7.78)
give
$$N = N_0 + N\,(T/T_c)^{3/2}, \qquad T \le T_c \tag{7.80}$$
where N0 is the number of particles in the ground state. The fraction of particles
in the ground state is
$$\frac{N_0}{N} = 1 - (T/T_c)^{3/2}, \qquad T < T_c \tag{7.81}$$
and is shown in Fig. 7.4 (a). For T < Tc, a significant fraction of particles is in
the ground state, and this occupation of the zero energy and zero momentum
ground state is called Bose-Einstein condensation. The temperature Tc below
which the condensation takes place is called the condensation temperature.
The particles in the ground state have zero energy and momentum, and
hence do not contribute to the viscosity of the fluid. (Viscosity arises from the
interaction between particles—viscous flow is accompanied by the excitation
of vortices whose quantum is called a roton. The roton has a finite energy and
hence cannot easily be excited at low temperatures.) These particles, being in
the ground state, do not contribute to the total energy which therefore is obtained
from the second term in Eq. (7.79) with α2 = 0, as
$$E = \frac{2\pi V (2m)^{3/2}}{h^3}\int_{\delta}^{\infty} \frac{\varepsilon^{3/2}\, d\varepsilon}{e^{\varepsilon/kT} - 1} \tag{7.82}$$
Using x = ε/kT, this expression comes out to be
$$E = 0.77\, Nk\,\frac{T^{5/2}}{T_c^{3/2}}, \qquad T < T_c \tag{7.83}$$
Fig. 7.4 (a) Fraction of particles in the ground state, N0/N, as a function of
T/Tc; (b) specific heat Cv/R as a function of T/Tc. The solid line is for
Bose-Einstein condensation and the dashed line is the experimental
curve with Tc = 2.17 K.

Therefore, the specific heat is given by
$$C_v = 1.93\, R\left(\frac{T}{T_c}\right)^{3/2}, \qquad T < T_c \tag{7.84}$$
which at T = Tc is greater than the classical value of 3/2 R. Detailed calculations
show that Cv is continuous at T = Tc but has a kink there, i.e. its derivative is
discontinuous at T = Tc [see Fig. 7.4 (b)].
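Equations (7.81) and (7.84) are simple enough to tabulate directly. The following is a minimal sketch; the function names are illustrative, and the specific-heat function is deliberately restricted to T ≤ Tc because the expression above Tc is not given in the text.

```python
def condensate_fraction(temp, t_c):
    """Ground-state fraction N0/N of Eq. (7.81); zero at and above Tc."""
    if temp >= t_c:
        return 0.0
    return 1.0 - (temp / t_c) ** 1.5

def cv_over_r_below_tc(temp, t_c):
    """Cv/R of Eq. (7.84); only defined here for T <= Tc."""
    if temp > t_c:
        raise ValueError("Eq. (7.84) applies only for T <= Tc")
    return 1.93 * (temp / t_c) ** 1.5

if __name__ == "__main__":
    t_c = 1.0  # temperatures in units of Tc
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(t, condensate_fraction(t, t_c), cv_over_r_below_tc(t, t_c))
```

At T = Tc/2 about 65% of the particles are already in the ground state, and at T = Tc the specific heat reaches 1.93 R, above the classical value of 1.5 R.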
Some examples of transitions which are possible manifestations of Bose-
Einstein condensation are discussed here.
Liquid 4He: The phase transition of liquid 4He at T = 2.17 K provides an
interesting illustration of Bose-Einstein condensation. The properties of liquid
helium (4He liquefies at 4.2 K under normal conditions) show dramatic changes
at 2.17 K. Above 2.17 K, it behaves like a normal liquid and is known as helium I.
Below this temperature, it acquires some unusual properties, e.g. it flows through
capillaries without any apparent viscosity. This form is known as helium II and
many of its properties can be described by regarding it as a mixture of two
fluids, one a normal fluid and the other a superfluid which has no viscosity.
This mixture is similar to a Bose-Einstein gas with some condensation, the
superfluid corresponding to the particles in the ground state. This would explain
the zero viscosity. The identification of the two phenomena is further
strengthened by the observation that the specific heat of 4He also shows a singular
behaviour at 2.17 K. The observed specific heat has the shape of λ [see Fig. 7.4
(b)] and hence the transition is called a λ-transition while the transition
temperature is called the λ-point. It should be noted however that careful
experiments indicate that the specific heat has a logarithmic infinity at the
λ-point Tλ. This however may be due to the fact that the particles considered in
Bose-Einstein condensation were noninteracting which is certainly not the case
for the atoms of liquid helium. Finally, using V = 27.6 cm$^3$/mole for liquid
helium in Eq. (7.78), one obtains Tc = 3.13 K compared with Tλ = 2.17 K. These
observations strongly suggest that the λ-transition is a form of Bose-Einstein
condensation.
Liquid 3He: Helium has an isotope 3He which is a fermion (it has 2 protons,
1 neutron and 2 electrons) and which liquefies at 3.2 K. It is found that 3He,
though a fermion, undergoes a transition to the superfluid state at
2.6 × $10^{-3}$ K.
This arises from the fact that two 3He atoms interact with each other and produce
a weakly-bound system at low temperatures. This bound system is a boson
which can undergo a transition to the superfluid state.
Hydrogen: The atoms of about half of the elements are bosons, i.e. they
obey Bose statistics. Even then, Bose-Einstein condensation is not a common
phenomenon. The reason for this is that the condensation takes its simplest
form only for an ideal gas in which the atoms do not interact with each other. In
real atoms, the electromagnetic interaction tends to bind them and most
substances go into the solid state long before the critical temperature for Bose-
Einstein condensation is reached. Therefore, condensation is expected in only
those systems where the interaction between the atoms is weak compared to the
zero-point energy of the atoms, e.g. in helium. An interesting possibility that is
being currently considered is the Bose-Einstein condensation of atomic hydrogen
[see Silvera and Walraven, Sc. Am. 246, 1, 56 (1982)]. It is true that under
ordinary conditions, the interaction between the hydrogen atoms is quite strong
and binds them into molecules in which the spins of the two electrons are
antiparallel. However, if the atoms with parallel electron spins are isolated, for
example, by using strong inhomogeneous magnetic fields, Pauli’s exclusion
principle prevents the electrons from having overlapping wave functions. As a
consequence, the force between these atoms is mostly repulsive and the atoms
do not bind. The experiments show that atomic hydrogen with parallel electron
spins, remains a gas at temperatures as low as 0.08 K. One may therefore be
able to observe Bose-Einstein condensation in atomic hydrogen. The critical
temperature at which the condensation takes place depends on the density and
is given by Eq. (7.78). For example, at ρ = $10^{24}$ m$^{-3}$, the predicted critical
temperature is 0.016 K. While the densities of atomic hydrogen achieved in the
laboratory, are as yet not sufficiently high to observe the condensation, they are
only one or two orders of magnitude lower than those at which Bose-Einstein
condensation is predicted to occur. An observation of Bose-Einstein condensation
in atomic hydrogen would be an exciting, unambiguous demonstration of
quantum properties of a collection of bosons.
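The critical temperatures quoted in this section follow from solving Eq. (7.78) for Tc at a given number density. A short sketch is given below; the physical constants and atomic masses (4.0026 u for 4He, 1.008 u for hydrogen) are standard values supplied here as assumptions, and the function name is illustrative.

```python
import math

H_PLANCK = 6.62607e-34  # J s (assumed value)
K_B = 1.380649e-23      # J/K (assumed value)
N_AVO = 6.02214e23      # 1/mol (assumed value)
AMU = 1.66054e-27       # kg (assumed value)

def bec_critical_temperature(number_density, mass):
    """Solve Eq. (7.78), N/V = 2.612 (2 pi m k Tc / h^2)^(3/2), for Tc.

    number_density is in m^-3 and mass in kg; the result is in kelvin.
    """
    return (H_PLANCK ** 2 / (2.0 * math.pi * mass * K_B)) \
        * (number_density / 2.612) ** (2.0 / 3.0)

if __name__ == "__main__":
    # Liquid helium: molar volume 27.6 cm^3/mole, as quoted in the text
    n_he = N_AVO / 27.6e-6
    print("4He:", bec_critical_temperature(n_he, 4.0026 * AMU), "K")
    # Spin-polarized atomic hydrogen at 10^24 m^-3, as quoted in the text
    print("H:  ", bec_critical_temperature(1e24, 1.008 * AMU), "K")
```

Because Tc scales as the density to the power 2/3 and inversely with the mass, the dilute hydrogen gas has a far lower Tc than liquid helium despite its lighter atoms.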
Superfluidity in neutron stars: There are interesting speculations that
superfluidity may be occurring in neutron stars. Neutron stars are thought to be
the end products of stars whose masses are between 4 Ms and 20 Ms, Ms being
the mass of the sun. They are very dense with an average density of about
$10^{14}$ – $10^{15}$ g/cm$^3$, have a radius of about 10 km, and are primarily made up of
neutrons. Under suitable conditions two neutrons, which are fermions, may
form weakly-bound states. The bound system which is a boson, may undergo
Bose condensation to become a superfluid. It may be expected that since the
densities of the neutron stars are so large, the transition temperature there would
also be very high.

Bose-Einstein Condensation
In the gas phase, the Bose-Einstein condensate (BEC) remained an unverified
theoretical prediction for many years. In 1995 the research groups of Eric Cornell
and Carl Wieman of JILA, at the University of Colorado at Boulder, produced
the first such condensate experimentally.
Condensation happens when several gas molecules come together and form
a liquid. It all happens because of loss of energy. Gases are really excited atoms.
When they lose energy, they slow down and begin to collect. They can collect
into one drop. Water condenses on the lid of a pot when water is boiled. It cools
on the metal and becomes a liquid again. One would then have a condensate.
If a sufficiently dense gas of cold atoms can be produced without
condensation into liquid state, the matter wavelengths of the particles will be of
the same order of magnitude as the distance between them. It is at that point
that the different waves of matter can ‘sense’ one another and co-ordinate their
state, and this is Bose-Einstein condensation. It is sometimes said that a
“superatom” arises since the whole complex is described by one single wave
function exactly as in a single atom. We can also speak of coherent matter in
the same way as of coherent light in the case of a laser.
This was achieved with alkali atoms. For rubidium with mass number 87, 87Rb,
and sodium with its single stable isotope 23Na, which both have integer
atomic spin, weak repulsive forces arise between the atoms in each case. BEC
occurs if the density, expressed as the number of atoms in a λ-sided cube (λ being
the matter wavelength), exceeds 2.6. The atoms for realistic densities must move
very slowly, at speeds of the order of a few millimetres per second. This
corresponds to temperatures of the order of 100 nK (nanokelvin), i.e. a tenth of a
millionth of a degree above absolute zero.

Fig. 7.5 Successive occurrence of Bose-Einstein condensation in rubidium.
From left to right is shown the atomic distribution in the cloud just prior to
condensation, at the start of condensation and after full condensation. High peaks
correspond to a large number of atoms. Silhouettes of the expanding atom cloud
were recorded 6 ms after switching off the confining forces of the atom trap.

Fig. 7.6 Pattern of interference between two overlapping Bose-Einstein
condensates of sodium atoms. The image was made in absorption.
Matter-wave interferences have a periodicity of 15 micrometres. The recording
shows that the atoms of the two condensates were fully co-ordinated.

Fig. 7.7 Repeated release from the trap of parts of a Bose-Einstein condensate of
sodium atoms. Pulses of coherent matter fall in the gravitational field; the
phenomenon can be seen as an atom laser effect. The real size of the
picture is 2.5 mm × 5 mm.
(Source: http://www.nobelprize.org/nobel_prizes/physics/laureates/2001/public.html)

7.5 APPLICATIONS OF FERMI-DIRAC DISTRIBUTION


The Fermi-Dirac distribution is dominated by the property that the occupation
index si/hi [Eq. (7.38)] is less than or equal to 1 at all temperatures, since no
two fermions can occupy the same state. This provides a very useful framework
for the description of several properties of metals in terms of what are known as
conduction electrons.

Free-Electron Theory of Metals


According to this theory of metals (Pauli and Sommerfeld, 1927), the weakly-
bound valence electrons become detached from the atom and move around
freely. As a first approximation, the detailed interaction of the electrons with
the lattice points (i.e. the ions) and with each other may be neglected, and the
free electrons regarded as being in an average, constant potential in the
macroscopic volume of the metal. Since the electrons are fermions, their
distribution is given by the Fermi-Dirac distribution as
$$s_i = \frac{h_i}{e^{(\varepsilon - \varepsilon_f)/kT} + 1} \tag{7.85}$$
where εf (T) is the Fermi energy. The number of states hi is the same as in
Eq. (7.75), except for a factor of 2 to take into account the two spin states of the
electron, giving
$$h_i = \frac{4\pi V (2m)^{3/2}}{h^3}\,\varepsilon^{1/2}\, d\varepsilon \tag{7.86}$$
The density of electrons as a function of energy is then given by
$$\frac{dN(\varepsilon)}{d\varepsilon} = \left(\frac{4\pi V (2m)^{3/2}}{h^3}\right)\frac{\varepsilon^{1/2}}{e^{(\varepsilon - \varepsilon_f)/kT} + 1} \tag{7.87}$$
and is shown in Fig. 7.8. The Fermi energy is determined from the condition
that the total number of particles is N, i.e.
$$N = \frac{4\pi V (2m)^{3/2}}{h^3}\int_0^{\infty} \frac{\varepsilon^{1/2}\, d\varepsilon}{\exp[(\varepsilon - \varepsilon_f)/kT] + 1} \tag{7.88}$$

Fig. 7.8 Density of levels dN/dε as a function of ε, (a) for T = 0,
(b) for kT = 0.1 εf(0).

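At T = 0,
$$N = \frac{4\pi V (2m)^{3/2}}{h^3}\int_0^{\varepsilon_f} \varepsilon^{1/2}\, d\varepsilon = \frac{8\pi V (2m)^{3/2}}{3h^3}\,\varepsilon_f^{3/2} \tag{7.89}$$
which gives
$$\varepsilon_f(0) = \frac{h^2}{2m}\left(\frac{3N}{8\pi V}\right)^{2/3} \tag{7.90}$$
A slightly involved calculation gives, for kT << εf (0),
$$\varepsilon_f(T) \approx \varepsilon_f(0)\left[1 - \frac{\pi^2}{12}\left(\frac{kT}{\varepsilon_f(0)}\right)^2\right] \tag{7.91}$$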

For metals, N/V ≈ 5 × $10^{22}$ cm$^{-3}$, for which Eq. (7.89) implies εf ≈ 4.9 eV.
The actual value of εf (0) for some of the metals is 4.7 eV for Li, 2.1 eV for K,
7.0 eV for Cu, and 5.5 eV for Au. This means that the approximation in
Eq. (7.91) is adequate for most purposes (kT≈ 0.026 eV at T = 300 K). For
kT<<εf most of the electrons are in the lowest energy states allowed by Pauli’s
exclusion principle, and the electron gas is said to be degenerate (completely
degenerate at T = 0). It is interesting to note that because of the exclusion
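principle, the average energy of the electron gas is quite substantial even at T = 0:
$$\bar{\varepsilon}(0) = \frac{\int_0^{\varepsilon_f} \varepsilon\,\varepsilon^{1/2}\, d\varepsilon}{\int_0^{\varepsilon_f} \varepsilon^{1/2}\, d\varepsilon} = \frac{3}{5}\,\varepsilon_f \tag{7.92}$$
which is of the order of a few eV (compare with kT ≈ 0.026 eV at room
temperature).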
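Equations (7.90) to (7.92) can be evaluated for a given electron density with a few lines of code. The sketch below assumes standard SI values for h, the electron mass and the electron charge, and k ≈ 8.617 × 10^-5 eV/K; the function names are illustrative.

```python
import math

H_PLANCK = 6.62607e-34  # J s (assumed value)
M_E = 9.10938e-31       # electron mass in kg (assumed value)
EV = 1.602177e-19       # J per eV (assumed value)
K_B_EV = 8.617e-5       # Boltzmann constant in eV/K (assumed value)

def fermi_energy_ev(n_per_m3):
    """Fermi energy at T = 0 in eV, Eq. (7.90), for electron density n."""
    return (H_PLANCK ** 2 / (2.0 * M_E)) \
        * (3.0 * n_per_m3 / (8.0 * math.pi)) ** (2.0 / 3.0) / EV

def fermi_energy_at_t(eps_f0_ev, temp_k):
    """Low-temperature correction of Eq. (7.91), valid for kT << eps_f(0)."""
    ratio = K_B_EV * temp_k / eps_f0_ev
    return eps_f0_ev * (1.0 - (math.pi ** 2 / 12.0) * ratio ** 2)

def mean_energy_at_zero(eps_f0_ev):
    """Average electron energy at T = 0, Eq. (7.92)."""
    return 0.6 * eps_f0_ev

if __name__ == "__main__":
    n = 5e28  # 5 x 10^22 cm^-3 expressed in m^-3
    ef0 = fermi_energy_ev(n)
    print("eps_f(0)     =", ef0, "eV")
    print("eps_f(300 K) =", fermi_energy_at_t(ef0, 300.0), "eV")
    print("mean energy  =", mean_energy_at_zero(ef0), "eV")
```

At room temperature the correction of Eq. (7.91) changes the Fermi energy by only a few parts in 10^5, which is why εf(0) can be used for most purposes.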
Specific heat of metals: An interesting property of the specific heat of metals
is that it is described quite well by the Debye theory. Since the Debye theory
includes only the phonon contributions, i.e. lattice vibrations, this implies that
the contribution from the free electrons to the specific heat of metals is small.
This is explained by the fact that, unlike the phonons, the free electrons satisfy
Fermi-Dirac statistics. When the temperature T is increased, only a few electrons
in the range |ε – εf | ≈ kT are excited to the higher energy states (see Fig. 7.8). It
is only these electrons that contribute to the specific heat, as a result of which
the contribution of the electron gas to the specific heat is quite small. Roughly
speaking, it is seen from Fig. 7.8 that the number of electrons which are excited
is $kT\,(dN/d\varepsilon)_{\varepsilon=\varepsilon_f}$ and their energy increases by an
amount of about 2kT. Therefore, the total energy of the system is given by
$$E(T) \approx E(0) + 2k^2T^2\left.\frac{dN}{d\varepsilon}\right|_{\varepsilon=\varepsilon_f} \tag{7.93}$$
and the heat capacity per mole of the electron gas is
$$C_v^{el} \approx 4k^2T\left.\frac{dN}{d\varepsilon}\right|_{\varepsilon=\varepsilon_f} \approx 3R\left(\frac{kT}{\varepsilon_f}\right) \tag{7.94}$$
Here, we have used Eq. (7.87) for dN/dε and Eq. (7.89). A more detailed
calculation gives
$$C_v^{el} = \frac{\pi^2}{2}\,R\left(\frac{kT}{\varepsilon_f}\right) \tag{7.95}$$
Since kT/εf is quite small at ordinary temperatures, the electronic specific
heat also is small and the total specific heat is described quite well by the Debye
theory. It should, however, be noted that the Debye specific heat at low
temperatures is proportional to $R(T/\theta)^3$ [see Eq. (7.66)] so that at
sufficiently low temperatures the electronic specific heat becomes dominant. At low
temperatures, the total specific heat is given by
$$C_v = \frac{12}{5}\pi^4 R\,(T/\theta)^3 + \frac{\pi^2}{2}\,R\left(\frac{kT}{\varepsilon_f}\right) \tag{7.96}$$
and the observed nonzero limit of Cv/T as T → 0, for metals such as copper,
indicates the presence of the linear electronic contribution. Experimentally, in
the case of copper, Cv/T for T → 0 is about 0.7 × $10^{-3}$ J/mol/K$^2$ whereas the
value predicted for copper (εf ≈ 7 eV), by Eq. (7.96), is about
0.5 × $10^{-3}$ J/mol/K$^2$. The difference is a measure of the deviation of the model
from the real situation.
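The linear coefficient in Eq. (7.96), and the temperature below which the electronic term exceeds the lattice term, can be estimated as follows. This is a sketch under stated assumptions: R ≈ 8.314 J/mol/K, k ≈ 8.617 × 10^-5 eV/K, and for copper the values εf ≈ 7 eV and θ ≈ 340 K quoted in this chapter; the function names are illustrative.

```python
import math

R_GAS = 8.314      # gas constant in J/(mol K) (assumed value)
K_B_EV = 8.617e-5  # Boltzmann constant in eV/K (assumed value)

def electronic_gamma(eps_f_ev):
    """Coefficient of T in the electronic term of Eq. (7.96), J/(mol K^2)."""
    return (math.pi ** 2 / 2.0) * R_GAS * K_B_EV / eps_f_ev

def lattice_coefficient(theta_d):
    """Coefficient of T^3 in the Debye term of Eq. (7.96), J/(mol K^4)."""
    return (12.0 / 5.0) * math.pi ** 4 * R_GAS / theta_d ** 3

def crossover_temperature(eps_f_ev, theta_d):
    """Temperature at which the two terms of Eq. (7.96) are equal."""
    return math.sqrt(electronic_gamma(eps_f_ev) / lattice_coefficient(theta_d))

if __name__ == "__main__":
    print("gamma(Cu)     =", electronic_gamma(7.0), "J/mol/K^2")
    print("crossover(Cu) =", crossover_temperature(7.0, 340.0), "K")
```

For copper the two contributions become equal at a few kelvin, so the electronic term is visible only in measurements at liquid-helium temperatures.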
Electrical and thermal conductivities: Some general characteristics of the
electrical and thermal conductivities of metals can be discussed in terms of the
free-electron theory of metals. This discussion will be based on the assumptions
that (i) the conducting electrons move with the velocity
$v_f = (2\varepsilon_f/m)^{1/2}$, which is reasonable since most of the conducting
electrons will be in states close to the Fermi level, (ii) the electrons have a
mean free path of λ and that they carry information over a distance of λ
(λ ≈ 500 Å).
In the presence of an external electric field E, the electrons acquire an average
drift velocity v which is equal to half of the average acceleration eE/m multiplied
by the interval λ/vf between two collisions. Therefore, the current density is
$\frac{1}{2}en\,(e\lambda E/mv_f)$, where n is the electron density. This satisfies
Ohm’s law since vf, being large, is essentially independent of E. The electrical
conductivity is then
$$\sigma = \frac{e^2 n\lambda}{2mv_f} \tag{7.97}$$
For calculating thermal conductivity, it is noted that since the electrons
carry information over a distance of λ, the energy carried across an area by the
electrons is $\varepsilon \pm \frac{1}{2}\lambda\,(\partial\varepsilon/\partial x)$ in
opposite directions. Therefore, if $\frac{1}{3}n$ electrons are assumed to have a
velocity perpendicular to the area, the net energy transferred across a unit area,
per unit time, is
$$\frac{dQ}{dt} = -\frac{1}{6}\,nv_f\left(\lambda\,\frac{\partial\varepsilon}{\partial x}\right) \tag{7.98}$$
(where the negative sign indicates that the energy is transferred in a direction
opposite to the gradient). From this relation, the thermal conductivity is obtained
by writing ∂ε/∂x as (∂ε/∂T) (∂T/∂x) which leads to the coefficient of thermal
conductivity K,
$$K = \frac{1}{6}\,nv_f\lambda\,\frac{\partial\varepsilon}{\partial T} \tag{7.99}$$
Since ∂ε/∂T is the specific heat per electron, using Eq. (7.95) gives
$$K = \frac{\pi^2 nk^2\lambda T}{6mv_f} \tag{7.100}$$
It follows from Eqs. (7.97) and (7.100) that
$$\frac{K}{\sigma T} = L = \frac{\pi^2}{3}\left(\frac{k}{e}\right)^2 \tag{7.101}$$
which is the same for all metals. This relation is known as the Wiedemann-Franz
law. The constant L, known as the Lorenz number, has a value of
2.44 × $10^{-8}$ J Ω/s K$^2$, while the experimental values of K/σT for some of the
metals at 0°C are 2.31 × $10^{-8}$ for Ag, 2.47 × $10^{-8}$ for Pb and
2.19 × $10^{-8}$ for Na.
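The cancellation that makes K/σT universal can be seen by evaluating Eqs. (7.97) and (7.100) for arbitrary model parameters. In the sketch below the density, mean free path and Fermi energy are illustrative inputs only, and the SI constants are assumed standard values.

```python
import math

K_B = 1.380649e-23       # J/K (assumed value)
E_CHARGE = 1.602177e-19  # C (assumed value)
M_E = 9.10938e-31        # kg (assumed value)

def electrical_conductivity(n, mean_free_path, v_f):
    """sigma = e^2 n lambda / (2 m v_f), Eq. (7.97)."""
    return E_CHARGE ** 2 * n * mean_free_path / (2.0 * M_E * v_f)

def thermal_conductivity(n, mean_free_path, v_f, temp_k):
    """K = pi^2 n k^2 lambda T / (6 m v_f), Eq. (7.100)."""
    return math.pi ** 2 * n * K_B ** 2 * mean_free_path * temp_k / (6.0 * M_E * v_f)

def lorenz_number():
    """L = (pi^2 / 3) (k/e)^2, Eq. (7.101)."""
    return (math.pi ** 2 / 3.0) * (K_B / E_CHARGE) ** 2

if __name__ == "__main__":
    n = 5e28                                    # m^-3, illustrative
    lam = 500e-10                               # 500 angstrom in metres
    v_f = math.sqrt(2.0 * 5.0 * E_CHARGE / M_E)  # Fermi velocity for 5 eV
    temp = 273.0
    sigma = electrical_conductivity(n, lam, v_f)
    kappa = thermal_conductivity(n, lam, v_f, temp)
    print("K / (sigma T) =", kappa / (sigma * temp))
    print("L             =", lorenz_number())
```

The ratio does not depend on n, λ or vf, which is why the same constant appears for all metals in this model.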
While it is obvious that the free electrons are responsible for transporting
charge, it is suggested by the validity of the Wiedemann-Franz law that the free
electrons play a dominant role in the transfer of energy as well, in preference to
the phonons. It is also noted that the thermal conductivity of metals is in general
greater than that of insulators, sometimes by as much as two orders of magnitude.
It is therefore reasonable to say that most of the thermal conductivity in metals is
due to the free electron gas.
Thermionic emission: When a metal is heated, electrons are emitted from
the surface. Thermionic emission can be studied by subjecting the electrons to a
small potential difference and analysing the thermionic emission current as a
function of temperature.
The electrons in a metal may be regarded as particles in a potential well
with barrier at the boundary. The barrier arises from the fact that when an electron
tries to escape from the surface, its image in the surface, being of opposite
charge, pulls it back. In order to escape, the electrons must then have a minimum
energy φ above the Fermi level. This energy φ is known as the work function,
and usually has a value of the order of a few eV, e.g. 2.3 eV for Na, about 4.5 eV
for Cu, etc.
An electron that is emitted must satisfy the condition (the metal surface is
taken to be perpendicular to the z-direction)
pz2
≥ εf + φ (7.102)
2m
Since the current density at a point is ρv, ρ being the charge density, the amount of
charge emitted by a unit area per unit time is
$$j = e\int v_z\, dN \tag{7.103}$$
Here dN is the number of electrons per unit volume with momentum
between p and p + dp. It is equal to s_i/V, where s_i is given in Eq. (7.85) and
h_i in Eq. (7.86). Using the relation p² = 2mε, the factor 2π(2m)^(3/2) ε^(1/2) dε in h_i is replaced by
4πp² dp, or dp_x dp_y dp_z. It is then integrated over only positive p_z to give
$$j = \frac{8e}{h^3}\int_0^\infty dp_x \int_0^\infty dp_y \int_{[2m(\varepsilon_f+\phi)]^{1/2}}^{\infty} dp_z \left(\frac{p_z}{m}\right)\frac{1}{\exp[(\varepsilon-\varepsilon_f)/kT]+1} \tag{7.104}$$
Since φ is generally of the order of a few eV, the 1 in the denominator can
be ignored to get
$$j = \frac{8e}{h^3}\int_0^\infty dp_x\, e^{-p_x^2/2mkT}\int_0^\infty dp_y\, e^{-p_y^2/2mkT}\int_{[2m(\varepsilon_f+\phi)]^{1/2}}^{\infty} dp_z \left(\frac{p_z}{m}\right)\exp\left[-\left(\frac{p_z^2}{2m}-\varepsilon_f\right)\Big/kT\right] = \frac{4\pi}{h^3}\, m e k^2 T^2 e^{-\phi/kT} \tag{7.105}$$
This is known as the Richardson-Dushman equation and is generally written
as
j = AT2 exp ( – φ/kT) (7.106)
where A has the value 1.2 × 10⁶ A/m²/K². It is in good agreement with
experiments provided that (i) the constant A is modified to take into account the
possibility that the electron may be reflected when it comes across a change in
the potential near the surface, and (ii) allowance is made for the variation of φ with
temperature, with the crystal direction and with surface impurities. The experimental
values of A are usually, though not always, smaller than the one predicted by
Eq. (7.105), e.g. 0.4 × 10⁶ for Cr, 0.30 × 10⁶ for Ni, etc., in MKS units.
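As a numerical check of Eqs. (7.105) and (7.106), the following sketch evaluates the constant A = 4πmek²/h³ and the resulting current density. The work function of 4.5 eV is the value quoted above for Cu; the temperatures are arbitrary illustrative choices, and the function names are not from the text.

```python
import math

M_E = 9.1093837e-31         # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
K_B = 1.380649e-23          # Boltzmann constant, J/K
H = 6.62607015e-34          # Planck constant, J s


def richardson_constant():
    """A = 4 pi m e k^2 / h^3 from Eq. (7.105), in A m^-2 K^-2."""
    return 4.0 * math.pi * M_E * E_CHARGE * K_B ** 2 / H ** 3


def emission_current_density(temperature_k, work_function_ev):
    """j = A T^2 exp(-phi / kT) of Eq. (7.106), in A/m^2."""
    kt_ev = K_B * temperature_k / E_CHARGE  # kT expressed in eV
    return richardson_constant() * temperature_k ** 2 * math.exp(-work_function_ev / kt_ev)


A = richardson_constant()
print(f"A = {A:.3e} A/m^2/K^2")
for temperature in (1500.0, 2000.0, 2500.0):  # illustrative temperatures
    j = emission_current_density(temperature, 4.5)
    print(f"T = {temperature:.0f} K  j = {j:.3e} A/m^2")
```

The steep rise of j with T comes almost entirely from the exponential factor, which is why the measured current is so sensitive to small variations in φ.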

7.6 SUPERCONDUCTIVITY
Superconductivity is an interesting phenomenon in which electrons, which are
fermions, behave like bosons. The reason for this is that under some special
conditions, pairs of electrons form weakly-bound states which exhibit properties
of Bose systems.
When the temperature of some metals, semiconductors and alloys is lowered
to a few kelvin, the electrical resistance of the material suddenly drops
to zero [see Fig. 7.9 (a)]. The substance is then said to have become a
superconductor and the temperature Tc at which the transition takes place is
known as the critical or transition temperature, e.g. Tc = 0.015 K for tungsten,
3.72 K for tin, 9.3 K for niobium, and 23.2 K for the Nb3Ge alloy, the highest
value known among conventional superconductors. The transition to a superconducting state is quite sharp for a pure
and physically perfect specimen. In some cases it has been observed to occur
within a temperature range of 10⁻⁵ K. However, for impure or physically imperfect
specimens, the transition may be over a range as large as 0.1 K or more.
Superconductivity has not been observed in all substances. In particular, it
has not been detected in alkali metals, ferromagnetic substances, and relatively
good conductors of electricity such as Ag, Cu, Au. Matthias has pointed out
that superconductivity occurs only in substances which have an average of two
to eight valence electrons per atom. Also, a small atomic volume is favourable
for superconductivity. It is worth noting that an alloy may be a superconductor
even if it is composed of two metals which themselves are not superconductors,
e.g. Bi-Pd.
Some of the important properties of superconductors are the following:
1. The current in the superconductors persists for a very long time. This is
demonstrated by placing a loop of the superconductor in a magnetic
field, lowering its temperature below Tc and then removing the field.
The current which is set up is found to persist over a period longer than
two years without any attenuation.
2. The magnetic field does not penetrate into the body of the superconductor
(permeability µ = 0). This property, known as the Meissner effect, is the
fundamental characterization of superconductivity. However, when the
magnetic field B is greater than a critical value Bc (T) [see Fig. 7.9 (b)],
the superconductor becomes a normal conductor [Bc (T) is zero at T = Tc
and has the largest value at T = 0].
Fig. 7.9 (a) Resistivity of a superconductor as a function of T, (b) the critical
magnetic field Bc (tesla) at which superconductivity disappears, separating the
normal and superconducting regions, (c) specific heat C (joule/mole/K)
in superconducting and normal states. All three panels are plotted against T (K).


3. When the current through the superconductor is increased beyond a
critical value Ic(T), the superconductor again becomes a normal conductor
[Ic(T) = 0 at T = Tc].
4. The specific heat of the material shows an abrupt change at T = Tc ,
jumping to a larger value for T < Tc [see Fig. 7.9 (c)].
The theory of superconductivity was given by Bardeen, Cooper and
Schrieffer, and is known as the BCS theory. In this theory the electrons experience
a special kind of mutual attraction which at large distances dominates over the
Coulomb repulsion between them. It is the lattice of the material which provides
the necessary medium for producing the attractive forces. An electron moving
in the metal disturbs the lattice, producing phonons (quanta of vibrational
motion). These phonons may be absorbed by another electron which may, at
the microscopic level, be far away (about 10⁻⁶ m) from the first electron. In
effect, the two electrons interact and the electron-electron interaction energy
due to a phonon exchange is negative. If at low temperatures this attraction
exceeds the Coulombic repulsion, the two electrons form a weakly bound state
called a Cooper pair (a Cooper pair has a binding energy of about 10⁻³ eV in
superconductors). If such pairs are created, the conductor becomes a
superconductor.
Cooper pairs are spin-zero bosons and hence can be in the same state. They
are in the ground state and are described by a wave function which extends over
the entire body of the metal. In a sense, this is quantum mechanics on a
macroscopic scale.

The Energy Gap


The specific heat of a superconductor at very low temperatures is observed to
be of the form
$$C_v = A\left(\frac{T}{\theta}\right)^3 + a\,e^{-b/kT} \tag{7.107}$$
where the first term is the lattice specific heat [Eq. (7.67)]. The second term is
the electronic contribution to specific heat. The exponential form of this term
suggests the presence of an energy gap [see Eq. (7.82) which indicates that if
there is an energy gap ∆, the leading behaviour of the total energy of bosons at
low temperatures is given by E ~ exp(−∆/kT)]. The energy gap comes from the binding
energy of Cooper pairs. Since energy is required to break the bond in Cooper
pairs, a sharp jump in the specific heat is observed at the transition temperature.
Furthermore, when T~ Tc, there is a substantial number of electrons in the normal
state so that the energy gap is
∆ ~ kTc (7.108)
The detailed BCS theory gives the binding energy of the Cooper pair as
$$E_b(T) \approx 3.5\,kT_c \ \text{ for } T \to 0, \qquad E_b(T) \to 0 \ \text{ for } T \to T_c \tag{7.109}$$
and this is the energy gap between the ground state and the dissociated state.
For Tc = 4 K, an energy gap of the order of 3 × 10⁻⁴ eV is obtained. The smallness
of the energy gap is the reason why superconductivity is a low-temperature
phenomenon.
The attenuation of a current implies a change in the state of conducting
electrons. Since the ground state is separated by the energy gap ∆, the excitation
of the Cooper pairs is not possible at low velocity [corresponding to currents
less than Ic(T)]. This implies motion without friction and therefore,
superconductivity of the metal (or alloy).

Magnetic Properties
The magnetic properties of superconductors are quite complicated. In the class
of superconductors known as type I superconductors (which includes most of
the elemental superconductors), the magnetic field is excluded from the body
of the superconductors for B < Bc(T), showing perfect Meissner effect. However,
the Meissner effect disappears for B > Bc(T).
For type II superconductors, an example of which is lead-indium alloy,
perfect Meissner effect occurs for B < B1(T), but there is only a partial exclusion of the
field for B1(T) < B < B2(T), and a complete penetration of the field for B > B2(T).
The reason for this behaviour is that for B between B1(T) and B2(T), the
material is in a mixed state. A close investigation of the specimen shows the
presence of small circular regions in the normal state, called vortices or fluxoids.
They are surrounded by large regions which are in the superconducting state. It
is the presence of both the states which gives rise to partial penetration of the
field. Materials with high critical temperatures tend to fall in the class of the
type II superconductors.
Since usually B2 (T) >> Bc (T), carefully-prepared type II superconductors
are used for the manufacture of high-field magnets which require almost no
power input and little cooling. Technology based on superconductors would
receive a major boost if superconductivity could be produced at higher
temperatures, say at liquid nitrogen temperature (Tb = 77.4 K). This possibility
has been considered recently.
There are two additional properties which are of interest, quantization of
flux enclosed by a superconductor and Josephson junctions, which are discussed
briefly.

Quantization of Flux
Consider a superconducting loop in which a current is circulating. The current
generates a magnetic field whose flux across the area enclosed by the loop is
quantized.
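The wave function of the superconducting particles satisfies the equation
[see Eq. (6.1)]
$$\left[\frac{1}{2m}\left(-i\hbar\nabla - q\mathbf{A}\right)^2 + V\right]\psi = E\psi \tag{7.110}$$
with q being the charge of the particles, and is given by
$$\psi(\mathbf{r}) = \exp\left[\frac{iq}{\hbar}\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{A}\cdot d\mathbf{l}\right]\phi(\mathbf{r}) \tag{7.111}$$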
where the integral involved is a line integral and φ(r) satisfies Eq. (7.110) in the
absence of the field, i.e. for A = 0. Now, the wave function must have the same
phase even after going around the entire loop, i.e.

$$\frac{q}{\hbar}\oint \mathbf{A}\cdot d\mathbf{l} = 2n\pi, \quad n = 0, \pm 1, \pm 2, \ldots \tag{7.112}$$

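where the integration is along the entire loop. Using Stokes' theorem,
$$\frac{q}{\hbar}\oint \mathbf{A}\cdot d\mathbf{l} = \frac{q}{\hbar}\int (\nabla\times\mathbf{A})\cdot d\mathbf{S} = 2n\pi, \quad n = 0, \pm 1, \pm 2, \ldots \tag{7.113}$$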


where the surface integral is across the surface enclosed by the loop. Since
∇ × A = B, this leads to the relation for flux φ,

$$\phi = \int \mathbf{B}\cdot d\mathbf{S} = \frac{h}{q}\,n, \quad n = 0, \pm 1, \pm 2, \ldots \tag{7.114}$$
Thus the flux takes only quantum values of integral multiples of h/q. The
value of h/e = 4 × 10⁻¹⁵ Wb is quite small but is macroscopically detectable. The
quantized flux was observed experimentally by Deaver and Fairbank and
independently by Doll and Näbauer (1961). The observed flux was found to be
integral multiples of h/q with q = –2e. This is an additional confirmation of the
BCS theory according to which it is the Cooper pairs, with charge –2e each,
that are the carriers of current in superconductors.
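The size of the flux quantum in Eq. (7.114) can be evaluated for a carrier of charge e and for a Cooper pair of charge 2e. This is a minimal sketch using the defined SI values of h and e; the function name is illustrative.

```python
H = 6.62607015e-34          # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C


def flux_quantum(charge_magnitude):
    """Flux quantum h/|q| of Eq. (7.114), in weber."""
    return H / charge_magnitude


phi_single = flux_quantum(E_CHARGE)      # carriers of charge e
phi_pair = flux_quantum(2.0 * E_CHARGE)  # Cooper pairs, |q| = 2e
print(f"h/e  = {phi_single:.4e} Wb")
print(f"h/2e = {phi_pair:.4e} Wb")
```

The observed flux steps correspond to the second value, which is half the first.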

Josephson Junctions
The discovery of Josephson junctions has made the direct macroscopic
measurement of the ratio ħ/e possible.
Consider two pieces of a superconductor separated by a thin layer of an
insulator (about 20 Å in thickness). If now a voltage V is applied to the
superconductors, an ac current j flows across the junction. This effect, known
as the Josephson effect, is due to the tunnelling of the superconducting electrons
across the insulator. In the process they emit microwave radiation of angular
frequency ω such that
$$\hbar\omega = |q|V \tag{7.115}$$
where q = −2e is the charge of a Cooper pair. It was also found that when the
junction was illuminated by radiation of frequency ω, and the potential V was
varied, the current across the junction showed a jump whenever the condition
$$n\hbar\omega = 2eV, \quad n = \text{an integer}, \tag{7.116}$$
was satisfied. Thus knowing ω and V, it was possible to obtain an accurate
measurement of ħ/e.
For obtaining the current across the junction, let ψ1 and ψ2 be the solutions
of the Schrödinger equation for particles on the two sides of the insulator, with
Hamiltonian H0 in the absence of an external radiation. The approximate form
of these wave functions in the region of the junction, is
$$\psi_1 = \exp[-(x+a)k_1]\,\exp\left(-\frac{i}{\hbar}Et\right) \tag{7.117}$$
$$\psi_2 = \exp[-(a-x)k_2]\,\exp\left[-\frac{i}{\hbar}(E+qV)t\right] \tag{7.118}$$
where V is the potential across the junction (see Fig. 7.10). If now radiation of
frequency ω is incident on the junction, it may be assumed that a superposition
of ψ1 and ψ2,
$$\psi = b_1(t)\,\psi_1 + b_2(t)\,\psi_2 \tag{7.119}$$
is an approximate solution of the Schrödinger equation
$$i\hbar\frac{\partial\psi}{\partial t} = (H_0 + H_1)\,\psi \tag{7.120}$$
Here H1 is the interaction with the external radiation, which is taken to be v
cos ωt at one end (say the first) of the insulator and zero at the other. Multiplying
Eq. (7.120) successively by ψ1 and ψ2 and integrating, one obtains (after
neglecting the overlapping terms)

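$$i\hbar\frac{\partial b_1(t)}{\partial t} = v\,(\cos\omega t)\,b_1(t) \tag{7.121}$$
$$i\hbar\frac{\partial b_2(t)}{\partial t} = 0 \tag{7.122}$$

Fig. 7.10 The Josephson junction and the associated wave functions: ψ1 in the superconductor on one side and ψ2 in the superconductor on the other, separated by an insulating layer centred at x = 0 and extending a distance a on either side.
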
These equations are fairly easy to solve. However, it is more instructive to
solve them perturbatively. Assuming that v is small, we replace the b1 (t) on the
right hand side by b1 (0) and integrate the two sides to get

$$b_1(t) \approx b_1(0) - \frac{i}{\hbar\omega}\,v\,(\sin\omega t)\,b_1(0) \tag{7.123}$$
$$b_2(t) = b_2(0) \tag{7.124}$$
Now the quantum mechanical generalization of current is

$$j = \mathrm{Re}\left[\frac{q}{m}\,\psi^* p\,\psi\right] = -\mathrm{Re}\left[\frac{i\hbar q}{m}\,\psi^*\nabla\psi\right] \tag{7.125}$$
Substituting Eq. (7.119) for ψ with b1(t) and b2(t) given by Eqs. (7.123)
and (7.124), the current across the junction is
$$j \approx j_0\left[\sin\left(\frac{qVt}{\hbar}+\delta_0\right) - \frac{v}{\hbar\omega}\,(\sin\omega t)\cos\left(\frac{qVt}{\hbar}+\delta_0\right)\right] \tag{7.126}$$
where j0 and δ0 are constants. It is interesting to note that the ac current persists
even in the absence of external radiation. On taking the time average, the
contribution of the second term is nonzero if

$$\frac{|qV|}{\hbar} = \omega \tag{7.127}$$
If higher order perturbations are included, there are nonzero contributions
to the current for higher harmonics as well,
$$\frac{|qV|}{\hbar} = n\omega, \quad n = 1, 2, \ldots \tag{7.128}$$
in conformity with the result in Eq. (7.116) for q = −2e. For n = 1, the frequency is ν = ω/2π = 4.836 ×
10¹¹ V s⁻¹, where V is in millivolts. Since V is usually of the order of several
millivolts, the Josephson frequency is in the microwave range.
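The frequency-voltage relation of Eq. (7.128) with q = −2e is easily evaluated. The sketch below computes ν = 2eV/(nh) and its inverse, the voltage at which the n-th current step appears for a given radiation frequency; the function names and the sample values are illustrative assumptions.

```python
H = 6.62607015e-34          # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C


def josephson_frequency(voltage_v, n=1):
    """Frequency nu = 2 e V / (n h) from Eq. (7.128) with |q| = 2e, in Hz."""
    return 2.0 * E_CHARGE * voltage_v / (n * H)


def step_voltage(frequency_hz, n=1):
    """Voltage V = n h nu / (2 e) at which the n-th current step occurs, in volts."""
    return n * H * frequency_hz / (2.0 * E_CHARGE)


nu_1mv = josephson_frequency(1.0e-3)
print(f"nu at 1 mV = {nu_1mv:.4e} Hz")
print(f"first step at 10 GHz: V = {step_voltage(1.0e10):.3e} V")
```

Because frequencies can be measured very precisely, the same relation turns a voltage measurement into a determination of e/ħ.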
One of the most important applications of the Josephson effect is the
determination of the fundamental constant e/ħ, which occurs in Eq. (7.128) with
q = −2e. It has been possible to determine this ratio to an accuracy of a few
parts in 10⁶. It is to be noted that the Josephson effect is very sensitive to the
magnetic field. The use of this property has led to many important applications
of Josephson junctions, singly or in combinations.
Around 1986 Georg Bednorz and Karl Müller, working at IBM in Zurich,
discovered that certain semiconducting oxides became superconducting at 35 K,
then considered a relatively high temperature. In particular, the lanthanum barium
copper oxides, an oxygen deficient perovskite-related material, proved
promising.
Later on, Maw-Kuen Wu and his graduate students, including Ashburn, in 1987,
and Paul Chu and his students around the same time, discovered that YBCO has a Tc of
93 K. (The first samples were Y1.2Ba0.8CuO4.) Their work led to a rapid succession
of new high temperature superconducting materials, ushering in a new era in
materials science and chemistry.
YBCO was the first material to become superconducting above 77 K, the
boiling point of liquid nitrogen. All materials developed before 1986 became
superconducting only at temperatures near the boiling points of liquid helium
(Tb = 4.2 K) or liquid hydrogen (Tb = 20.28 K), the highest being Nb3Ge at
23 K. The significance of the discovery of YBCO is the much lower cost of
refrigerant used to cool the material to below the critical temperature.

Fig. 7.11 Perovskite structure of Y1Ba2Cu3O7/Y1Ba2Cu4O8, showing the copper ribbons, the copper planes, and the Ba, copper and oxygen sites.

There is no widely accepted theory to explain their properties. Cuprate
superconductors (and other unconventional superconductors) differ in many
important ways from conventional superconductors, such as elemental mercury
or lead, which are adequately explained by the BCS theory. There has also been
much debate as to whether high-temperature superconductivity coexists with magnetic
ordering in YBCO, iron-based superconductors and several other exotic
superconductors, and the search continues for other families of materials.
High-temperature superconductors (HTS) are type II superconductors, which allow magnetic fields to penetrate their
interior in quantized units of flux, meaning that much higher magnetic fields
are required to suppress superconductivity. The layered structure also gives a
directional dependence to the magnetic field response.
Several commercial applications of high temperature superconducting
materials have been realized. For example, superconducting materials are finding
use as magnets in magnetic resonance imaging, magnetic levitation, and
Josephson junctions. (The most used material for power cables and magnets is
BSCCO, bismuth strontium calcium copper oxide.)
(Source: Wikipedia)

7.7 EXAMPLES
In this section, some examples which illustrate and extend the main ideas
of quantum statistics are discussed.

Example 1
Consider the statistical distributions of two identical particles among three sets
of states g1= 1, g2 = 2, g3 = 1 with energies 0, ε, 2ε, respectively (both the g2
states have energy ε). The populations of these sets are (n1, n2, n3). The most
probable distribution with total energy 2ε has to be found.
Distinguishable particles: The allowed population distributions are:
1. (1, 0, 1), which has two distinguishable arrangements (A, 0, B), (B, 0, A).
2. (0, 2, 0), which has four distinguishable arrangements (AB, 0), (0, AB),
(A, B) and (B, A) in the g2 set, where A and B represent the two
distinguishable particles.
Thus, the second distribution is twice as probable as the first distribution.
Bosons: For bosons, the (1, 0, 1) distribution has only one distinguishable
arrangement while (0, 2, 0) has three distinguishable arrangements (AA, 0),
(A, A), (0, AA) in the g2 set. Therefore, the (0, 2, 0) distribution is three times as probable as
the (1, 0, 1) distribution.
Fermions: For fermions, of the two distributions (1, 0, 1) and (0, 2, 0), each
has only one possible distinguishable arrangement (it should be recalled that
the Pauli principle forbids more than one particle in each state). Thus, the two
distributions are equally probable.
The number of distributions can be verified by Eqs. (7.4), (7.7) and (7.10).
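The counting in this example can also be checked by brute-force enumeration. The sketch below lists the four single-particle states (one at energy 0, two at ε, one at 2ε), generates all two-particle configurations for each kind of statistics, and tallies those with total energy 2ε by their populations (n1, n2, n3). The function and variable names are illustrative.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement, product

# Four single-particle states: index -> level (0, 1, 2) and energy in units of epsilon.
LEVEL = [0, 1, 1, 2]
ENERGY = [0, 1, 1, 2]


def count_arrangements(kind, total_energy=2):
    """Count arrangements of two particles with the given total energy,
    grouped by the level populations (n1, n2, n3)."""
    states = range(len(LEVEL))
    if kind == "distinguishable":
        configs = product(states, repeat=2)                # ordered pairs (A, B)
    elif kind == "bosons":
        configs = combinations_with_replacement(states, 2)  # unordered, repeats allowed
    elif kind == "fermions":
        configs = combinations(states, 2)                  # unordered, no repeats
    else:
        raise ValueError(f"unknown statistics: {kind}")
    counts = Counter()
    for config in configs:
        if sum(ENERGY[s] for s in config) == total_energy:
            population = [0, 0, 0]
            for s in config:
                population[LEVEL[s]] += 1
            counts[tuple(population)] += 1
    return dict(counts)


for kind in ("distinguishable", "bosons", "fermions"):
    print(kind, count_arrangements(kind))
```

The three cases differ only in which generator is used, which mirrors the way the three statistics differ only in how configurations are counted.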

Example 2
The extension of the classical distributions for the case of bound and ionized
atoms in equilibrium is of special interest in astrophysics and plasma physics.
The equilibrium distribution is obtained by using arguments similar to those
used in Sec. 6.4 for obtaining the Einstein coefficients A and B.
The transitions in this case are
$$M_0 \rightleftharpoons M^+ + e \tag{7.129}$$
where M0 is the neutral atom and M+ is its ion. In contrast to the discussion in
Sec. 6.4, here the final states form a continuum. Let N0 be the number of M0
atoms and N+ be the number of M+ ions. The number of ionization transitions to
a set of states fi is obtained from Eq. (6.60) as
$$N_{mn} = B_{mn}\,u(\omega)\,N_0 f_i, \qquad f_i = \frac{(2m)^{3/2}\,V}{4\pi^2\hbar^3}\,\varepsilon^{1/2}\,d\varepsilon \tag{7.130}$$
where fi is given in Eq. (7.29) and ħω = ε + EI, EI being the ionization energy
and ε the energy of the electron. The number of reverse reactions, i.e.
recombinations, is given by the first equation in Eq. (6.60) except that now the
expression is also proportional to the number dNe of electrons in the fi states,
$$N_{nm} = \left(B_{nm}\,u(\omega) + A_{nm}\right) N_+\, dN_e \tag{7.131}$$
Equating Nmn and Nnm gives for u(ω)
$$u(\omega) = \frac{A_{nm}}{B_{mn} f_i \dfrac{N_0}{N_+\, dN_e} - B_{nm}} \tag{7.132}$$
In analogy with Eq. (6.66), Bmn is taken to be equal to Bnm (this can be justified
by more rigorous arguments). Comparing u(ω) with the expression in Eq. (6.65)
gives
$$\frac{N_+\, dN_e}{N_0} = \exp[-(\varepsilon + E_I)/kT]\, f_i \tag{7.133}$$
Substituting for fi and integrating over dNe and ε finally gives
$$\frac{n_+ n_e}{n_0} = \left(\frac{2\pi mkT}{h^2}\right)^{3/2} \exp(-E_I/kT) \tag{7.134}$$
where n0 = N0/V, etc. This equation is known as the Saha equation (1920).
In deriving this relation, the degeneracy of states has been ignored. The
degeneracy can be incorporated by multiplying the right-hand side by g+ ge/g0,
where g is the degeneracy of the appropriate state, in particular, ge = 2
corresponding to the two spin states of the electron. As an illustration it is noted
that if M0 is the hydrogen atom in the ground state and M+ is the proton, then
g+ = 2 and g0 = 4 so that the degeneracy factor is 1. The Saha equation is very
useful in plasma physics and also in astrophysics.
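To give a feeling for the numbers, the following sketch evaluates the right-hand side of Eq. (7.134) for hydrogen (EI = 13.6 eV, degeneracy factor 1) and solves for the ionized fraction x of a pure hydrogen gas, for which n+ = ne = xn and n0 = (1 − x)n. The total density n = 10²⁰ m⁻³ and the temperatures are arbitrary illustrative values, and the function names are not from the text.

```python
import math

M_E = 9.1093837e-31  # electron mass, kg
K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J s
EV = 1.602176634e-19  # joules per electron volt


def saha_constant(temperature_k, ionization_ev, degeneracy_factor=1.0):
    """Right-hand side of Eq. (7.134), n+ n_e / n0, in m^-3."""
    thermal = (2.0 * math.pi * M_E * K_B * temperature_k / H ** 2) ** 1.5
    boltzmann = math.exp(-ionization_ev * EV / (K_B * temperature_k))
    return degeneracy_factor * thermal * boltzmann


def ionized_fraction(temperature_k, total_density, ionization_ev=13.6):
    """Solve x^2 n / (1 - x) = S for the ionized fraction x of a pure gas."""
    s = saha_constant(temperature_k, ionization_ev)
    if s == 0.0:  # exponential underflow at very low temperature
        return 0.0
    # Positive root of n x^2 + S x - S = 0, written to avoid cancellation.
    return 2.0 * s / (s + math.sqrt(s * s + 4.0 * total_density * s))


for temperature in (8000.0, 10000.0, 15000.0):
    x = ionized_fraction(temperature, 1.0e20)
    print(f"T = {temperature:.0f} K  ionized fraction = {x:.3f}")
```

The ionization sets in over a fairly narrow temperature range, well below the temperature at which kT equals EI, because of the large phase-space factor multiplying the exponential.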

Example 3
As an application of Maxwell-Boltzmann statistics, consider the ratio of para-
hydrogen to ortho-hydrogen in ordinary hydrogen at room temperature. Since
ortho-hydrogen has I = 1,
$$\frac{H(\text{para})}{H(\text{ortho})} = \frac{\sum_{J=0,2,\ldots}(2J+1)\exp[-aJ(J+1)]}{3\sum_{J=1,3,\ldots}(2J+1)\exp[-aJ(J+1)]} \tag{7.135}$$
where a = 0.00755/kT, kT being in eV. In evaluating the sums, the first term is
separated out and the remaining sum is converted into an integral by replacing
J by 2l and taking x = l(2l + 1). Therefore,
$$\frac{H(\text{para})}{H(\text{ortho})} \approx \frac{1+\int_1^\infty e^{-2ax}\,dx}{3\left[3e^{-2a}+\int_3^\infty e^{-2ax}\,dx\right]} = \frac{2a+e^{-2a}}{3\,(6a\,e^{-2a}+e^{-6a})} \tag{7.136}$$
At T = 27°C, a ≈ 0.29 and the ratio comes out to be 0.33, which is in very
good agreement with experimental observations.
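The quality of the integral approximation in Eq. (7.136) can be judged by summing the series in Eq. (7.135) directly. The sketch below does both at T = 27°C; the cutoff jmax on the sums is an assumption (terms beyond it are negligible at this temperature), and the function names are illustrative.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant, eV/K
B_ROT_EV = 0.00755    # rotational constant appearing in a = 0.00755/kT, eV


def para_ortho_exact(temperature_k, jmax=40):
    """Ratio of Eq. (7.135) evaluated by direct summation up to J < jmax."""
    a = B_ROT_EV / (K_B_EV * temperature_k)
    para = sum((2 * j + 1) * math.exp(-a * j * (j + 1)) for j in range(0, jmax, 2))
    ortho = sum((2 * j + 1) * math.exp(-a * j * (j + 1)) for j in range(1, jmax, 2))
    return para / (3.0 * ortho)


def para_ortho_approx(temperature_k):
    """Closed form of Eq. (7.136)."""
    a = B_ROT_EV / (K_B_EV * temperature_k)
    numerator = 2.0 * a + math.exp(-2.0 * a)
    denominator = 3.0 * (6.0 * a * math.exp(-2.0 * a) + math.exp(-6.0 * a))
    return numerator / denominator


T_ROOM = 300.15  # 27 degrees Celsius in kelvin
exact = para_ortho_exact(T_ROOM)
approx = para_ortho_approx(T_ROOM)
print(f"exact sum: {exact:.3f}   Eq. (7.136): {approx:.3f}")
```

Both values are close to 1/3, the high-temperature limit set by the nuclear spin weights alone.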

Example 4
Copper has an atomic weight of 63.5, a density of 8.9 g/cc, and transverse and
longitudinal sound velocities vt = 2.32 × 10³ m/s and vl = 4.76 × 10³ m/s. Its Debye temperature is
$$\theta = \frac{h\nu_m}{k}, \qquad \nu_m = \left[\frac{9N_0}{4\pi V}\left(\frac{2}{v_t^3}+\frac{1}{v_l^3}\right)^{-1}\right]^{1/3} \tag{7.137}$$
Since N0 = 6.02 × 10²³/g mol, and V = (63.5/8.9) cc,
$$\nu_m = 7.1\times10^{12}\ \text{s}^{-1} \tag{7.138}$$
and hence θ = 341 K, which is in good agreement with the experimental value of
343 K.
From the Debye temperature one can estimate the specific heat at low
temperatures from Eq. (7.66). For example, at T = 30 K
$$C_v \approx 0.16\,R, \qquad \left(\frac{T}{\theta}\approx 0.088\right) \tag{7.138a}$$
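The numbers in this example can be reproduced with a few lines of code. The sketch below evaluates Eq. (7.137) for copper and then the low-temperature specific heat, taking Eq. (7.66) to be the usual Debye T³ law Cv = (12π⁴/5) R (T/θ)³; that identification is an assumption, since Eq. (7.66) is not reproduced here, and the function names are illustrative.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro number, 1/mol


def debye_frequency(number_density, v_t, v_l):
    """Cutoff frequency nu_m of Eq. (7.137), in Hz."""
    velocity_sum = 2.0 / v_t ** 3 + 1.0 / v_l ** 3
    return (9.0 * number_density / (4.0 * math.pi) / velocity_sum) ** (1.0 / 3.0)


def debye_t3_specific_heat(temperature_k, theta_k):
    """Low-temperature Cv in units of R (assumed form of Eq. (7.66))."""
    return (12.0 * math.pi ** 4 / 5.0) * (temperature_k / theta_k) ** 3


# Copper: atomic weight 63.5, density 8.9 g/cc, so the molar volume is 63.5/8.9 cc.
molar_volume_m3 = (63.5 / 8.9) * 1.0e-6
n_cu = N_A / molar_volume_m3
nu_m = debye_frequency(n_cu, 2.32e3, 4.76e3)
theta = H * nu_m / K_B
cv_30 = debye_t3_specific_heat(30.0, theta)
print(f"nu_m = {nu_m:.2e} Hz, theta = {theta:.0f} K, Cv(30 K) = {cv_30:.2f} R")
```

The transverse velocity dominates the result because the two transverse branches enter with the inverse cube of the smaller velocity.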

Example 5
For estimating the transition temperature Tc for 4He, the molar volume is taken to be V = 27.6 cm³/mole, so that
$$\frac{N}{V} = 2.18\times10^{28}\ \text{m}^{-3} \tag{7.139}$$
On substituting this in Eq. (7.78),
$$T_c = 3.13\ \text{K} \tag{7.140}$$
It may also be noted that α2 is a small quantity for T < Tc. Using N0 given in
Eq. (7.81) gives
$$\frac{1}{\exp(\alpha_2/kT)-1} \approx N\left[1-(T/T_c)^{3/2}\right], \quad T<T_c \tag{7.141}$$
which, on expanding the exponential function, gives
$$\alpha_2 \approx \frac{kT}{N\left[1-(T/T_c)^{3/2}\right]}, \quad T<T_c \tag{7.142}$$
Thus α2 is very small for T < Tc except when T is close to Tc.
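The value of Tc can be checked numerically. The sketch below assumes that Eq. (7.78) has the standard ideal Bose gas form Tc = (h²/2πmk)(n/2.612)^(2/3), where 2.612 is the value of the Riemann zeta function ζ(3/2); Eq. (7.78) itself is not reproduced here, so this form and the function name are assumptions.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro number, 1/mol
AMU = 1.66053907e-27  # atomic mass unit, kg
ZETA_3_2 = 2.612375   # Riemann zeta(3/2)


def bose_condensation_temperature(number_density, mass_kg):
    """Ideal Bose gas Tc = (h^2 / 2 pi m k) (n / zeta(3/2))^(2/3), in K."""
    return (H ** 2 / (2.0 * math.pi * mass_kg * K_B)) * (number_density / ZETA_3_2) ** (2.0 / 3.0)


n_he = N_A / 27.6e-6   # molar volume 27.6 cm^3/mole, as in Eq. (7.139)
m_he = 4.002602 * AMU  # mass of a 4He atom
t_c = bose_condensation_temperature(n_he, m_he)
print(f"N/V = {n_he:.3e} m^-3, Tc = {t_c:.2f} K")
```

Since Tc varies as n^(2/3)/m, a modest uncertainty in the density changes the estimate only slightly.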

Example 6
The electronic properties of Cu may be deduced by assuming that each atom
contributes one free electron. The atomic weight of Cu is 63.54 and its density
is 8.96 g/cc so that
$$\frac{N}{V} \approx 8.49\times10^{28}\ \text{m}^{-3}$$
From Eq. (7.90), the Fermi energy at 0 K is
$$\varepsilon_f(0) \approx 7.0\ \text{eV} \tag{7.143}$$
The change in the Fermi energy as T increases from 0 K to 300 K [see Eq. (7.91)]
is very small, about −7.8 × 10⁻⁵ eV, and hence εf(T) can for most purposes
be taken to be a constant.
The total specific heat of Cu at low temperatures, including the electronic
and phonon contributions, is [from Eq. (7.96)]
$$C_v \approx 4.935\,R\left(\frac{kT}{7}\right) + 2.34\times10^{2}\,R\left(\frac{T}{341}\right)^3 \tag{7.144}$$
where T is in kelvin and kT is in eV. The two contributions become comparable
at T ≈ 3.21 K. One may therefore expect that the linear term will be important
for T ≲ 3 K.
The mean free path λ can be estimated from Eq. (7.97). The conductivity
for Cu is 5.82 × 10⁷/Ω m, and vf = (2εf/m)^(1/2) is about 1.57 × 10⁶ m/s. One then
obtains
$$\lambda \approx 770\ \text{Å} \tag{7.145}$$
which is in reasonable agreement with the experimentally measured value of
about 530 Å.
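Several of the numbers in this example can be reproduced as follows. The sketch assumes that Eq. (7.90) is the usual free-electron result εf = (ħ²/2m)(3π²N/V)^(2/3), and it finds the temperature at which the two terms of Eq. (7.144) are equal; the function names are illustrative, and the mean free path is not computed because Eq. (7.97) is not reproduced here.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
M_E = 9.1093837e-31     # electron mass, kg
EV = 1.602176634e-19    # joules per electron volt
K_B_EV = 8.617333e-5    # Boltzmann constant, eV/K
N_A = 6.02214076e23     # Avogadro number, 1/mol


def fermi_energy_ev(number_density):
    """Free-electron Fermi energy at 0 K (assumed form of Eq. (7.90)), in eV."""
    return HBAR ** 2 * (3.0 * math.pi ** 2 * number_density) ** (2.0 / 3.0) / (2.0 * M_E) / EV


def crossover_temperature(fermi_ev, theta_k):
    """Temperature at which the two terms of Eq. (7.144) are equal, in K."""
    electronic = (math.pi ** 2 / 2.0) * K_B_EV / fermi_ev  # coefficient of T, units of R
    lattice = (12.0 * math.pi ** 4 / 5.0) / theta_k ** 3   # coefficient of T^3, units of R
    return math.sqrt(electronic / lattice)


n_cu = N_A * 8.96 / 63.54 * 1.0e6  # one free electron per atom, in m^-3
e_f = fermi_energy_ev(n_cu)
v_f = math.sqrt(2.0 * e_f * EV / M_E)
t_cross = crossover_temperature(e_f, 341.0)
print(f"N/V = {n_cu:.3e} m^-3, eps_f = {e_f:.2f} eV")
print(f"v_f = {v_f:.3e} m/s, crossover T = {t_cross:.2f} K")
```

The crossover temperature depends only weakly on the Fermi energy but strongly on the Debye temperature, which enters as θ^(3/2).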

Example 7
A very interesting application of the Fermi-Dirac distribution is to white dwarfs
and neutron stars, regarded as a degenerate gas of electrons and neutrons
respectively. A very sketchy and approximate discussion of the main ideas is
given here.
When a star contracts, a part of its gravitational energy escapes as radiation
but the remainder is retained as kinetic energy. At equilibrium, there is the
approximate relation

$$\frac{GM^2}{R} \approx N\varepsilon_f \tag{7.146}$$
where G is the gravitational constant, M is the mass of the star, R is its radius,
N is the number of particles and εf is the Fermi kinetic energy of the particles
which is of the same order of magnitude as the average kinetic energy. For the
highly degenerate fermions, relativistic kinematics should be used and
$$\varepsilon_f = \left(p_f^2c^2 + m^2c^4\right)^{1/2} - mc^2 \tag{7.147}$$
where m is the mass of the degenerate particles. For obtaining pf, Eq. (7.89) is
written in the form
$$N = \frac{8\pi V}{3h^3}\,p_f^3 \tag{7.148}$$
It is interesting to note that while the validity of Eq. (7.89) is limited to
nonrelativistic situations, Eq. (7.148) is valid even for large velocities. Substituting
these relations in Eq. (7.146) gives
$$\frac{GM^2}{R} = N\left[\left(\frac{3Nh^3}{8\pi V}\right)^{2/3}c^2 + m^2c^4\right]^{1/2} - Nmc^2 \tag{7.149}$$
Furthermore, since N = M/mN, mN being the mass of the neutron or the
proton, and V = 4πR³/3, this equation simplifies to
$$\frac{G^2m_N^2M^2}{R^2} + \frac{2Gm_NMmc^2}{R} = \frac{1}{R^2}\left(\frac{9Mh^3c^3}{32\pi^2m_N}\right)^{2/3} \tag{7.150}$$
Since R is positive, this implies the condition
$$\left(\frac{9Mh^3c^3}{32\pi^2m_N}\right)^{2/3} > G^2m_N^2M^2 \tag{7.151}$$
or
$$M < \frac{3}{2\pi}\left(\frac{1}{m_N^2}\right)\left(\frac{hc}{2G}\right)^{3/2} < 5\,M_{\text{sun}} \tag{7.152}$$
These calculations are only order of magnitude calculations. More refined
calculations provide a somewhat lower upper bound, < 3 Msun.
This example demonstrates the importance of quantum distributions even
on an astronomical scale.
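The bound in Eq. (7.152) involves only fundamental constants and the nucleon mass, so it is easily evaluated. The sketch below uses the neutron mass for mN and a solar mass of 1.989 × 10³⁰ kg; these numerical inputs and the function name are assumptions not given in the text.

```python
import math

H = 6.62607015e-34     # Planck constant, J s
C = 2.99792458e8       # speed of light, m/s
G = 6.6743e-11         # gravitational constant, m^3 kg^-1 s^-2
M_NEUTRON = 1.67493e-27  # neutron mass, kg
M_SUN = 1.989e30       # solar mass, kg (assumed value)


def degenerate_star_mass_bound(nucleon_mass=M_NEUTRON):
    """Upper bound of Eq. (7.152): (3 / 2 pi) (1 / m_N^2) (h c / 2 G)^(3/2), in kg."""
    return (3.0 / (2.0 * math.pi)) * (H * C / (2.0 * G)) ** 1.5 / nucleon_mass ** 2


m_max = degenerate_star_mass_bound()
print(f"M_max = {m_max:.2e} kg = {m_max / M_SUN:.1f} solar masses")
```

Note that the mass m of the degenerate particles has dropped out of the bound: the same order-of-magnitude limit applies to white dwarfs and to neutron stars.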

Example 8
As noted before, when the temperature of a material is lowered, the specific
heat shows a sudden increase when it becomes a superconductor at T = Tc. This
is because Cooper pairs are formed below T = Tc and some energy goes into
breaking them. However, the specific heat of the superconducting material goes
to zero faster than T as T → 0, unlike the linear T behaviour expected for a free-
electron gas, the reason being that the specific heat of the Cooper pairs goes to
zero faster than T as T → 0.

PROBLEMS
1. Three identical particles with total energy 6ε are distributed among four
energy levels with energies ε, 2ε, 3ε and 4ε of which the second level
has a degeneracy of 3. What are the possible distributions if the particles
are (i) distinguishable, (ii) bosons and (iii) fermions? Which is the most
probable distribution in each case?
2. For a gas at temperature T, what is the most probable energy of a particle?
What is the approximate fraction of particles which have more than ten
times the average energy of the particles?
3. For producing Balmer absorption lines, a substantial number of the
hydrogen atoms should be in the first excited state. What is the percentage
of the hydrogen atoms which are in the first excited state in the sun’s
photosphere which is at a temperature of about 6000 K, taking into
account the degeneracy of the states?
4. What is the fraction of the hydrogen molecules which are in the first
excited vibrational state at 3000 K? What is the fraction in the second
excited vibrational state?
5. Show that for T >> TD = θ, the total energy of a solid is given by
$$E = 3RT_D\left[\frac{T}{T_D} - \frac{3}{8} + \frac{1}{20}\left(\frac{T}{T_D}\right)^{-1} + \ldots\right]$$
and the corresponding specific heat by
$$C_v = 3R\left[1 - \frac{1}{20}\left(\frac{T_D}{T}\right)^{2} + \ldots\right]$$
The approximation is quite good even for T ~ TD. Estimate the specific
heat of copper at T = 300 K (TD for copper is 343 K).
6. Obtain an expression for the energy of a 2-dimensional lattice. What is
the specific heat of this lattice for T → ∞ and for T → 0? The situation
is applicable for layer structures such as graphite whose specific heat at
low temperatures is proportional to T².
7. For Cu, the lattice specific heat at low temperatures has the behaviour
Cv ~ 4.6 × 10⁻⁵ T³ J/mol K. Estimate the Debye temperature for Cu.
8. Given that the Debye temperature for diamond is 1860 K, what is the
specific heat of diamond at room temperature?
9. What is the number of excited phonons and the average energy per phonon
at a given temperature? Show that the average goes to (2/3)hνmax for T → ∞ and is
proportional to T for T → 0.
10. Show that the average kinetic energy of the electrons emitted in thermionic
emission is 2kT, and that the average value of the square of velocity
perpendicular to the surface is 2kT/m.
11. Calculate the Fermi energy at 0 K, of silver (density 10.5 g/cc, atomic
weight ≈ 107.87) and sodium (density 0.97 g/cc, atomic weight ≈ 22.99),
assuming one free electron per atom. What is the ratio of εf (T)/εf (0) for
these elements at T = 300 K?
12. Show that the electronic specific heat of Cu at room temperature is very
small compared to its lattice specific heat.
13. If electrons are treated as distinguishable particles, at what temperature
would they have an average energy of 5.5 eV (i.e. the Fermi energy of
silver)?
14. Given that the Fermi energy of Cu is 7.0 eV at room temperature, what is
the number of electrons per unit volume with energy greater than
8.0 eV?
15. Given that the electrical conductivity of aluminium is 3.55 × 10⁷ Ω⁻¹ m⁻¹,
estimate its thermal conductivity at room temperature.
16. What is the minimum frequency of radiation which can break apart
Cooper pairs in niobium?
17. A microwave radiation of frequency 10¹⁰ Hz is incident on a Josephson
junction. What is the minimum voltage across the junction for which a
jump in the current is observed?
8
Solid State Physics

Structures of the Chapter


8.1 Binding forces in solids
8.2 Crystal structures
8.3 Band theory of solids
8.4 Semiconductors
8.5 Semiconductor devices
8.6 Magnetic properties
8.7 Dielectric properties
8.8 Examples
Problems

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 255
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_8
The description of the properties of solids is an important application of the
quantum principles to a collection of a large number of particles arranged in
order. Many of the solids are in a crystalline form, associated with a lattice. A
lattice is obtained by an infinite, regular repetition, in space, of basic units. In a
perfect crystal, the atoms are arranged according to a pattern in each unit, and
the pattern is repeated for all the units. Our concern will be mainly with such
crystals, e.g., sodium chloride, diamond, etc. This leaves out some important
substances, such as amorphous solids which have some short-range order but
no long-range order (e.g., glass, amorphous semiconductors etc.), and others
like wood, plastics, etc. which have huge molecules. Furthermore, most
crystalline materials are usually a collection of small crystals packed together,
and even the single crystals may have some disorder introduced by an impurity
or a missing atom. These disorders play an important role in determining their
properties.
The forces which bind the atoms in solids, and the structure of solids are
considered first. The energy levels in solids are significant and often decisive in
determining their electromagnetic properties, and lead to important applications.
After considering the band structure of energy levels in solids, in particular,
semiconductors, some aspects of magnetic and dielectric properties of solids
are discussed.

8.1 BINDING FORCES IN SOLIDS


The binding forces between the atoms in a solid are electromagnetic in origin.
It is the valence electrons of the constituent atoms and the quantum effects
which are important in determining the details of these forces. The binding
forces are similar to those in molecules though the many-particle effects bring
in some new aspects. We briefly discuss the different types of bonds in solids.
In a real solid, more than one of these bonds may contribute to the binding of
the atoms.

Ionic Bonds
As in the case of molecules, these bonds are formed when it is energetically
favourable for a valence electron to be transferred from one atom to another
resulting in a net electrostatic attraction between the ions. However, in the solid
there are many more ions and the electrostatic interaction of an ion with all the
other ions should be taken into account. This is done by writing the electrostatic
energy per ion pair as
\[ V_{el} = -\frac{\alpha e^2}{4\pi\varepsilon_0 R} \tag{8.1} \]
where R is the distance between the nearest neighbours and α is called the
Madelung constant. For example, in the case of NaCl, KCl, etc. (but not CsCl),
the crystal structure is what is called face centred cubic structure (Fig. 8.1) with
an ion pair associated with each lattice point. The electrostatic energy per ion
pair is obtained by writing it as a sum of terms representing the interaction of a
given ion with the nearest neighbours, the next nearest neighbours, etc.

Fig. 8.1 The KCl crystal structure (fcc) with K+ denoted by circles and Cl– by crosses.

\[ V_{el} = -\frac{e^2}{4\pi\varepsilon_0 R}\left(6 - \frac{12}{2^{1/2}} + \frac{8}{3^{1/2}} - 3 + \ldots\right) \approx -\frac{e^2}{4\pi\varepsilon_0 R}\,(1.75) \tag{8.2} \]
where R is the distance between the nearest neighbours and the Madelung
constant turns out to be α ≈ 1.75.
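The value α ≈ 1.75 can be checked numerically. The sketch below is not the shell-by-shell series of Eq. (8.2), which converges poorly if simply truncated; it swaps in Evjen's method, which sums over nearly neutral cubes by giving ions on the faces, edges and corners of the cube weights of 1/2, 1/4 and 1/8. The function name and the cube half-width of 8 are illustrative assumptions.

```python
import itertools
import math


def madelung_nacl_evjen(n):
    """Evjen-weighted Madelung sum for the NaCl structure.

    The sum runs over a cube of half-width n, with distances in units of
    the nearest-neighbour separation R.
    """
    total = 0.0
    for i, j, k in itertools.product(range(-n, n + 1), repeat=3):
        if i == j == k == 0:
            continue  # skip the reference ion itself
        # Ions on a face, edge or corner of the cube count 1/2, 1/4, 1/8.
        weight = 1.0
        for c in (i, j, k):
            if abs(c) == n:
                weight *= 0.5
        # Sites with odd i+j+k carry the opposite charge and add to the binding.
        sign = 1.0 if (i + j + k) % 2 else -1.0
        total += sign * weight / math.sqrt(i * i + j * j + k * k)
    return total


alpha = madelung_nacl_evjen(8)
print(f"Madelung constant (Evjen cube, n = 8): {alpha:.4f}")
```

The smallest cube (n = 1) reproduces the first three terms of Eq. (8.2) with the Evjen weights, 6/2 − 12/(4·2^{1/2}) + 8/(8·3^{1/2}), and larger cubes settle close to the quoted value.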
To get an idea about the details of the ionic bonds, consider solid KCl. As
in the case of the KCl molecule, here also it takes an energy of 0.54 eV to
transfer one electron from the K atom to the Cl atom (see Sec. 5.6). Representing
the van der Waals repulsion, which becomes important when the electronic
wave functions overlap, by $b/R^n$, the energy per ion pair is given by

\[ E_{pair} = 0.54 - \frac{14.4\,\alpha}{R} + \frac{b}{R^n} \tag{8.3} \]
where α ≈ 1.75, R is in Å and the energy is in eV (the energy in Eq. (8.3) is with
respect to the energy of the isolated neutral atoms, taken to be zero). The energy
per pair, Epair, is known as cohesive energy per ion pair. The equilibrium condition
for the ions is
\[ \frac{dE_{pair}}{dR} = 0 \tag{8.4} \]
which implies
\[ E_{pair} = 0.54 - \frac{14.4\,\alpha}{R_0}\left(1 - \frac{1}{n}\right) \tag{8.5} \]
R0 being the equilibrium separation. From a knowledge of R0 ≈ 3.14 Å and
Epair ≈ – 6.67 eV, one gets n ≈ 10 which is in approximate agreement with what
is expected from a detailed theoretical analysis.
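The estimate n ≈ 10 follows from inverting Eq. (8.5). The sketch below uses only the numbers quoted in the text; the variable names are illustrative, and the repulsion strength b is obtained from the equilibrium condition of Eq. (8.4), which gives b = 14.4 α R0^(n−1)/n.

```python
ALPHA = 1.75        # Madelung constant used in the text
R0 = 3.14           # equilibrium separation in Å
E_PAIR = -6.67      # cohesive energy per ion pair in eV
E_TRANSFER = 0.54   # energy to transfer one electron from K to Cl, in eV

# Coulomb term 14.4*alpha/R0 of Eq. (8.5), in eV.
coulomb = 14.4 * ALPHA / R0

# Eq. (8.5): E_pair = 0.54 - coulomb*(1 - 1/n), solved for n.
n = 1.0 / (1.0 - (E_TRANSFER - E_PAIR) / coulomb)

# Eq. (8.4) applied to Eq. (8.3): 14.4*alpha/R0**2 = n*b/R0**(n+1).
b = 14.4 * ALPHA * R0 ** (n - 1) / n

print(f"n = {n:.2f}, b = {b:.3g} eV·Å^n")
```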
The ionic bond is quite strong, the binding per pair of atoms being about
5 eV. This leads to rather high melting temperatures for ionic crystals, e.g. 801°C
for NaCl.

Covalent Bonds
Covalent bonds, discussed earlier in the context of molecules (see Sec. 5.6), are
important in the formation of solids also. These bonds arise when two atoms
find it energetically favourable to share their electrons (this is particularly true
for identical atoms such as two Cl atoms). In such cases the shared electrons are
found preferentially between the atoms, with opposite spins as required by Pauli’s
principle, providing each atom with a complete shell. These bonds are especially
important in group IV elements with half-filled shells, e.g. C, Si, Ge, etc. which
can accommodate eight electrons in their outermost shells. These atoms can
form four covalent bonds which are directional. Every atom may be considered
to be at the centre of a tetrahedron, sharing an electron with each of the four
nearest neighbours which are at the four corners of the tetrahedron. Since a
covalent bond involves the sharing of two electrons, one from each atom, this
allows the atoms to have closed shells with eight electrons.
Covalent bonds are usually quite strong (binding energy is a few eV per
bond) and directional. As a consequence, crystals with covalent bonds are hard
but brittle, and have a high melting point. This is especially so for diamond,
which has a cohesive energy of about 7 eV per atom. It is the hardest material
known, and has a melting point of more than 3550°C.
If the bonds are between different types of atoms, the electrons may spend
more time with one of the atoms, so that the bonds are partially ionic and partially
covalent. An important example of this is ZnS (zinc blende), which also has a
tetrahedral structure but with different atoms at the centre and the corners, e.g.
Zn at the centre and S at the corners.

Metallic Bonds
As was pointed out in the discussion of the free-electron theory of metals, the
valence electrons in metallic atoms, being loosely bound, escape from the atom.
These essentially free electrons provide a medium of negative charge which
helps to bind the positive ions. That this leads to a lower energy state can be
seen from the following arguments.
An electron in an isolated atom is confined to a small volume around the
nucleus. This confinement gives rise to an uncertainty in the momentum,
$\Delta p \sim \hbar/r$, where r is the radius of the atom. Consequently, the electron has a
fairly substantial amount of kinetic energy, of the order of several eV. However,
in the crystalline, metallic state, the electrons are essentially free to be anywhere
in the entire crystal. As a result, there is a considerable reduction in their kinetic
energy. This is the source of metallic bonding.
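The size of this effect can be gauged from the uncertainty estimate: with Δp ~ ħ/r, the kinetic energy is of the order of ħ²/(2mr²). The sketch below uses ħ²/2m ≈ 3.81 eV·Å² for an electron; the two confinement lengths (1 Å for an atom, 3 Å for a more spread-out state) are illustrative assumptions, not values from the text.

```python
HBAR2_OVER_2M = 3.81  # ħ²/2m for an electron, in eV·Å²


def confinement_energy(r_angstrom):
    """Order-of-magnitude kinetic energy (Δp)²/2m with Δp ~ ħ/r, in eV."""
    return HBAR2_OVER_2M / r_angstrom ** 2


atom = confinement_energy(1.0)    # electron confined to about 1 Å
spread = confinement_energy(3.0)  # same electron spread over about 3 Å

print(f"r = 1 Å: {atom:.2f} eV,  r = 3 Å: {spread:.2f} eV")
```

The estimate falls as 1/r², so even a modest increase in the region available to the electron removes most of the confinement energy.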
The bond between two metallic atoms is somewhat weaker than ionic or
covalent bonds. This leads to relatively low melting points, for example 63°C
for K. However, the cohesive energy of metals is fairly large since each valence
electron interacts with several ions. The metallic bonds are not directional which
allows the planes of atoms to slide over each other quite easily. Hence, metals
are found to be ductile and malleable rather than brittle. The existence of
essentially free electrons gives rise to high electrical and thermal conductivity
for metals.

Van der Waals Bonds


When two neutral atoms approach each other, there is an attractive force between
them because of the induced electric dipole moments (see Sec. 5.6). These forces,
though weak, are important for atoms which do not form ionic, covalent or
metallic bonds with each other. They lead to the solidification of inert gases He,
Ne, Ar, Kr, Xe, and some organic molecules, such as methane. The binding due
to van der Waals forces is usually weak, which results in a low melting point
for these solids, e.g. the melting point of Ar is −189°C. They are also soft,
electrically insulating and generally insoluble.

Hydrogen Bonds
The hydrogen atom has only one electron and would normally form a covalent
bond with only one other atom. However, if the other atom is strongly
electronegative, the electron may be transferred to the electronegative atom.
The remaining proton, being small in size (about $10^{-15}$ m compared to the usual
atomic size of $10^{-10}$ m), can bind only two neighbouring negative ions (the two
spheres would be glued together by the proton in-between). This gives rise to
what is known as the hydrogen bond which connects only two atoms.
The hydrogen bond is important in the formation of ice and in the
polymerization of hydrogen fluoride.
It may be noted that in a real crystal, the actual binding is due to a mixture
of different types of bonds though one of them may be predominant.

8.2 CRYSTAL STRUCTURES


It is convenient to discuss the structures of crystals in terms of space lattices. A
space lattice is an array of lines, which divides the space into identical volumes.
These volumes fill the space completely and are known as unit cells. The
intersections of the lines are the lattice points and each lattice point is usually
associated with an atom or a group of atoms. There may also be atoms at
the centres of the cells or the faces of the cells. Obviously, the choice of a unit
cell is not unique. It is usually dictated by convenience. For example, in the two
dimensional space lattice shown in Fig. 8.2, the unit cell may be taken to be
ABCD or ABEC though it may be more convenient to choose ABCD. The
essential requirement is that repetition (or equivalently, translations) of the unit
cell should cover the entire space lattice. A primitive cell is the unit cell with the
smallest volume. The choice of a primitive cell is also not unique. It is usually
chosen by convenience or tradition.

Fig. 8.2 A two-dimensional space lattice. The unit cell may be taken to be
ABCD or ABEC. With A as the origin, the lattice point E is
represented by the lattice vector 2a + b.
The important characteristic of a space lattice is that every lattice point has
an identical surrounding. This severely constrains the possible space lattices.
It was shown by Bravais (1848) that there are only 14 space lattices (Fig. 8.3). It
is important to appreciate that while there are only 14 space lattices, there are a
very large number of crystal structures since different patterns of atoms can be
associated with a given lattice point.
A few details of some of the more common lattices are discussed here. It
may be noted that the centres of atoms in various lattices are located at the
corners or the centres indicated, and the atoms in some sense can be regarded as
touching the nearest neighbours. The number of nearest neighbours is called
the coordination number, and gives an indication of the closeness of the packing
of the atoms. Another quantity of interest is the packing fraction, which is the
fraction of the available volume occupied by the atoms. With the assumption
that the atoms are hard spheres and that the nearest neighbours touch each other,
we can deduce the packing fraction for a given crystal pattern.
Simple cube: The simple cubic structure is perhaps the simplest possible
form, though polonium is the only element which has this structure. The reason
for this is that the cubic structure is rather an open form with a considerable
amount of empty space.

Fig. 8.3 The 14 Bravais or space lattices: cubic P, I and F; tetragonal P and I;
orthorhombic P, S, I and F; monoclinic P and C; triclinic P; trigonal R;
trigonal and hexagonal P. P cells are primitive, I cells are body-centred,
F cells are face-centred, C cells are base-centred, and R cells are rhombohedral.
The coordination number for the simple cubic structure is 6. For this
structure, each atom at the corner is shared by eight cubes so that the volume
occupied by the atoms in a cube is $\frac{4}{3}\pi R^3$ (R is the radius of the atom).
Since the length of the cube is 2R, the packing fraction f is

\[ f = \frac{\pi}{6} \approx 52\% \tag{8.6} \]
Body centred cube: A closer packing is provided by the body centered cubic
(bcc) lattice. Many elements crystallize in this form, e.g., Fe, Cr, Mo, W, Li, K,
Na, Rb, Cs, etc. The coordination number for these structures is 8. There are
two atoms which belong to each cell so that the volume occupied is $\frac{8\pi}{3}R^3$.
Since the diagonal of the cube is 4R, the volume of the cube is $(4R/3^{1/2})^3$ and the
packing fraction f is

\[ f = \frac{3^{1/2}\pi}{8} \approx 68\% \tag{8.7} \]
It should be noted that some binary compounds such as CsCl, have a body
centred cubic structure, with one type of atoms (e.g., Cl) at the centres, and the
other type of atoms (e.g., Cs) at the corners. However, since these atoms are different,
the lattice is a simple cubic lattice with a pair of Cs and Cl atoms associated
with each corner.
Close-packed structures: The most efficient packing of atoms in a plane is
the one shown in Fig. 8.4(a), with each atom touching six atoms in the plane.
There are two ways in which the successive layers can be placed over one
another. The second layer can have atoms in positions marked as B (or
equivalently C). The third layer may have atoms in positions A leading to the
hexagonal close-packed (hcp) structure [Fig. 8.4(b)]. The positions of the atoms
in the successive layers of this structure may be indicated by ABABAB... .
Examples of elements which have this structure are Cd, Mg, Ti, Zn, etc. On the
other hand, the third layer may have atoms in positions C leading to the face-
centred cubic (fcc) structure. The positions of the atoms in the successive layers
may be indicated by ABCABC... . The close-packed layers of the face-centred
cubic structure are in the body diagonal planes [Fig. 8.4(c)]. Some examples of
elements that have the face-centred cubic structure are Cu, Ag, Au, Al, Pd, and
Pt. An interesting example is that of NaCl, KCl, etc., which consist of
two interpenetrating fcc sublattices, one of them made up of Na⁺ or K⁺ ions and
the other of Cl⁻ ions, as shown in Fig. 8.1.
Both the close-packed structures, the hexagonal close-packed structure and
the face-centred cubic structure have a coordination number of 12 and have the
highest packing fraction. The packing fraction for the fcc structure is obtained
by noting that four atoms belong to each cell and occupy a volume of $\frac{16\pi}{3}R^3$.
The length of a side of the cube is $2^{3/2}R$ so that the packing fraction f is

\[ f = \frac{\pi}{3(2^{1/2})} \approx 74\% \tag{8.8} \]
For the hcp structure, six atoms belong to each cell and occupy a volume of
$8\pi R^3$. The area of the base of the hexagon is $6(3^{1/2})R^2$ while the height of the
hexagon is $4(2/3)^{1/2}R$, from which the packing fraction is again found to be

\[ f = \frac{\pi}{3(2^{1/2})} \approx 74\% \tag{8.9} \]

Fig. 8.4 Close-packed structures. (a) different layers, with the sites labelled
A, B and C, (b) hexagonal close-packed, (c) body-diagonal plane of face centred cube.
Equal packing fractions for the hcp and fcc structures are to be expected
since both of them have close-packing. In each specific case, that structure
which has a lower total energy is chosen.
The diamond structure: The diamond structure is of considerable importance
since many practically important elements belonging to group IV, such as C, Si
and Ge, crystallize in this form. It may be regarded as a superposition of an fcc
lattice and another fcc lattice obtained by translating the first lattice along the
body diagonal by a quarter of the diagonal [see Fig. 8.5(a)]. The structure however
is not close-packed since a corner atom of one face-centred cube is not in contact
with an atom in a face with a corner atom of the second cube. A more useful
concept is to regard each atom of one cube (e.g., a corner of the second cube) as
being at the centre of a tetrahedron formed by the four nearest neighbours which
belong to the other cube [see Fig. 8.5(b)]. This also brings out the fact that each
atom in this structure usually forms four covalent bonds with its four nearest
neighbours. Some binary compounds such as ZnS also crystallize in the diamond
structure, with Zn atoms forming one fcc and S forming the other fcc. In these
cases, the bonds are partly ionic and partly covalent. Other compounds with
this structure are SiC, CdS, InSb, GaP, etc.

(a) (b)

Fig. 8.5 Diamond structure, (a) circles represent an fcc lattice and
crosses represent the fcc lattice obtained by translating the
first lattice along the body diagonal by 1/4 the diagonal,
(b) a typical tetrahedron in the diamond structure.

The coordination number for the diamond structure is 4. The packing fraction
may be obtained by noting that eight atoms belong to the cell (4 are inside,
$\frac{1}{2} \times 6$ in the faces and $\frac{1}{8} \times 8$ at the corners) and that the
body diagonal is equal to 8R. This leads to a packing fraction f of

\[ f = \frac{32\pi}{3} R^3 \Big/ \left(\frac{8R}{3^{1/2}}\right)^3 = \frac{3^{1/2}\pi}{16} \approx 0.34 \tag{8.10} \]
which means that the packing in diamond structure is rather loose.
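The packing fractions of Eqs. (8.6) to (8.10) all follow the same recipe: count the atoms belonging to the cell, multiply by the volume of a hard sphere, and divide by the cell volume expressed through the sphere radius R. A minimal sketch collecting the cell data quoted above in one place (the function and dictionary names are illustrative):

```python
import math

SPHERE = 4.0 / 3.0 * math.pi  # volume of a hard sphere of unit radius


def packing_fractions():
    """Packing fraction for each structure discussed in the text (R = 1)."""
    R = 1.0
    cells = {
        # name: (atoms per cell, cell volume in units of R**3)
        "sc": (1, (2 * R) ** 3),
        "bcc": (2, (4 * R / math.sqrt(3)) ** 3),
        "fcc": (4, (2 ** 1.5 * R) ** 3),
        # hcp: base area 6*sqrt(3)*R**2 times height 4*sqrt(2/3)*R
        "hcp": (6, 6 * math.sqrt(3) * R ** 2 * 4 * math.sqrt(2.0 / 3.0) * R),
        "diamond": (8, (8 * R / math.sqrt(3)) ** 3),
    }
    return {name: n * SPHERE * R ** 3 / volume
            for name, (n, volume) in cells.items()}


for name, f in packing_fractions().items():
    print(f"{name:8s} f = {f:.3f}")
```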

Directions and Planes in Crystals


Let the sides of the unit cell, which is taken to be a parallelepiped, be designated
by the vectors a, b and c. Let these vectors intersect at a corner A which is taken
to be the origin. The position vector of any lattice point may be represented by
a linear combination of a, b and c, as
\[ \mathbf{r} = u'\mathbf{a} + v'\mathbf{b} + w'\mathbf{c} \tag{8.11} \]
where u′, v′ and w′ are integers. For example, the point E in the two dimensional
lattice in Fig. 8.2 is represented by 2a + b. The direction of the vector is then
represented by a set of numbers [u, v, w] which is obtained by multiplying u′, v′
and w′ by the lowest common denominator. If some of the components are
negative, this is indicated by a bar over the number, e.g., if u is negative, the
direction is given by $[\overline{|u|}, v, w]$.
A crystal plane is characterized by the intercepts u′a, v′b and w′c along
the three axes. The reciprocals 1/u′, 1/v′ and 1/w′ are reduced to the simplest
integers h, k, l by multiplying the reciprocals by their lowest common
denominator. The plane is then denoted by the set of numbers (h, k, l), called
the Miller indices. If some of the intercepts are along the negative axes, this is
indicated by a bar over the corresponding index, e.g., if u′ is negative, the Miller
indices are $(\overline{|h|}, k, l)$. It is clear from this that parallel planes (with the
corresponding intercepts having the same sign) are represented by the same
Miller indices. When an intercept is at infinity, the corresponding Miller index
is zero. For example, (1, 0, 0) represents a plane parallel to the yz plane, with a
positive intercept along the x-axis, while a plane with intercepts −2a, 3b and
parallel to the z-axis is denoted by the Miller indices $(\bar{3}, 2, 0)$.
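The reduction from intercepts to Miller indices described above is mechanical and can be sketched in a few lines. In this illustration the intercepts are given in units of a, b and c; using `None` for an intercept at infinity, and returning negative integers instead of barred indices, are conventions assumed here and not taken from the text.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def miller_indices(intercepts):
    """Miller indices (h, k, l) from intercepts (u', v', w') in units of a, b, c.

    An intercept of None means the plane is parallel to that axis.
    """
    # Reciprocals of the intercepts; an intercept at infinity gives zero.
    recips = [Fraction(0) if x is None else 1 / Fraction(x) for x in intercepts]
    # Lowest common denominator of the reciprocals.
    lcd = reduce(lambda p, q: p * q // gcd(p, q),
                 (r.denominator for r in recips), 1)
    ints = [int(r * lcd) for r in recips]
    # Remove any remaining common factor to reach the simplest integers.
    common = reduce(gcd, (abs(i) for i in ints))
    return tuple(i // common for i in ints)


print(miller_indices((-2, 3, None)))
print(miller_indices((1, None, None)))
```

The first call is the example in the text, a plane with intercepts −2a and 3b that is parallel to the z-axis; the second is the (1, 0, 0) plane.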

Diffraction by a Lattice
Crystal structures are determined experimentally by analysing the diffraction
of x-rays by a crystal. Since x-rays have a wavelength of about 1 Å, the atoms in
a crystal serve as a grating and produce diffraction maxima. The measurement
of the positions and intensities of these maxima gives information about the
crystal structure. The conditions for the maxima were obtained in terms of the
Bragg condition in Eq. (2.38) by considering the superposition of scattering
from different planes. This is an over-simplified picture. Here a more rigorous
and complete derivation of the maxima is presented by looking at the
superposition of scattered waves from different atoms.
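Consider a radiation described by $B \exp[i(\mathbf{k}_0 \cdot \mathbf{r} - \omega t)]$, $|\mathbf{k}_0| = 2\pi/\lambda$, incident
on a lattice with atoms at points
\[ \mathbf{r}_n = u\mathbf{a} + v\mathbf{b} + w\mathbf{c}, \qquad u = 0, \ldots, n_1;\; v = 0, \ldots, n_2;\; w = 0, \ldots, n_3 \tag{8.12} \]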
The scattered wave will have the same wavelength as the incident beam,
but will propagate in some other direction. Since the scattered beam has the
same phase as the incident beam at the point of scattering, it is given by
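\[
\begin{aligned}
A &= \sum_n B' \exp[i\mathbf{k}_0 \cdot \mathbf{r}_n] \exp[i\mathbf{k} \cdot (\mathbf{r} - \mathbf{r}_n)], \qquad |\mathbf{k}| = |\mathbf{k}_0| = \frac{2\pi}{\lambda} \\
  &= B' \exp[i\mathbf{k} \cdot \mathbf{r}] \sum_{u,v,w} \exp[i\mathbf{q} \cdot (u\mathbf{a} + v\mathbf{b} + w\mathbf{c})], \qquad \mathbf{q} = \mathbf{k}_0 - \mathbf{k}
\end{aligned}
\tag{8.13}
\]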
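The summation leads to

\[ |A| = B' \left| \frac{\sin[\frac{1}{2}\mathbf{a} \cdot \mathbf{q}\,(n_1 + 1)]}{\sin(\frac{1}{2}\mathbf{a} \cdot \mathbf{q})} \right| \left| \frac{\sin[\frac{1}{2}\mathbf{b} \cdot \mathbf{q}\,(n_2 + 1)]}{\sin(\frac{1}{2}\mathbf{b} \cdot \mathbf{q})} \right| \left| \frac{\sin[\frac{1}{2}\mathbf{c} \cdot \mathbf{q}\,(n_3 + 1)]}{\sin(\frac{1}{2}\mathbf{c} \cdot \mathbf{q})} \right| \tag{8.14} \]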
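This amplitude is large, proportional to $B'(n_1 + 1)(n_2 + 1)(n_3 + 1)$, if

\[ \tfrac{1}{2}\mathbf{a} \cdot \mathbf{q} = l_1\pi, \qquad \tfrac{1}{2}\mathbf{b} \cdot \mathbf{q} = l_2\pi, \qquad \tfrac{1}{2}\mathbf{c} \cdot \mathbf{q} = l_3\pi \tag{8.15} \]

where l1, l2 and l3 are integers not all simultaneously equal to zero.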
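Each factor in Eq. (8.14) is the modulus of a geometric series, so the closed form and the peak height n + 1 are easy to check numerically for a single chain of atoms. In the sketch below, `phase` stands for a·q and the chain has n + 1 atoms; the function names and the sample values (n = 20, phase = 0.37) are illustrative assumptions.

```python
import cmath
import math


def chain_amplitude(phase, n):
    """|sum over u = 0..n of exp(i*u*phase)|, summed term by term."""
    return abs(sum(cmath.exp(1j * u * phase) for u in range(n + 1)))


def closed_form(phase, n):
    """One factor of Eq. (8.14); undefined where sin(phase/2) vanishes."""
    return abs(math.sin(0.5 * phase * (n + 1)) / math.sin(0.5 * phase))


n = 20
generic = 0.37                            # a generic value of a.q
peak = chain_amplitude(2 * math.pi, n)    # (1/2) a.q = pi, i.e. l1 = 1

print(chain_amplitude(generic, n), closed_form(generic, n))
print(peak)
```

At a generic phase the terms largely cancel, while at the condition of Eq. (8.15) all n + 1 terms add in phase.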
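The above conditions for maxima are expressed conveniently in terms of
what is known as the reciprocal lattice. The reciprocal lattice is defined by a
unit cell with basic vectors
$2\pi \dfrac{\mathbf{b} \times \mathbf{c}}{\mathbf{a} \cdot \mathbf{b} \times \mathbf{c}}$,
$2\pi \dfrac{\mathbf{c} \times \mathbf{a}}{\mathbf{b} \cdot \mathbf{c} \times \mathbf{a}}$,
$2\pi \dfrac{\mathbf{a} \times \mathbf{b}}{\mathbf{c} \cdot \mathbf{a} \times \mathbf{b}}$. Then,
Eq. (8.15) implies that a maximum is observed if

\[ \mathbf{q} = \mathbf{k}_0 - \mathbf{k} = l_1 \left( 2\pi \frac{\mathbf{b} \times \mathbf{c}}{\mathbf{a} \cdot \mathbf{b} \times \mathbf{c}} \right) + l_2 \left( 2\pi \frac{\mathbf{c} \times \mathbf{a}}{\mathbf{b} \cdot \mathbf{c} \times \mathbf{a}} \right) + l_3 \left( 2\pi \frac{\mathbf{a} \times \mathbf{b}}{\mathbf{c} \cdot \mathbf{a} \times \mathbf{b}} \right), \qquad |\mathbf{k}| = |\mathbf{k}_0| = \frac{2\pi}{\lambda} \tag{8.16} \]

where l1, l2 and l3 are integers but $|\mathbf{k}_0 - \mathbf{k}| \neq 0$. Thus, the condition for a diffraction
maximum is that the associated q vector is a reciprocal lattice vector.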
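That a reciprocal lattice vector automatically satisfies Eq. (8.15) can be verified directly. The sketch below builds the three basic vectors of the reciprocal lattice for a non-orthogonal cell and checks that (1/2) a·q, (1/2) b·q and (1/2) c·q come out as l1π, l2π and l3π. The cell vectors and the integers (1, −2, 3) are hypothetical values chosen only for illustration.

```python
import math


def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(u, v))


def reciprocal_vectors(a, b, c):
    """Basic vectors 2*pi*(b x c)/(a . b x c) and cyclic permutations."""
    scale = 2 * math.pi / dot(a, cross(b, c))  # same cell volume in all three
    return tuple(tuple(scale * x for x in cross(p, q))
                 for p, q in ((b, c), (c, a), (a, b)))


# Hypothetical, deliberately non-orthogonal cell (lengths in Å).
a, b, c = (2.0, 0.0, 0.0), (0.5, 3.0, 0.0), (0.3, 0.4, 4.0)
ra, rb, rc = reciprocal_vectors(a, b, c)

orders = (1, -2, 3)  # the integers l1, l2, l3
q = tuple(orders[0] * x + orders[1] * y + orders[2] * z
          for x, y, z in zip(ra, rb, rc))

# (1/2) a.q, (1/2) b.q, (1/2) c.q in units of pi.
half_phases = [0.5 * dot(v, q) / math.pi for v in (a, b, c)]
print(half_phases)
```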
It is seen that q is perpendicular to $\mathbf{a}/l_1 - \mathbf{b}/l_2$ and $\mathbf{b}/l_2 - \mathbf{c}/l_3$. Hence it is
perpendicular to a plane in the lattice space which has intercepts $\mathbf{a}/l_1$, $\mathbf{b}/l_2$, $\mathbf{c}/l_3$
along the three axes. This plane has Miller indices $(l_1/n, l_2/n, l_3/n)$ where n is the
highest common factor of l1, l2, l3. For the situation in which a, b, c are orthogonal
the distance between these planes is (see Example 2 of Sec. 8.8)

\[ d = \left( \frac{l_1^2}{n^2 a^2} + \frac{l_2^2}{n^2 b^2} + \frac{l_3^2}{n^2 c^2} \right)^{-1/2} \tag{8.17} \]
Also the magnitude of q is
\[ |\mathbf{q}| = 2|\mathbf{k}| \sin\theta = 2\pi \left( \frac{l_1^2}{a^2} + \frac{l_2^2}{b^2} + \frac{l_3^2}{c^2} \right)^{1/2} \tag{8.18} \]
where 2θ is the angle between k0 and k. Using $|\mathbf{k}| = 2\pi/\lambda$ gives the Bragg
condition
\[ 2d \sin\theta = n\lambda \tag{8.19} \]
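Equations (8.17) to (8.19) can be combined into a small calculation of the Bragg angle for an orthogonal cell. In this sketch the cell edge of 5.64 Å (roughly that of NaCl) and the wavelength of 1.54 Å (roughly Cu Kα radiation) are illustrative inputs, not values from the text, and the function name is hypothetical.

```python
import math
from functools import reduce


def bragg_angle(orders, cell, wavelength):
    """Return (n, d, theta in degrees) for integers (l1, l2, l3).

    `cell` holds the orthogonal cell edges (a, b, c) in the same length
    unit as `wavelength`. Returns None if no reflection is possible.
    """
    n = reduce(math.gcd, (abs(l) for l in orders))  # highest common factor
    # Eq. (8.17): spacing of the planes with Miller indices (l1/n, l2/n, l3/n).
    d = sum((l / (n * edge)) ** 2 for l, edge in zip(orders, cell)) ** -0.5
    sin_theta = n * wavelength / (2 * d)            # Eq. (8.19)
    if sin_theta > 1:
        return None
    return n, d, math.degrees(math.asin(sin_theta))


cell = (5.64, 5.64, 5.64)   # Å, illustrative cubic cell
wavelength = 1.54           # Å, illustrative x-ray wavelength
n, d, theta = bragg_angle((2, 0, 0), cell, wavelength)
print(f"n = {n}, d = {d:.2f} Å, theta = {theta:.2f} degrees")
```

In the convention of the text, the integers (2, 0, 0) describe the second-order (n = 2) reflection from the (1, 0, 0) planes, whose spacing is the full cell edge.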
There are two general diffraction methods used for studying crystal structure.
In the method of Laue, a single crystal is illuminated by x-radiation with a
continuous spectrum. Since the Bragg condition or equivalently Eq. (8.16) will
be satisfied for some wavelengths, diffraction maxima will appear on the
photographic plate. This method is useful for determining the orientation of
crystal planes. In the second method, known as the powder method, many small
crystals bound together in a wire or rod are illuminated by an x-ray of a given
frequency. Some of the small crystals will have the correct orientation to produce
diffraction maxima. The diffraction angles and the intensities of the maxima
provide information about the cell structure and dimensions. It may be noted
that structures can also be deduced from the diffraction of neutron and electron
beams (see Sec. 2.4). The considerations of x-ray diffraction can be extended to
these cases if for wavelength we use the corresponding de Broglie wavelength
of the particle.

8.3 BAND THEORY OF SOLIDS


While the free-electron theory of metals is able to explain several electromagnetic
and thermal properties of metals, its validity is restricted. It is not able to explain
the fact that the Hall coefficient (which essentially gives the sign of the charge
carriers, see Example 3 in Sec. 8.8) of alkaline earth metals (Be, Mg, Ca, Sr,
etc. which are divalent metals) is positive. Nor can it explain the electromagnetic
properties of insulators and semiconductors, e.g. increase in conductivity with
temperature of semiconductors, insulators becoming good conductors in the
presence of electromagnetic radiation. The description of these properties
requires a more detailed knowledge of the energy levels of the electrons in
solids. In particular, the interaction of the electrons with the ions, which gives
rise to energy bands, should be included. Energy bands are central to the
understanding of the properties of solids. They can be discussed in terms of
(i) tight-binding approximation, (ii) nearly free electron approximation. These
are applicable in different situations but lead to qualitatively similar results.
Brief descriptions of these approximations are given here.

Tight Binding Approximation


In this approximation, the crystal is formed by bringing together N atoms. When
the atoms are separated by a large distance, their energy levels are the same, i.e.
degenerate. As they come close together, these levels are perturbed by
interactions between the atoms and the degeneracy is removed. Thus, there will
be 2N levels for an s state and 6N levels for a p state (two spin states for each
electron). Since N is very large, one gets bands of closely-spaced energy levels,
called energy bands, which in general are separated by some energy gaps. The
spread for the inner-lying levels will be small since they are not greatly influenced
by the presence of other atoms, and it will be larger for the outer levels [see Fig.
8.6(a)]. Since the spread is determined by the structure of each atom and the
interatomic distance, the density of levels in a band will increase with the number
of atoms (keeping the interatomic distance constant). It may happen that some
outer-lying bands overlap and this will have a profound influence on the
properties of the crystal.

Fig. 8.6 (a) Energy bands in the tight binding approximation (energy E of the
3s and 3p levels against interatomic separation in Å), (b) Brillouin
zones and energy bands in the nearly free electron approximation
(E in eV against k from −2π/a to 2π/a, showing the energy gap and the
1st and 2nd Brillouin zones). a = 2 Å, V1 = 1 eV.
The tight-binding approximation is reliable mainly for narrow bands of
low-lying energy levels, for which the effect of the interatomic interaction is
small.

Nearly Free Electron Approximation


Alternatively, the electrons may be regarded as moving in a periodic potential
which describes their interaction with the ions. Here, the interaction between
the electrons is neglected. For simplicity, the influence of such an interaction is
considered in the one-dimensional lattice.
A one-dimensional periodic potential with period a can be written as
\[ V(x) = \frac{1}{2} \sum_m V_m \exp(2\pi m i x/a), \qquad m = 0, \pm 1, \ldots \tag{8.20} \]
with $V_m = V_{-m}$. The effect of only the $|m| = 1$ term is considered, with
\[ V_1(x) = V_1 \cos(2\pi x/a) \tag{8.21} \]
The wave function will satisfy the equation

\[ -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} \psi(x) + V_1 \cos(2\pi x/a)\, \psi(x) = E\psi(x) \tag{8.22} \]
The general form of the solutions of such an equation is given by the Bloch
theorem which states that ψ(x) can be written in the form
\[ \psi(x) = \exp(ikx)\, u_k(x) \tag{8.23} \]
where $u_k(x)$ is periodic with period a. If the number of lattice points is N, a
periodic boundary condition is imposed on ψ that $\psi_k(Na) = \psi_k(0)$. This is
equivalent to closing a linear chain and implies that the allowed values of k are

\[ k = \frac{2\pi n}{Na}, \qquad n = 0, \pm 1, \ldots \tag{8.24} \]
These allowed values of k are conveniently divided as follows, into what
are known as the Brillouin zones:
\[
\begin{aligned}
k = \frac{2\pi n}{Na}, \qquad n &= 0, \pm 1, \pm 2, \ldots, N/2, && \text{1st Brillouin zone}, \\
n &= -N/2, \pm(1 + N/2), \pm(2 + N/2), \ldots, N, && \text{2nd Brillouin zone, etc.}
\end{aligned}
\tag{8.25}
\]
where for simplicity of notation it has been assumed that N is an even number.
It is important to note that each Brillouin zone has 2N allowed states (including
a factor of 2 for the electron spin).
It is assumed that in the presence of the interaction, the wave function is a
superposition of mainly two plane waves,

\[ \psi(x) = A \frac{e^{ikx}}{L^{1/2}} + B \frac{e^{ik'x}}{L^{1/2}} \tag{8.26} \]
where for the sake of definiteness we take k′ ≤ k. We substitute this expression
in Eq. (8.22), multiply the equation by $e^{-ikx}$ or $e^{-ik'x}$ and integrate to obtain

\[
\begin{aligned}
\frac{\hbar^2 k^2}{2m} A + \frac{1}{2} V_1 B &= EA \\
\frac{\hbar^2 k'^2}{2m} B + \frac{1}{2} V_1 A &= EB
\end{aligned}
\qquad k - k' = \frac{2\pi}{a}
\tag{8.27}
\]
In these equations, the interaction connects only the states which satisfy
the condition
\[ k - k' = \frac{2\pi}{a}. \]
This is related to the fact that for scattering by a one-dimensional lattice, a
Bragg maximum is obtained for precisely the same condition [see Eq. (8.15)].
The interaction which gives rise to the scattering is also responsible for mixing
the two plane-wave states in Eq. (8.26). Solving the two homogeneous equations
gives
\[
\begin{aligned}
E &= \frac{\hbar^2}{4m}(k^2 + k'^2) \pm \left[ \frac{\hbar^4 (k^2 - k'^2)^2}{16m^2} + \frac{1}{4} V_1^2 \right]^{1/2} \\
\frac{B}{A} &= \frac{2}{V_1} \left\{ \frac{\hbar^2 (k'^2 - k^2)}{4m} \pm \left[ \frac{\hbar^4 (k^2 - k'^2)^2}{16m^2} + \frac{1}{4} V_1^2 \right]^{1/2} \right\}
\end{aligned}
\tag{8.28}
\]

where $k' = k - 2\pi/a$. Clearly, the changes introduced by the perturbation $V_1$ are
significant mainly for $|k| \approx |k'|$ and hence for $k \approx \pi/a$. For $k < \pi/a$, the negative
sign corresponds to $|A| \geq |B|$, which is therefore associated with the $0 \leq k \leq \pi/a$
branch of the solutions, while for $k > \pi/a$, the positive sign corresponds to
$|A| \geq |B|$, which is associated with the $\pi/a < k \leq 2\pi/a$ branch of the solutions.
The most important feature of these branches is that there is an energy gap of $V_1$
at $k = \pi/a$ between the two Brillouin zones described by these branches, and
there are 2N allowed states (including the negative k states) in each zone. This
is qualitatively similar to the energy bands. It also follows from Eq. (8.28) that
since $k - k' = 2\pi/a$, $\partial E/\partial k = 0$ at $k = \pi/a$. The energy bands are illustrated in
Fig. 8.6(b). The gap at $k = 2\pi/a$ comes from the $V_2$ term etc.
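The two branches of Eq. (8.28) are simple to evaluate. The sketch below uses the parameters quoted for Fig. 8.6(b), a = 2 Å and V1 = 1 eV, together with ħ²/2m ≈ 3.81 eV·Å² for an electron; the function name is illustrative. It shows the gap of V1 at the zone boundary and how little the lower branch differs from the free-electron energy away from it.

```python
import math

HBAR2_OVER_2M = 3.81   # ħ²/2m for an electron, in eV·Å²
A_LATTICE = 2.0        # lattice period a in Å, as in Fig. 8.6(b)
V1 = 1.0               # strength of the periodic potential in eV


def two_band_energies(k, a=A_LATTICE, v1=V1):
    """Lower and upper branches of Eq. (8.28), in eV, with k' = k - 2*pi/a."""
    kp = k - 2 * math.pi / a
    mean = 0.5 * HBAR2_OVER_2M * (k * k + kp * kp)       # (ħ²/4m)(k² + k'²)
    half_diff = 0.5 * HBAR2_OVER_2M * (k * k - kp * kp)  # ħ²(k² - k'²)/4m
    root = math.sqrt(half_diff ** 2 + 0.25 * v1 ** 2)
    return mean - root, mean + root


k_edge = math.pi / A_LATTICE
lower_edge, upper_edge = two_band_energies(k_edge)
print(f"gap at k = pi/a: {upper_edge - lower_edge:.3f} eV")

k_mid = 0.5 * k_edge
lower_mid, _ = two_band_energies(k_mid)
print(f"lower branch at k = pi/2a: {lower_mid:.3f} eV, "
      f"free electron: {HBAR2_OVER_2M * k_mid ** 2:.3f} eV")
```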
In the case of the three dimensional problem the condition in Eq. (8.27) is
modified to read k − k′ = q where q is a reciprocal lattice vector defined in
Eq. (8.16). The corresponding Brillouin zones are obtained from the requirement
that $|\mathbf{k}| = |\mathbf{k}'|$ at the boundaries. This leads to the condition $2\mathbf{k} \cdot \mathbf{q} - q^2 = 0$
which defines the boundaries as planes perpendicular to the reciprocal
lattice vectors q. The perturbations of energies at the edges of the Brillouin
zones lead to distorted equal-energy surfaces in the k-space. In particular, the
electrons in a crystal occupy the lowest energy levels at T = 0 K, subject to the
Pauli principle. The surface of the region of all occupied states in the k-space is
called the Fermi surface. Since the energy distortions are prominent mainly
near the surfaces of the Brillouin zones, the nearness of the Fermi surface to the
surfaces of the Brillouin zones, and its shape are of importance for the
understanding of the properties of electrons in crystals in general and metals in
particular.
Another property of the energy bands worth noting is the density of states.
The free particle density of states is proportional to $E^{1/2}$ [see Eq. (7.29)]. However,
the periodic potential distorts the energy levels in such a way that there are no
energy levels in the energy gaps and the density of states goes to zero at the
bottom and the top of the energy band.
For obtaining information about the energy bands and the density of states,
x-rays are used to knock out electrons in the energy bands. An analysis of the
x-rays emitted when electrons from higher energy bands undergo transitions to
these vacant levels, provides information about the energy bands and the density
of states.

Effective Mass
A useful idea in the band theory of solids is that of the effective mass of an
electron in a solid. One is led to this idea in an effort to simulate an electron in
a periodic potential by a free electron but with an effective mass.
The energy of a free electron is given by

\[ E = \frac{\hbar^2 k^2}{2m} \tag{8.29} \]
Therefore its mass may be defined as

\[ m = \frac{\hbar^2}{(\partial^2 E/\partial k^2)} \tag{8.30} \]
This definition may be extended to apply to a particle in a periodic potential
so that the effective mass of an electron in a one-dimensional crystal is

\[ m^* = \frac{\hbar^2}{(\partial^2 E/\partial k^2)} \tag{8.31} \]
Here, however, m* is a function of k. As can be seen from Fig. 8.6(b), m* is
positive near the bottom of each zone, negative near the top of each zone and is
infinite at the point of inflection. The large effective mass can be interpreted as
being due to the strong binding force between the electron and the lattice for
some k values, which makes it difficult to move the electron. The negative
mass may be interpreted in terms of the Bragg reflection when k is close to π/a,
2π/a, etc. on account of which a force in one direction, because of reflection,
leads to a gain of momentum in the opposite direction.
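The change of sign of m* across a band can be seen by differentiating the lower branch of Eq. (8.28) numerically. The sketch below assumes the parameters of Fig. 8.6(b) (a = 2 Å, V1 = 1 eV) and ħ²/2m ≈ 3.81 eV·Å²; the central-difference step of 10⁻³ Å⁻¹ and the function names are illustrative choices. It returns m*/m = (ħ²/m)/(∂²E/∂k²).

```python
import math

HBAR2_OVER_2M = 3.81   # ħ²/2m for an electron, in eV·Å²
A_LATTICE = 2.0        # lattice period a in Å
V1 = 1.0               # strength of the periodic potential in eV


def lower_band(k):
    """Lower (negative-sign) branch of Eq. (8.28), in eV."""
    kp = k - 2 * math.pi / A_LATTICE
    mean = 0.5 * HBAR2_OVER_2M * (k * k + kp * kp)
    half_diff = 0.5 * HBAR2_OVER_2M * (k * k - kp * kp)
    return mean - math.sqrt(half_diff ** 2 + 0.25 * V1 ** 2)


def effective_mass_ratio(k, h=1e-3):
    """m*/m from Eq. (8.31), using a central-difference second derivative."""
    curvature = (lower_band(k + h) - 2 * lower_band(k) + lower_band(k - h)) / h ** 2
    return 2 * HBAR2_OVER_2M / curvature   # ħ²/m = 2 * (ħ²/2m)


bottom = effective_mass_ratio(0.0)
top = effective_mass_ratio(math.pi / A_LATTICE)
print(f"m*/m at the bottom of the band: {bottom:.3f}")
print(f"m*/m at the top of the band:    {top:.3f}")
```

Near k = 0 the electron behaves almost as a free particle, while at k = π/a the curvature of the band is large and negative, giving a small negative effective mass.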
A detailed analysis shows that when an external electric field $\mathcal{E}$ is applied,
the acceleration of the electron is given by

\[ a = -e\,|\mathcal{E}| \left( \frac{1}{\hbar^2} \frac{\partial^2 E}{\partial k^2} \right) \tag{8.32} \]
which again simulates a free particle motion with the effective mass m* given
in Eq. (8.31). For a three dimensional crystal, the anisotropy is taken into account
in the relation

\[ \sum_j m^*_{ij}\, a_j = -e\,\mathcal{E}_i, \qquad m^*_{ij} = \frac{\hbar^2}{(\partial^2 E/\partial k_i\, \partial k_j)} \tag{8.33} \]
The concept of effective mass provides a satisfactory description of the
charge carriers in crystals. In normal circumstances, the conduction of current
is by the electrons, particularly in the case of crystals in which an energy band
is only partially filled (e.g., alkali metals for which the band is only half-filled).
On the other hand, consider a band which is nearly full except for a few vacancies
near the top of the band. This situation of a full band with the vacancies in the
negative charge, negative mass states may be regarded as corresponding to the
presence of positive charge, positive mass particles. These hole states with
positive charge, also act as charge carriers. In elements like Be, Zn, Cd, etc., it is
the hole states which are the dominant charge carriers and hence they have
positive Hall coefficients (see Example 3 in Sec. 8.8).

Metals, Insulators and Semiconductors


The existence of energy bands, i.e., the allowed bands consisting of allowed
energy levels and forbidden bands which are the gaps between the allowed
bands, provides a simple explanation for the general properties of metals,
insulators and semiconductors.
Consider the order in which the energy levels of the energy bands are filled.
To start with, the electrons in the inner shells fill the corresponding narrow
energy bands, and are not influential in determining the general properties of
the crystal. The relevant electrons are the valence electrons and they occupy
what is known as the valence band. At 0 K, the valence electrons occupy the
lower levels of the valence band. It is the position of the higher levels which
determines the conductivity properties of solids.
There are three important cases of the available higher energy levels, which
are shown in Fig. 8.7. In the first case the electrons fill only the lower half of the
valence band at 0 K. This happens for example, in the case of sodium for which
there are N electrons and 2N levels in the 3s band. At finite temperature some of
these electrons are excited to higher energy levels. In these cases of partially
filled valence bands, an external electric field will transfer some of these electrons
to the nearby higher energy vacant states and thereby give them additional
velocity along the direction of the field. Such crystals are good conductors of
electricity and are metals. It is worth noting that partial filling of the valence
band, which for metals is also the conduction band, is observed if there is an
odd number of electrons in the valence shell, e.g., sodium, or if the energy
bands overlap. Overlapping of bands is found, for example, in the case of Mg,
Zn, etc. which are good conductors.
In the second case [Fig. 8.7(b)], the valence electrons completely fill the
valence band which is separated by a large energy gap ∆E from the conduction
band. For example, the covalent bonding in diamond splits the 2s and 2p levels
into two bands (each of which is a mixture of 2s and 2p states) which are separated
by an energy gap ∆E of about 6 eV, with the valence electrons filling the lower
band. When an electric field is applied to such a crystal, there is no significant
change in the states of the valence electrons since a transition to an available
level requires an energy which is at least equal to the energy gap, which is
about 6 eV in the case of diamond. Such solids are insulators. The description
in terms of energy bands implies that if a radiation of high enough frequency is
incident on an insulator, the electrons in the valence band may absorb the
radiation and undergo transition to the conduction band. These excited electrons
can easily change their velocity since many states are available to them, and act
as efficient carriers of current. In this situation, an insulator can become a good
conductor and the effect is known as photoconductivity. It may be observed that
even at finite temperatures only a very small number of electrons are in the
conduction band (energy gap ∆E is very large compared to kT), and an insulator
remains a poor conductor of electric current.
[Figure 8.7: band diagrams for (a) a metal, (b) an insulator with energy gap ∆E, and (c) a semiconductor with a small energy gap ∆E.]

Fig. 8.7 Schematic illustration of the energy bands for (a) metals,
(b) insulators with an energy gap, and (c) semiconductors with a small gap.

The third case [Fig. 8.7(c)] is qualitatively similar to that of insulators except
that the energy gap between the conduction band and the valence band is much
smaller, 1.1 eV for Si and 0.7 eV for Ge. At 0 K, all the electrons are in the
valence band and the conduction band is empty, and the solid behaves like an
insulator. However, at room temperatures, an appreciable number of electrons
are excited to the conduction band (kT ≈ 0.026 eV compared to the energy gap
which is about 1 eV). These electrons can carry charge. Simultaneously, the
electrons in the valence band can undergo transitions to the vacant states left
behind by the transitions to the conduction band. Effectively, the holes (or the
vacancies) serve as carriers of positive charge. The conductivity of these solids
lies between those of metals and insulators, and they are known as
semiconductors.
An important characteristic which distinguishes metals from semiconductors
is the temperature dependence of their conductivities. As the temperature is
raised, more and more phonons are excited, which can scatter electrons and
hence reduce their mobility. Therefore the conductivity of metals generally
decreases as temperature increases. However, in the case of semiconductors,
the decrease in the mobility is more than compensated by the increase in the
number of carriers, electrons as well as holes. As a result, the conductivity of
semiconductors increases (at moderate temperatures) as temperature increases.

8.4 SEMICONDUCTORS
As mentioned before, semiconductors are crystals whose valence band is
completely filled but which have a small energy gap (∆E ~ 1 eV) between the
conduction band and the valence band. Their conductivity is in-between that of
metals (~ 10⁸ Ω⁻¹ m⁻¹) and that of insulators (~ 10⁻¹¹ Ω⁻¹ m⁻¹), and increases
with temperature. Because of the narrowness of the energy gap and the proximity
of the energy levels of the impurity to the valence and conduction bands,
semiconductors have rather striking electronic properties which make them very
useful in the development of sophisticated electronic equipment. Here the
positions and populations of the semiconductor energy levels which determine
their electronic properties are discussed.
It is useful to classify semiconductors into two categories. The class of
semiconductors which are pure, such as silicon, germanium (which are group
IV elements), GaAs, PbS, etc., are known as intrinsic semiconductors. In the
second class of semiconductors known as extrinsic (or impurity) semiconductors,
the properties of the semiconductors are modified by the introduction of carefully
controlled amounts of impurities.
To see how the impurities affect the properties of semiconductors, consider
the specific examples of silicon and germanium. These are group IV elements
which have diamond structure in which each atom has a covalent bond with
each of the four nearest neighbours at the corners of a tetrahedron. If a small
amount of a group V element, such as phosphorus, arsenic or antimony, is
introduced during the formation of the crystal, the group V atom will take the
place of one of the group IV atoms and form four covalent bonds with the
nearest neighbours. However, since it has five electrons in the valence shell,
the fifth electron is only weakly bound to the atom. These ‘fifth’ electrons
occupy localized energy levels which are just below the conduction band [see
Fig. 8.8(a)]. The electrons in these levels are easily excited to the states in the
conduction band and serve as current carriers. Since group V atoms donate
electrons for conduction they are known as donors, and the new energy levels
just below the conduction band as donor levels. The charge carriers in this case
being negatively charged, the corresponding extrinsic semiconductors are known
as n-type semiconductors. Alternatively, if a small amount of a group III element
such as boron, aluminium or indium is introduced, the group III element will
form only three covalent bonds with the nearest neighbours. Thus, there is a
vacancy or a hole associated with each of these atoms. Since an electron in
these states would be fairly tightly bound, the vacant states provide localized
energy levels which lie just above the valence band [see Fig. 8.8(b)]. The
neighbouring electrons can easily be transferred to these levels, as a result of
which holes are created in the valence band. Since the states near the top of the
band have negative mass (see Sec. 8.3), these holes behave as positive mass,
positive charge carriers. The group III impurity atoms are known as acceptors
and the new energy levels just above the valence band as acceptor levels. The
charge carriers in this case being positively charged, the corresponding extrinsic
semiconductors are known as p-type semiconductors.
[Figure 8.8: (a) a P donor atom with its extra electron, the donor level lying just below the conduction band; (b) a B acceptor atom with a hole, the acceptor level lying just above the valence band.]

Fig. 8.8 Extrinsic semiconductors (a) n-type with P as the donor atom,
(b) p-type with B as the acceptor atom.
The electronic properties of the semiconductors are influenced by the
positions of the Fermi energy and the concentrations of the charge carriers.

εf for Intrinsic Semiconductors


In an intrinsic semiconductor, every electron transferred to the conduction band
leaves behind a hole. Therefore, the total number of electrons in the conduction
band is equal to the total number of holes in the valence band.
For calculating the total number of electrons in the conduction band, the
number of states in the conduction band, per unit volume is taken to be [see Eq.
(7.86)]
$$dN_c = \frac{4\pi (2m_e^*)^{3/2}}{h^3}\, (\varepsilon - \varepsilon_c)^{1/2}\, d\varepsilon \tag{8.34}$$
Here me* is the effective mass of the electrons in the conduction band and
εc is the lowest energy in the conduction band. Therefore, the number of electrons
in the conduction band, per unit volume, is

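$$n_c = \frac{4\pi (2m_e^*)^{3/2}}{h^3} \int_{\varepsilon_c}^{\infty} \frac{(\varepsilon - \varepsilon_c)^{1/2}\, d\varepsilon}{e^{(\varepsilon - \varepsilon_f)/kT} + 1} \tag{8.35}$$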

Assuming that (εc – εf) >> kT, the unit term in the denominator can be
neglected and the integral evaluated [substitute (ε – εc) = x²]. This then gives

$$n_c = \frac{2(2\pi m_e^* kT)^{3/2}}{h^3}\, \exp[(\varepsilon_f - \varepsilon_c)/kT] \tag{8.36}$$
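The integration step behind Eq. (8.36) can be written out explicitly. The lines below are a sketch under the approximation stated in the text, (εc – εf) >> kT, using the suggested substitution and the standard Gaussian integral $\int_0^\infty x^2 e^{-x^2/kT}\,dx = (\sqrt{\pi}/4)(kT)^{3/2}$:

```latex
\begin{align*}
n_c &\approx \frac{4\pi(2m_e^*)^{3/2}}{h^3}\, e^{(\varepsilon_f-\varepsilon_c)/kT}
      \int_{\varepsilon_c}^{\infty} (\varepsilon-\varepsilon_c)^{1/2}\,
      e^{-(\varepsilon-\varepsilon_c)/kT}\, d\varepsilon \\
    &= \frac{4\pi(2m_e^*)^{3/2}}{h^3}\, e^{(\varepsilon_f-\varepsilon_c)/kT}
      \; 2\int_0^{\infty} x^2 e^{-x^2/kT}\, dx
      \qquad (\varepsilon-\varepsilon_c = x^2) \\
    &= \frac{4\pi(2m_e^*)^{3/2}}{h^3}\, e^{(\varepsilon_f-\varepsilon_c)/kT}
      \;\frac{\sqrt{\pi}}{2}\,(kT)^{3/2}
     = \frac{2(2\pi m_e^* kT)^{3/2}}{h^3}\, e^{(\varepsilon_f-\varepsilon_c)/kT}.
\end{align*}
```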
For obtaining the number of holes in the valence band it is noted that the
probability that a state is not occupied by an electron is

$$P_h = 1 - \frac{1}{\exp[(\varepsilon - \varepsilon_f)/kT] + 1} = \frac{1}{\exp[(\varepsilon_f - \varepsilon)/kT] + 1} \tag{8.37}$$
In analogy with Eq. (8.34), the number of hole states in the valence band
per unit volume, is taken to be

$$dN_v = \frac{4\pi (2m_h^*)^{3/2}}{h^3}\, (\varepsilon_v - \varepsilon)^{1/2}\, d\varepsilon \tag{8.38}$$
where mh* is the effective mass of the holes in the valence band and εv is the
highest energy in the valence band. The number of holes in the valence band,
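per unit volume, is

$$n_h = \frac{4\pi (2m_h^*)^{3/2}}{h^3} \int_{-\infty}^{\varepsilon_v} \frac{(\varepsilon_v - \varepsilon)^{1/2}\, d\varepsilon}{\exp[(\varepsilon_f - \varepsilon)/kT] + 1} \tag{8.39}$$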

Assuming that (εf – εv) >> kT, the unit term in the denominator can be
neglected, and this leads to

$$n_h = \frac{2(2\pi m_h^* kT)^{3/2}}{h^3}\, \exp[(\varepsilon_v - \varepsilon_f)/kT] \tag{8.40}$$
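It is interesting to note that the product of the electron density ne (written nc above) and the hole density nh is

$$n_e n_h = \frac{4(4\pi^2 m_e^* m_h^*)^{3/2}}{h^6}\, (kT)^3 \exp[(\varepsilon_v - \varepsilon_c)/kT] \tag{8.41}$$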
which is independent of εf but depends on the energy gap (εc – εv).
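For an intrinsic semiconductor, ne = nh so that

$$(m_e^*)^{3/2} \exp[(\varepsilon_f - \varepsilon_c)/kT] = (m_h^*)^{3/2} \exp[(\varepsilon_v - \varepsilon_f)/kT] \tag{8.42}$$

or

$$\varepsilon_f = \frac{1}{2}(\varepsilon_c + \varepsilon_v) + \frac{3}{4}\, kT \ln(m_h^*/m_e^*) \tag{8.43}$$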
At T = 0, the Fermi energy lies halfway between the valence and conduction
bands. At finite temperatures εf rises above this value, since mh* is usually greater
than me*. However, since εc – εv ≈ 1 eV while kT ≈ 0.026 eV at room temperature, εf
increases only slowly with temperature. The expression in Eq. (8.43) justifies the
assumption that (εc – εf) >> kT and (εf – εv) >> kT at ordinary temperatures.
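How slowly εf moves can be checked with a few lines of Python. The following is a minimal sketch of the second term of Eq. (8.43); the effective-mass ratios are the Si values quoted elsewhere in this section, and the function name is only illustrative.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def intrinsic_fermi_shift(me_ratio, mh_ratio, temperature):
    """Offset of eps_f from the mid-gap value, in eV, from Eq. (8.43)."""
    return 0.75 * K_B_EV * temperature * math.log(mh_ratio / me_ratio)


# Si with me* = 0.25 m and mh* = 0.3 m at 300 K.
shift_si = intrinsic_fermi_shift(0.25, 0.30, 300.0)
print(f"Si at 300 K: eps_f lies {shift_si * 1000:.1f} meV above mid-gap")
```

The shift is a few meV, which is small compared with the half-gap of about 0.5 eV.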
The knowledge of εf allows us to calculate the number of carrier electrons and
holes, and the conductivity of intrinsic semiconductors. Substituting for εf in
Eqs. (8.36) and (8.40), gives

$$n_e = n_h = \frac{2(2\pi kT)^{3/2} (m_e^* m_h^*)^{3/4}}{h^3}\, \exp[(\varepsilon_v - \varepsilon_c)/2kT] \tag{8.44}$$
For deducing conductivity, it is noted that the current density j is given by
j = e( ne ve + nh vh ) (8.45)
where ve and vh are the magnitudes of the average drift velocities of the electrons
and holes, respectively. The conductivity of the semiconductor is therefore given
by
σ = e(neµe + nhµh) (8.46)
where the mobilities µe and µh are defined by
µe = ve /E , µ h = vh /E (8.47)
E being the magnitude of the electric field. Substituting the expressions for
the carrier densities,

$$\sigma = e(\mu_e + \mu_h)\, \frac{2(2\pi kT)^{3/2} (m_e^* m_h^*)^{3/4}}{h^3}\, \exp[(\varepsilon_v - \varepsilon_c)/2kT] \tag{8.48}$$
which leads to

$$\ln \sigma = -\left( \frac{\varepsilon_c - \varepsilon_v}{2k} \right) \frac{1}{T} + \frac{3}{2} \ln T + c \tag{8.49}$$
where c is a constant. Here, it has been assumed that the mobilities are
independent of T. Actually, they do vary as a function of temperature. However,
the main variation is due to the 1/T term and a plot of ln σ as a function of 1/T
gives an approximate straight line. The slope of the straight line gives an
estimation of the energy gap (εc – εv) of the semiconductor.
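The slope method can be illustrated numerically. The sketch below assumes temperature-independent mobilities, as in Eq. (8.49), and uses synthetic conductivities generated from Eq. (8.48) rather than measured data; the helper names and the arbitrary scale factor are illustrative only.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def gap_from_conductivity(t1, sigma1, t2, sigma2):
    """Estimate (eps_c - eps_v) in eV from two (T, sigma) points via Eq. (8.49)."""
    # Remove the (3/2) ln T term so that only the 1/T dependence remains.
    y1 = math.log(sigma1) - 1.5 * math.log(t1)
    y2 = math.log(sigma2) - 1.5 * math.log(t2)
    slope = (y2 - y1) / (1.0 / t2 - 1.0 / t1)
    return -2.0 * K_B_EV * slope


def model_sigma(temperature, gap_ev, scale=1.0):
    """Synthetic conductivity with the T dependence of Eq. (8.48); scale is arbitrary."""
    return scale * temperature ** 1.5 * math.exp(-gap_ev / (2.0 * K_B_EV * temperature))


gap = gap_from_conductivity(300.0, model_sigma(300.0, 1.1),
                            350.0, model_sigma(350.0, 1.1))
print(f"recovered gap: {gap:.3f} eV")
```

With real data the mobilities also vary with T, so the recovered gap is only an estimate, as the text notes.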
It may be noted that me* ≈ 0.25 m and mh* ≈ 0.3 m for Si, and me* ≈ mh* ≈ 0.1 m for
Ge, m being the electron mass. These values imply that at a temperature of
300 K, the carrier concentrations are about 2.3 × 10¹⁵ m⁻³ for Si and about 10¹⁸
m⁻³ for Ge. The intrinsic conductivity at this temperature has the values of about
10⁻⁴ (Ω m)⁻¹ for Si and about 0.1 (Ω m)⁻¹ for Ge.
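These carrier concentrations can be reproduced directly from Eq. (8.44). The following is a minimal sketch using the effective masses above and the energy gaps of 1.1 eV (Si) and 0.7 eV (Ge) quoted in Sec. 8.3; the function name is illustrative, and the result depends sensitively on the gap value used.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
M_E = 9.1093837015e-31   # free electron mass, kg
EV = 1.602176634e-19     # joules per electron volt


def intrinsic_density(me_ratio, mh_ratio, gap_ev, temperature):
    """Carrier density n_e = n_h in m^-3 from Eq. (8.44)."""
    kT = K_B * temperature
    prefactor = 2.0 * (2.0 * math.pi * kT) ** 1.5 / H ** 3
    masses = (me_ratio * mh_ratio * M_E ** 2) ** 0.75
    return prefactor * masses * math.exp(-gap_ev * EV / (2.0 * kT))


n_si = intrinsic_density(0.25, 0.30, 1.1, 300.0)
n_ge = intrinsic_density(0.10, 0.10, 0.7, 300.0)
print(f"Si: {n_si:.2e} m^-3, Ge: {n_ge:.2e} m^-3")
```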

εf for Extrinsic Semiconductors


In an extrinsic semiconductor, the donor or acceptor levels play an important
role in the determination of the Fermi energy and the conductivity of the
semiconductor.
Consider an n-type semiconductor, with Nd donors per unit
volume. Then the number of vacancies per unit volume, in the donor levels of
energy εd is

$$n_d = \left[ 1 - \frac{1}{\exp[(\varepsilon_d - \varepsilon_f)/kT] + 1} \right] N_d \tag{8.50}$$
From the condition that the number of electrons in the conduction band is
equal to the total number of vacancies in the donor levels and the valence band,
one gets
$$c_0 (m_e^* T)^{3/2} \exp[(\varepsilon_f - \varepsilon_c)/kT] = c_0 (m_h^* T)^{3/2} \exp[(\varepsilon_v - \varepsilon_f)/kT] + \frac{N_d}{\exp[(\varepsilon_f - \varepsilon_d)/kT] + 1} \tag{8.51}$$

where $c_0 = 2(2\pi k)^{3/2}/h^3$. Now εc – εd ≈ 0.01 eV for Ge and about 0.045 eV for Si,
and for the cases of practical interest Nd is of the order of 10²² m⁻³. So, at ordinary
temperatures, most of the electrons in the conduction band are from the donor
levels. For T → 0, the unit term in the denominator can be neglected giving

$$c_0 (m_e^* T)^{3/2} \exp[(\varepsilon_f - \varepsilon_c)/kT] \approx N_d \exp[(\varepsilon_d - \varepsilon_f)/kT] \tag{8.52}$$
which leads to

$$\varepsilon_f = \frac{1}{2}(\varepsilon_c + \varepsilon_d) + \frac{1}{2}\, kT \ln \left[ \frac{N_d}{c_0 (m_e^* T)^{3/2}} \right] \tag{8.53}$$

At T = 0, the Fermi level lies halfway between εc and εd.
At room temperature, εf is below εd for the cases of interest and most of the
donor atoms are ionized. In this region the number of vacancies, i.e., the right-hand side of
Eq. (8.51), can be taken to be Nd to get

$$\varepsilon_f = \varepsilon_c + kT \ln \left[ \frac{N_d}{c_0 (m_e^* T)^{3/2}} \right], \qquad (\varepsilon_d - \varepsilon_f) \gg kT \tag{8.54}$$
For example, in the case of Si doped with a donor impurity to the extent of
10²² m⁻³, the Fermi energy at 300 K is εf ≈ (εc – 0.15) eV. At higher temperatures,
a detailed analysis of Eq. (8.51) shows that εf tends to the value ½(εc + εv), i.e.,
the value for the intrinsic semiconductor. The conductivity for n-type
semiconductors is mainly due to the electrons in the conduction band (at not very
high temperatures) and is given by
σ ≈ eneµe (8.55)
which leads to
$$\ln \sigma \approx -\ln \{ \exp[(\varepsilon_f - \varepsilon_d)/kT] + 1 \} + c_1 \tag{8.56}$$
where c1 is a constant (it is assumed that µe is temperature independent). Plotted
as a function of 1/T, ln σ has a negative slope for large 1/T, i.e., small T (εf > εd),
but flattens out for larger T once (εd – εf) >> kT. For very high temperatures,
intrinsic conductivity begins to dominate and the expression for ln σ tends to
that given in Eq. (8.49).
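The room-temperature Fermi level quoted above for doped Si can be checked from Eq. (8.54). The sketch below assumes me* ≈ 0.25 m for Si, as given earlier in this section; the function names are illustrative.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
M_E = 9.1093837015e-31   # free electron mass, kg
EV = 1.602176634e-19     # joules per electron volt


def effective_density(mass_ratio, temperature):
    """c0 (m* T)^(3/2) = 2 (2 pi m* k T)^(3/2) / h^3, in m^-3."""
    return 2.0 * (2.0 * math.pi * mass_ratio * M_E * K_B * temperature) ** 1.5 / H ** 3


def donor_fermi_offset(n_d, me_ratio, temperature):
    """eps_f - eps_c in eV for an n-type sample, from Eq. (8.54)."""
    kT_ev = K_B * temperature / EV
    return kT_ev * math.log(n_d / effective_density(me_ratio, temperature))


offset = donor_fermi_offset(1e22, 0.25, 300.0)
print(f"eps_f - eps_c = {offset:.3f} eV")
```

With εc – εd ≈ 0.045 eV for Si, this places εf about 0.1 eV (roughly 4 kT) below the donor level, consistent with the condition attached to Eq. (8.54).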
For a p-type semiconductor with Na number of acceptors per unit volume
the number of electrons in the acceptor levels, per unit volume, is

$$n_a = \frac{N_a}{\exp[(\varepsilon_a - \varepsilon_f)/kT] + 1} \tag{8.57}$$
Equating the total number of electrons in the conduction band and the
acceptor levels with the number of holes in the valence band, and proceeding as
before, gives

$$\varepsilon_f = \frac{1}{2}(\varepsilon_a + \varepsilon_v) - \frac{1}{2}\, kT \ln \left[ \frac{N_a}{c_0 (m_h^* T)^{3/2}} \right] \quad \text{for } T \to 0 \tag{8.58}$$

so that at T = 0, the Fermi level is halfway between εa and εv. At room
temperature, essentially all the acceptor levels are occupied and so

$$\varepsilon_f = \varepsilon_v - kT \ln \left[ \frac{N_a}{c_0 (m_h^* T)^{3/2}} \right], \qquad (\varepsilon_f - \varepsilon_a) \gg kT \tag{8.59}$$
For Si doped with an acceptor impurity to an extent of 10²² m⁻³, the Fermi
energy at 300 K is given by εf = (εv + 0.15) eV. At higher temperatures εf tends
to the value of ½(εc + εv). The conductivity of p-type semiconductors is primarily
due to the holes in the valence band, and therefore one has, as in Eq. (8.56),

$$\ln \sigma = -\ln \{ \exp[(\varepsilon_a - \varepsilon_f)/kT] + 1 \} + c_2 \tag{8.60}$$
where c2 is a constant. Plotted as a function of 1/T, the behaviour of ln σ is
similar to that for n-type semiconductors.

εf for pn Junctions
Junctions between p-type and n-type semiconductors play an important role in
the development of semiconductor devices. A pn junction is a junction at the
microscopic level between a p-type and an n-type semiconductor. Such junctions
are developed by the diffusion of impurity atoms.
[Figure 8.9: energy levels εc, εf and εv on the p-type and n-type sides (a) before and (b) after equilibrium, showing the depletion region and the potential V across it; (c) charge density between –xp and xn.]


Fig. 8.9 The pn junction, (a) before equilibrium, (b) after equilibrium, and
(c) charge density across the boundary.
The Fermi level of an n-type semiconductor is close to εc while that of a
p-type semiconductor is close to εv (Fig. 8.9). Therefore, there are many more
electrons in the conduction band of the n-type semiconductor and many more
holes in the valence band of the p-type semiconductor. As a result, when a pn
junction is formed, electrons diffuse from the n-type to the p-type semiconductor
and occupy the vacant states there. Similarly, the holes diffuse from the p-type
to the n-type semiconductor and allow the electrons to occupy their vacant
states. As a result, there is a narrow depletion region at the boundary where
there are no charge carriers. Instead, there is a thin layer of positive charge on
the n-side (due to positive ions left behind) and a thin layer of negative charge
on the p-side (due to extra electrons occupying the acceptor levels). This double
layer of charges creates a potential difference across the junction which opposes
the flow of electrons from the n-type to p-type and of holes from the p-type to
n-type semiconductor. The flow of electrons and holes stops when the Fermi
energy on the two sides has the same value (see Fig. 8.9). It must be appreciated
that the shifting of the Fermi energy levels is due to the electric potential across
the junction and that the relative positions of the various energy levels on the
two sides, remain unchanged. The potential difference, across the boundary is
equal to the difference in the Fermi levels of the separate n-type and p-type
semiconductors, and is given by

$$V_0 = \frac{1}{2}(\varepsilon_c - \varepsilon_v + \varepsilon_d - \varepsilon_a) + \frac{1}{2}\, kT \ln \left[ \frac{N_d N_a}{c_0^2 (m_e^* m_h^* T^2)^{3/2}} \right] \quad \text{for } T \to 0 \tag{8.61}$$

and

$$V_0 = \varepsilon_c - \varepsilon_v + kT \ln \left[ \frac{N_d N_a}{c_0^2 (m_e^* m_h^* T^2)^{3/2}} \right] \tag{8.62}$$

at room temperature.

The width of the depletion region can be estimated by the following model
calculation. It is assumed that there is a width of xp in the p-type and xn in the
n-type of semiconductor. Using Maxwell’s equation ∇ ⋅ (κε0 E) = ρ, we get

κε0E = ρx + c (8.63)

where κ is the relative permittivity, ρ = ene in the n-type semiconductor and
ρ = – enh in the p-type semiconductor. Since E is zero at the edges of the depletion
region, c = – enexn in the n-region and c = – enhxp in the p-region. Therefore, the
potential difference across the boundary is

$$V_0 = -\frac{e}{\kappa \varepsilon_0} \left[ n_e \int_0^{x_n} (x - x_n)\, dx - n_h \int_{-x_p}^{0} (x + x_p)\, dx \right] = \frac{e}{2\kappa \varepsilon_0} \left[ n_e x_n^2 + n_h x_p^2 \right] \tag{8.64}$$
The condition of overall neutrality gives

nexn = nhxp (8.65)

These two equations lead to

$$V_0 = \left( \frac{e}{2\kappa \varepsilon_0} \right) \left( \frac{n_e n_h}{n_e + n_h} \right) (x_n + x_p)^2 \tag{8.66}$$

For Si with impurity concentration at 300 K of ne ≈ nh ≈ 10²² m⁻³, κ ≈ 12, V0
is approximately equal to 0.8 V and xn ≈ xp ≈ 2 × 10⁻⁷ m.
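The depletion widths quoted for Si follow from Eqs. (8.65) and (8.66). The sketch below takes V0 = 0.8 V as an input rather than deriving it from Eq. (8.62); the function name is illustrative.

```python
import math

EPS0 = 8.8541878128e-12      # permittivity of free space, F/m
E_CHARGE = 1.602176634e-19   # elementary charge, C


def depletion_widths(v0, n_e, n_h, kappa):
    """Return (x_n, x_p) in metres for a junction potential v0 in volts."""
    # Eq. (8.66) solved for the total width x_n + x_p.
    total = math.sqrt(2.0 * kappa * EPS0 * v0 * (n_e + n_h) / (E_CHARGE * n_e * n_h))
    # Eq. (8.65), n_e x_n = n_h x_p, splits the total width between the two sides.
    x_n = total * n_h / (n_e + n_h)
    x_p = total * n_e / (n_e + n_h)
    return x_n, x_p


x_n, x_p = depletion_widths(0.8, 1e22, 1e22, 12.0)
print(f"x_n = {x_n:.2e} m, x_p = {x_p:.2e} m")
```

For unequal doping the more lightly doped side carries the wider part of the depletion region, as Eq. (8.65) requires.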
The capacitance of the double layer can be calculated by noting that the
charge Q in each layer is e ne xn per unit area, so that

$$V = \left( \frac{Q^2}{2\kappa \varepsilon_0 e} \right) \left( \frac{n_e + n_h}{n_e n_h} \right) \tag{8.67}$$
From this, the variable capacitance per unit area is

$$C = \frac{dQ}{dV} = \frac{1}{2V^{1/2}} \left[ \frac{2\kappa \varepsilon_0 e\, n_e n_h}{n_e + n_h} \right]^{1/2} \tag{8.68}$$
The variation of C as $V^{-1/2}$ is the basis of the variable capacitance diodes
(varactors) which are used in frequency locking and frequency modulation
circuits.
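The voltage dependence of the junction capacitance can be illustrated numerically. The sketch below evaluates the capacitance per unit area obtained by differentiating Eq. (8.67), for the Si parameters used above; the function name is illustrative.

```python
import math

EPS0 = 8.8541878128e-12      # permittivity of free space, F/m
E_CHARGE = 1.602176634e-19   # elementary charge, C


def junction_capacitance(voltage, n_e, n_h, kappa):
    """Capacitance per unit area (F/m^2), C = dQ/dV from Eq. (8.67)."""
    reduced = n_e * n_h / (n_e + n_h)
    return 0.5 * math.sqrt(2.0 * kappa * EPS0 * E_CHARGE * reduced / voltage)


c_08 = junction_capacitance(0.8, 1e22, 1e22, 12.0)
c_32 = junction_capacitance(3.2, 1e22, 1e22, 12.0)
print(f"C(0.8 V) = {c_08:.2e} F/m^2, C(3.2 V) = {c_32:.2e} F/m^2")
```

Quadrupling the potential across the junction halves the capacitance, which is the V^(-1/2) behaviour exploited in varactors.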

8.5 SEMICONDUCTOR DEVICES


The special properties of semiconductors have led to a large number of important
applications. Here a few illustrative examples such as diodes, transistors, solar
cells and semiconductor lasers are discussed, and also the fabrication of these
devices is briefly described.

Semiconductor Diodes
The pn junction can be used as a rectifier, a voltage stabilizer and in high
frequency circuits.
Consider again the pn junction discussed in Sec. 8.4. Though the net current
across the junction is zero, the electrons and the holes diffuse across the boundary
but the flow in each direction is the same. The densities of electrons and holes
in the corresponding states (similarly located with respect to εc or εv) on the two
sides are related by Boltzmann statistics,

$$\frac{n_e(p)}{n_e(n)} = \exp(-eV_0/kT) \tag{8.69}$$

$$\frac{n_h(n)}{n_h(p)} = \exp(-eV_0/kT) \tag{8.70}$$
where V0 is the potential difference across the junction. It follows from these
relations that the product nenh, of the total number of electrons and holes, has
the same value on the two sides.
To be specific, consider the flow of electrons across the boundary. Since
the motion of electrons from p to n is ‘downhill’, the rate of electron flow from
p to n is
Ie (p → n) = c1ne (p) (8.71)
where ne (p) is the total number of electrons in the conduction band on the
p-side. On the other hand the electrons flowing from n to p face an ‘up-hill’
potential of V0 and only those which have an energy of at least eV0 will be able to cross
the boundary. The number of such electrons is proportional to ne (n) exp (– eV0/kT),
so that
Ie (n → p) = c2 ne (n) exp (– eV0/kT) (8.72)
At equilibrium, there is no net flow, so that
Ie (p → n) = Ie (n → p) = I0
and hence c1 = c2 (8.73)
where Eq. (8.69) has been used.
If now an external potential – V is applied, the potential difference across
the boundary is V0 – V. Since the concentrations are unchanged, the current is
I = c2ne(n) exp [– e (V0 – V)/kT] – c1ne(p)
= I0 [exp (eV/kT) – 1] (8.74)
Thus if (– eV) has the sign opposite to that of eV0, i.e., eV is positive, which
is termed forward bias, the current increases rapidly, whereas if (– eV) has
the same sign as that of eV0, i.e., eV is negative, known as reverse bias, the
current is small and quickly tends to the limiting value of (– I0) (see Fig. 8.10).
A similar behaviour is expected for the flow of holes. Effectively, a pn junction
allows current to flow in only one direction, and hence can be used as a rectifier.
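The asymmetry of Eq. (8.74) is easy to see numerically. The following minimal sketch tabulates I/I0 at 300 K for a few bias values; the chosen voltages are illustrative only.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def diode_current_ratio(voltage, temperature=300.0):
    """I / I0 from Eq. (8.74); voltage in volts, positive for forward bias."""
    return math.exp(voltage / (K_B_EV * temperature)) - 1.0


for v in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(f"V = {v:+.1f} V  ->  I/I0 = {diode_current_ratio(v):.3g}")
```

A forward bias of only a few kT/e already gives a current many times I0, while the reverse current saturates at –I0.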
[Figure 8.10: (a) potential across the pn junction with forward bias, without bias, and with reverse bias; (b) I/I0 plotted against eV/kT, with the reverse breakdown at –eVB/kT.]


Fig. 8.10. Rectification by a diode, (a) potential difference across the boundary,
(b) current as a function of eV/kT, V being the applied potential.
If there is reverse bias with |V| ≳ |VB| (see Fig. 8.10), Eq. (8.74) for the current is
no longer valid. For |V| > |VB |, a very rapid increase in the current is observed
(Fig. 8.10). There are two reasons for this increase: (i) the large field at the
junction speeds up the few electrons in the p-region near the junction to such
high velocities that they knock out some of the valence electrons into the
conduction band. This process continues repeatedly and a large current is quickly
built-up. (ii) If the potential difference across the boundary is sufficiently large,
the conduction band on the n-side will overlap the valence band on the p-side
(see Fig. 8.9). In this case, it was suggested by Zener that the electrons in the
valence band on the p-side will tunnel across the boundary into the conduction
band on the n-side. The diodes based on these two effects are known as avalanche
diodes or Zener diodes. They are very useful in voltage stabilization circuits.
If the impurity concentration is very high (of the order of one part in a
thousand), the Fermi energy may move into the valence band on the p-side and
into the conduction band on the n-side [Fig. 8.11(a)]. In this case, there will be
vacant levels above εf in the valence band on the p-side and electrons below εf
in the conduction band on n-side. When the reverse bias is applied, many
electrons will move from the valence band on the p-side into the conduction
band on the n-side, giving rise to a large current [Fig. 8.11(b)]. For a small
forward bias, electrons from the conduction band on the n-side can move not
only into the conduction band on the p-side, but also tunnel into some of the
vacant levels on the p-side. This again gives rise to a large current. On the other
hand, if the forward bias is quite large, there will no longer be any overlap of
the valence band on the p-side and the conduction band on the n-side. The
resulting current which is due to electrons moving across the potential barrier
from the conduction band on the n-side to the conduction band on the p-side,
actually shows a decrease [Fig. 8.11(c)]. For still higher potentials, the current
will begin to increase again, as in the case of the ordinary diode. The main
characteristic of these tunnel or Esaki diodes is the negative-resistance section,
which is used in high-frequency oscillator circuits in the microwave region.

Fig. 8.11 The characteristics of the tunnel or Esaki diode; (a) energy bands without
bias, (b) energy bands with reverse bias, (c) energy bands with forward bias,
(d) current as a function of V showing negative resistance between C and D.

Transistor
An important application of semiconductor junctions is the transistor. It consists
of two semiconductor junctions close together, which serve as an amplifier of
current or voltage.
To be specific, consider a pnp transistor (similarly one can have an npn
transistor) which consists of three regions (Fig. 8.12). The first region is the
emitter, the small, narrow, middle region is the base, and the third region is the
collector. The equilibrium potential consists of a potential barrier in the n-region.
If now the base is connected to a small negative potential Vb (the common
emitter may be taken to be at Ve = 0), and the collector to a fairly large negative
potential Vc, the potential across the two junctions is modified [Fig. 8.12(b)].
There is a forward bias across the emitter-base junction for the holes, and the
current across the junction is [Eq. (8.74)]
I = I0 [exp (eVb/kT) – 1] (8.75)
Since the base is very narrow (less than 10⁻³ cm in width), most of the holes
that enter the base from the emitter roll down into the collector, though a few
of them will be annihilated by the electrons in the base. Thus, a major part of
the emitter current flows through the collector while only a small fraction of it
flows out from the base. In a general way, the collector current is controlled by
the changes in the barrier introduced by Vb, and the changes in the base current
Ib are amplified into the changes in the collector current Ic. The amplification in
the current is estimated by
$$\beta = \frac{I_c}{I_b} = \frac{\text{hole lifetime}}{\text{base transit time}} \tag{8.76}$$
(holes annihilated in the base contribute to Ib) which in practice has a value of
about 100.
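[Figure 8.12: (a) pnp transistor (emitter E, base B, collector C) in a common emitter circuit with potentials Vb and Vc and currents Ib and Ic; (b) the potential distribution; (c) common base circuit with input Vi across Ri and output Vo across Ro.]
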

Fig. 8.12. Transistors, (a) common emitter circuit, (b) the potential
distribution, (c) common base circuit.
In the analysis so far the motion of only holes has been considered. Regarding
the electrons, there is hardly any electron flow from the collector to the base.
The flow of electrons from the base to the emitter is minimized by having
relatively small amount of doping of donors in the base. Thus the currents in
the pnp transistor, are mainly due to the motion of the holes. For an npn transistor,
the analysis is similar except that the signs of all the potentials are opposite and
the current in this case is due to the flow of the electrons.
The pnp transistor can also be used as a voltage amplifier by having a
common base [see Fig. 8.12(c)]. In this arrangement, the input potential across
Rin is amplified into the output potential across Rout. Arguments similar to those
given above imply that the emitter current Ie is approximately equal to the
collector current. Therefore, the voltages across Rin and Rout are given by
Vin = Ic Rin
Vout = Ic Rout (8.77)
It therefore follows that

$$\frac{V_{out}}{V_{in}} \approx \frac{R_{out}}{R_{in}} \tag{8.78}$$
Since Rout/Rin is usually quite large, voltage gains of the order of 500 are
quite usual. This arrangement amplifies both voltage and power.

Photodiodes
A photodiode is a pn junction used to convert radiation energy into an electric
current. Consider a photon of frequency ν satisfying
ν ≥ (εc – εv)/h (8.79)
incident on the depletion region of a pn junction (see Fig. 8.9), or near it (within
about the diffusion length). This photon may be absorbed by an electron in the
valence band as a result of which it may move into the conduction band, thus
creating an electron and a hole. The electron and the hole are separated by the
electric field in the depletion region, the electron moving into the n-region and
the hole moving into the p-region. The direction of the resulting current Iv, is
that of the current produced by a reverse bias. If therefore an external potential
V is applied to the junction (V > 0 corresponds to forward bias), the current is
I = I0 [exp (eV/kT) – 1] – Iv (8.80)
where the first term is the current in the absence of radiation [see Eq. (8.74)].
If the circuit is open, I = 0, and an effective forward bias voltage V appears
at the terminals, given by

$$V = \frac{kT}{e} \ln \left( 1 + \frac{I_v}{I_0} \right) \tag{8.81}$$
This voltage is essentially due to the accumulation of the excess photo-
electrons in the n-region and photoholes in the p-region (this reduces the potential
difference across the junction). The expression for the photovoltage in Eq. (8.81)
is valid for V ≤ εc – εv (for V = εc – εv, there is no longer a potential difference
across the junction to separate the electrons and the holes). If the terminals are
connected to an external load resistance R, a voltage V ′ somewhat less than V in
Eq. (8.81), appears across the junction, and the net current is
IL = I0 (exp (eV′/kT) – 1) – Iv (8.82)
The voltage across R is
VL = V ′ – | IL | Rc (8.83)
where Rc is the resistance of the solar cell. These relations together with
VL = ILR, allow us to determine V ′, VL and IL for given Iv, Rc and R. Thus a pn
junction can be used to convert radiation energy into electrical energy. This is
the principle behind the use of a pn junction in photometers, detectors and in
solar cells.
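The open-circuit photovoltage of Eq. (8.81) can be evaluated and checked against Eq. (8.80). The following is a minimal sketch; the ratio Iv/I0 = 10⁶ is an illustrative assumption, not a value from the text.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def open_circuit_voltage(photo_ratio, temperature=300.0):
    """V in volts at I = 0 from Eq. (8.81); photo_ratio is I_v / I_0."""
    return K_B_EV * temperature * math.log(1.0 + photo_ratio)


def net_current_ratio(voltage, photo_ratio, temperature=300.0):
    """I / I0 from Eq. (8.80) for an applied or developed voltage in volts."""
    return math.exp(voltage / (K_B_EV * temperature)) - 1.0 - photo_ratio


v_oc = open_circuit_voltage(1e6)  # assumed illumination level, I_v / I_0 = 1e6
print(f"open-circuit voltage: {v_oc:.3f} V")
print(f"short-circuit current: I/I0 = {net_current_ratio(0.0, 1e6):.3g}")
```

Because the photovoltage grows only logarithmically with Iv, the short-circuit current, which is linear in Iv, is the more convenient quantity for photometry.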
For using a pn junction as a photometer, the terminals are usually short
circuited i.e., V = 0. The resulting current [Eq. (8.80)] is – Iv. Its magnitude is
proportional to the intensity of the incident radiation, and hence it is used in
photometers for estimating the intensity of radiation. For the use of a photodiode
as a γ-ray or particle detector, it is noted that the energy of an x-ray or
γ-ray photon is much greater than the energy gap. Such a photon creates a highly
energetic electron and a hole. These produce other pairs and the process continues
till their energies are comparable to the energy gap. The number of carriers
indicated by the current gives a measure of the initial photon energy. It may be
noted that in order to collect the carriers quickly (collection time required is
about 10⁻⁸ s) and efficiently, the junction is subjected to a reverse bias so that
the carriers at the junction are subjected to a large potential difference. The pn
junction is also used as a solid-state detector for measuring the energy of other
particles such as protons and electrons, the only difference in this case being
that these particles are not absorbed and come to rest after creating several
electron-hole pairs.
In a silicon solar cell [Fig. 8.13(a)] there is a very thin, large surface of
n-type silicon, forming a junction with a large volume of p-type silicon. The
surface is made large so as to collect a substantial amount of radiation, and thin,
about 10⁻⁶ m, so that the majority of the carriers generated by the photons can
diffuse to the junction before recombining. The surface is coated with an
anti-reflection coating to increase the efficiency which can reach a value of
about 16% in silicon solar cells (efficiency is the ratio of the electrical energy
output to the radiation energy input). When the surface is exposed, the holes
move across the junction to the p-side and the photodiode becomes an energy
cell with the n-side being at a negative potential and the p-side being at a positive
potential.



Fig. 8.13 (a) Schematic diagram of a solar cell, (b) a transition in a light emitting diode.

Light Emitting Diodes


A light emitting diode is a photodiode run backward. In a diode junction, there
is an excess of electrons in the conduction band of the n-side, and an excess of
holes in the valence band of the p-side, with a potential barrier across the junction
(Fig. 8.9). When a forward bias is applied, current flows across the junction. As
a result, electrons from the n-side are injected into the p-side and holes are
injected from the p-side into the n-side. Near the junction, the electrons in the
conduction band combine with the holes in the valence band with the emission
of photons of frequency
ν = (εc – εv)/h (8.84)
In practice, these radiative transitions have to compete with nonradiative
recombinations due to impurities. While nonradiative recombinations dominate
in Ge and Si, radiative recombinations are important in some semiconductors
such as GaAs. In GaAs the energy gap is about 1.4 eV, corresponding to a
wavelength λ ≈ 8900 Å, which is in the infrared region. The gap can be increased
by alloying the material with phosphorus, i.e., by using GaAs₁₋ₓPₓ, which
can produce radiation in the optical region. This radiation can be extracted from
openings close to the junction. Light emitting diodes are used in display and
warning devices.
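
The link between the energy gap and the emitted radiation in Eq. (8.84) can be checked numerically. The following is a minimal sketch; the function names are illustrative, and the constants are standard SI values rather than values from the text.

```python
PLANCK_H = 6.62607015e-34   # Planck constant, J s
LIGHT_C = 2.99792458e8      # speed of light, m/s
EV = 1.602176634e-19        # joules per electron volt

def emission_frequency_hz(gap_ev):
    """Photon frequency from Eq. (8.84), nu = (eps_c - eps_v) / h."""
    return gap_ev * EV / PLANCK_H

def emission_wavelength_angstrom(gap_ev):
    """Corresponding vacuum wavelength, lambda = c / nu, in angstroms."""
    return LIGHT_C / emission_frequency_hz(gap_ev) * 1e10

# A gap of 1.4 eV (the GaAs value quoted above) gives a wavelength
# close to the quoted 8900 angstroms.
print(round(emission_wavelength_angstrom(1.4)))
```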

Semiconductor Diode Laser


If a pn junction is designed properly, a light emitting diode can produce laser
action.
In a semiconductor laser, the laser medium is usually a pn junction of a
semiconductor such as GaAs₁₋ₓPₓ in which radiative recombinations dominate.
The opposite surfaces of the crystal, perpendicular to the junction, are taken
along the cleavage planes, and are polished so as to form an effective resonance
cavity. In this cavity, the photons emitted at the junction and moving in a
particular direction parallel to the junction plane, produce stimulated emission
giving rise to laser action (Fig. 8.14).

Fig. 8.14 Schematic diagram of a semiconductor laser.


The advantages of semiconductor lasers are that they are compact, efficient
and can be fabricated with ease. However, their monochromaticity, coherence
and directionality are inferior to those of other lasers.

Fabrication of Semiconductor Devices


The working of semiconductor devices depends on the ordered periodic
arrangement of atoms in a lattice and the introduction of impurities in a controlled
manner. A highly sophisticated technology is involved in their development.
The steps involved in their production are illustrated by considering the specific
case of a silicon npn junction.
In the first step, a silicon crystal is grown by dipping a seed crystal into the
molten silicon at about 1425°C, and slowly pulling it up at a rate of about 100
millimetres per hour. The melt contains a suitable amount of phosphorus to
produce new crystal layers of the n-type. The crystal is cut into thin slices about
0.25–0.5 mm in thickness and about 10 cm in diameter, by using a diamond
saw. The surface of the slice is polished mechanically and chemically, to give
what is referred to as the substrate.
In the next step, a mixture of silicon tetrachloride, hydrogen and phosphine
(PH₃) is passed over the substrate at about 1200°C. The silicon released by the
reduction of silicon tetrachloride, along with a suitable amount of phosphorus
(from the PH₃), crystallizes on the substrate surface forming what is known as
an n-type epitaxial layer (for producing a p-type epitaxial layer, PH₃ is replaced
by diborane, B₂H₆).
The surface of the epitaxial layer is oxidized by heating the substrate to
about 1100°C in steam or oxygen so as to produce a thin layer of SiO₂ [Fig.
8.15(a)]. The oxide surface is coated with a photosensitive material called the
photoresist. An area of the surface is covered with a photographic mask and the
rest of the surface is exposed to ultraviolet radiation. The photoresist is then
developed and the unexposed area is washed off. The oxide in this area is
removed by immersing in hydrofluoric acid and then the exposed photoresist is
removed. This procedure is known as window opening and it effectively removes
SiO₂ from specified areas.

Fig. 8.15 Fabrication of an npn transistor: (a) oxidization, (b) boron diffusion through the windows, (c) phosphorus diffusion through the window, (d) metal contacts shown by shaded areas.

A p-type diffusion is introduced by using boron atoms, producing a p-type
layer [Fig. 8.15(b)] which essentially forms the base of the final npn transistor.
The surface is re-oxidized and a smaller window is opened (following the same
procedure as before) over the p-layer. An n-type diffusion is now made to
convert part of the p-region back to n-type to form the emitter [Fig. 8.15(c)].
The surface is again oxidized and two new windows are opened to expose the
emitter and base regions and a metal (usually aluminium) is evaporated into
those windows forming electrical contact with these regions. The contact with
the original epi-layer which forms the collector, can be made through the
substrate [Fig. 8.15(d)].
Other devices can be produced by the variations of the essential steps
involved in the development of the npn transistor described.

Amorphous Semiconductors
There are many amorphous substances which have significant electrical
conductivity. They are known as amorphous semiconductors. In these materials,
the conduction is by electrons. They differ from the crystalline semiconductors
in that while they have short-range order, long-range order is absent in them.
This can be illustrated by amorphous Ge. In this case, though each atom is
surrounded by four nearest neighbours, the location of the second-nearest
neighbour is not unique. In the amorphous semiconductors, the different possible
locations of farther-away neighbours are almost randomly filled leading to
disorder at long range. The effect of this long-range disorder is not significant
for energy levels deep inside an energy band but is important for those near the
edges, e.g., those near the top of the valence band and bottom of the conduction
band. It leads to narrowing of the energy gap as compared with the gap in
crystalline semiconductors. Some amorphous semiconductors are Ge, Si, Se,
As₂Se₃, etc.
Amorphous semiconductors are used in switching and memory components.
They are also used in xerographic processes. Here, typically a thin film of
amorphous selenium is deposited on a metallic substrate, usually Al. It is charged
electrically by means of a discharge. When a pattern of light to be copied falls
on this, the lighted areas become photoconductive and discharge their charge
whereas the dark areas retain their charge. A finely-powdered pigment is sprayed
on the surface. It is retained by the charged areas and then transferred to a sheet
of paper.

8.6 MAGNETIC PROPERTIES


In this section, we discuss the magnetic properties of materials. They are related
to the spin and orbital angular momentum of the electrons, and are observed in
the form of (i) diamagnetism (ii) free-electron paramagnetism
(iii) paramagnetism of atoms, ions or molecules, (iv) ferromagnetism, and
(v) antiferromagnetism and ferrimagnetism. Of these the first three are rather
weak effects while the last two are strong effects which are also of great
technological importance.
The magnetic properties are discussed conveniently in terms of magnetic
susceptibility χ defined as
χ = M/H (8.85)
where M is the magnetic moment per unit volume, and H is the magnetic
intensity. The magnetic intensity H, the magnetic moment M and the magnetic
field B are related by
B = µ0(H + M) (8.86)
where µ0 is the vacuum permeability. Different magnetic properties, in particular
the behaviour of the associated magnetic moment and the magnetic susceptibility
will be discussed here.

Diamagnetism
When an atom is subjected to a magnetic field, the changing magnetic flux
induces currents (via the electron orbits) which, as per Lenz’s law, oppose the
change in flux. The currents persist, and have a magnetic moment which is
opposite in sign to the magnetic field intensity. The associated magnetic
susceptibility is negative and the property is known as diamagnetism.
Diamagnetism is present in all substances but is usually obscured by the larger
effects due to permanent magnetic dipole moments of the atoms.
Essentially, diamagnetism is the consequence of the term in the Hamiltonian which is
quadratic in B,
$$H_d = \frac{e^2}{8m}\sum_i (\mathbf{r}_i \times \mathbf{B})^2 = \frac{e^2}{8m}\sum_i (r_{i\perp})^2 B^2 \qquad (8.87)$$
where the summation is over all the electrons. In perturbation theory, the energy
due to this term is the average value

$$E = \frac{e^2}{8m} B^2 \sum_i \langle (r_{i\perp})^2 \rangle \qquad (8.88)$$
The magnetic moment in this case is defined by
$$m_0 = -\frac{\partial E}{\partial B} = -\frac{e^2}{4m} B \sum_i \langle (r_{i\perp})^2 \rangle \qquad (8.89)$$
The diamagnetic susceptibility, therefore, is
$$\chi = -\frac{e^2 \mu_0 N}{4m} \sum_i \langle (r_{i\perp})^2 \rangle \qquad (8.90)$$
where N is the number of atoms per unit volume. For the evaluation of this
quantity, an approximate value for ⟨(r_{i⊥})²⟩ is normally used. For a typical value
of r² ~ 10⁻²⁰ m², the molar susceptibility is
χm ~ 5 × 10⁻⁸/kg mol, in MKS units, (8.91)
(multiply by 10³/4π to get the value in Gaussian units of per g mole) which
means that diamagnetism is a small effect. It is significant mainly in atoms and
ions with closed shells, e.g., He, Ne, F⁻, Cl⁻, etc. which do not have a permanent
magnetic moment.
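
The order of magnitude in Eq. (8.91) can be reproduced with a few lines of arithmetic. The sketch below evaluates the magnitude of Eq. (8.90) for a single electron term with ⟨r⊥²⟩ = 10⁻²⁰ m², taking N as the number of atoms per kg mol; the function name is illustrative and the constants are standard SI values.

```python
E_CHARGE = 1.602176634e-19      # electron charge, C
E_MASS = 9.1093837e-31          # electron mass, kg
MU_0 = 1.25663706e-6            # vacuum permeability, SI units
AVOGADRO_KMOL = 6.02214076e26   # atoms per kg mol

def diamagnetic_molar_susceptibility(sum_r_perp_sq):
    """Magnitude of Eq. (8.90) with N taken as atoms per kg mol."""
    return E_CHARGE**2 * MU_0 * AVOGADRO_KMOL * sum_r_perp_sq / (4 * E_MASS)

chi_m = diamagnetic_molar_susceptibility(1e-20)
print(f"{chi_m:.1e}")   # of order 5e-8, as in Eq. (8.91)
```

For an atom with several electrons, `sum_r_perp_sq` would be the sum of ⟨r⊥²⟩ over all of them, which raises the estimate proportionally.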

Free-Electron Paramagnetism
Free-electron paramagnetism in metals arises from the intrinsic magnetic
moment associated with the spin of the electron. In the absence of any magnetic
field, there is no preferred orientation of these magnetic moments. However, in
the presence of a magnetic field, the energies of the electron are perturbed by an
additional interaction
$$H = -\left(-\frac{e}{m}\,\mathbf{s}\right)\cdot\mathbf{B} \qquad (8.92)$$
and the resulting energy eigenvalues are
$$\varepsilon' = \varepsilon \pm \frac{e\hbar}{2m} B \qquad (8.93)$$
ε being the unperturbed energy. The net magnetic moment is obtained by using
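the Fermi-Dirac distribution:
$$M = \frac{2\pi V (2m)^{3/2}}{h^3}\int \left[\frac{-e\hbar/2m}{1+\exp\left[\left(\varepsilon+\frac{e\hbar}{2m}B-\varepsilon_f\right)/kT\right]} + \frac{e\hbar/2m}{1+\exp\left[\left(\varepsilon-\frac{e\hbar}{2m}B-\varepsilon_f\right)/kT\right]}\right]\varepsilon^{1/2}\,d\varepsilon \qquad (8.94)$$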
For T → 0, this expression reduces to
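$$M = \frac{2\pi V (2m)^{3/2}}{h^3}\left(\frac{e\hbar}{2m}\right)\int_{\varepsilon_f - e\hbar B/2m}^{\varepsilon_f + e\hbar B/2m}\varepsilon^{1/2}\,d\varepsilon \approx \frac{3}{2}\left(\frac{e\hbar}{2m}\right)^2\frac{NB}{\varepsilon_f(0)} \qquad (8.95)$$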
where Eq. (7.89) has been used. The susceptibility therefore is positive and
given by
$$\chi \approx \frac{3}{2} N \left(\frac{e\hbar}{2m}\right)^2\frac{\mu_0}{\varepsilon_f(0)} \qquad (8.96)$$
This is generally quite small and for sodium [εf(0) ~ 3.1 eV] the susceptibility
per unit mass is 8.3 × 10⁻⁹ kg⁻¹ in MKS units or 6.6 × 10⁻⁷ g⁻¹ in Gaussian units.
On including the corrections due to exchange correlation and effective mass,
the value is 8.8 × 10⁻⁷ g⁻¹ in Gaussian units, which should be compared with the
experimental spin susceptibility of 9.8 × 10⁻⁷ g⁻¹. For obtaining bulk
susceptibility, the diamagnetic susceptibility due to the free electrons and the
ions should also be included.
At finite temperature, there is a slight dependence of χ on T which, for all
practical purposes, may be neglected.
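
A quick numerical check of Eq. (8.96) for sodium is sketched below. The Fermi energy of 3.1 eV is the value quoted above; the conduction-electron density (about 2.65 × 10²⁸ m⁻³) and the mass density (about 971 kg/m³) are assumed inputs that do not appear in the text, so the result should only be compared with the quoted figure in order of magnitude.

```python
E_CHARGE = 1.602176634e-19   # electron charge, C
E_MASS = 9.1093837e-31       # electron mass, kg
HBAR = 1.054571817e-34       # reduced Planck constant, J s
MU_0 = 1.25663706e-6         # vacuum permeability, SI units

def pauli_susceptibility(n_per_m3, fermi_energy_ev):
    """Eq. (8.96): chi = (3/2) N (e*hbar/2m)^2 mu_0 / eps_f(0), dimensionless."""
    mu_b = E_CHARGE * HBAR / (2 * E_MASS)        # e*hbar/2m
    return 1.5 * n_per_m3 * mu_b**2 * MU_0 / (fermi_energy_ev * E_CHARGE)

# Assumed sodium inputs (not given in the text).
chi_volume = pauli_susceptibility(2.65e28, 3.1)
chi_mass = chi_volume / 971.0                    # per unit mass
print(f"{chi_volume:.2e}  {chi_mass:.2e}")
```

With these inputs the susceptibility per unit mass comes out within roughly ten percent of the 8.3 × 10⁻⁹ quoted above; the residual difference depends on the assumed densities.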

Paramagnetism
Atoms, ions and compounds with unpaired electrons (this is the case if the
number of electrons is odd and also for some systems with even number of
electrons), have a nonzero magnetic moment. In the presence of a magnetic
field, they align with the magnetic field and produce a net, macroscopic magnetic
moment, giving rise to paramagnetism. Since the atoms (most of the subsequent
discussion applies to ions and molecules as well) are localized, Boltzmann
distribution can be used for the electron states. This gives rise to a temperature-
dependent susceptibility.
As was discussed in Sec. 6.2 the energy due to the interaction of an atom
with a magnetic field is
$$\Delta E = g\,\frac{e\hbar}{2m}\,M_J B, \qquad M_J = -J, -J+1, \ldots, J \qquad (8.97)$$
where g is the Landé g-factor [Eq. (6.13)] and MJ is the z-component of the total
angular momentum. Using Boltzmann distribution for the populations, the
magnetic moment per unit volume is
$$M = -N g\,\frac{e\hbar}{2m}\,\frac{\sum_{M_J=-J}^{J} M_J \exp(-aM_J/kT)}{\sum_{M_J=-J}^{J}\exp(-aM_J/kT)} \qquad (8.98)$$
where $a = \frac{e\hbar}{2m} gB$ and N is the number of particles per unit volume. The
summations can be carried out to yield
$$M = N\left(\frac{e\hbar g}{2m}\right)\left[\frac{d}{dx}\ln f(x)\right]_{x=a/kT}, \qquad f(x) = \frac{\exp\left[\left(J+\tfrac{1}{2}\right)x\right]-\exp\left[-\left(J+\tfrac{1}{2}\right)x\right]}{e^{x/2}-e^{-x/2}} \qquad (8.99)$$

For $\frac{e\hbar}{2m}B \ll kT$, the expression leads to a susceptibility
$$\chi = M/H = N\mu_0\left(\frac{e\hbar g}{2m}\right)^2\frac{J(J+1)}{3kT} \qquad (8.100)$$
It is observed that the susceptibility is inversely proportional to temperature.
This is stated in the form
χ = C/T (8.101)
known as Curie’s law, where C is called the Curie constant.
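
The step from the Boltzmann sums of Eq. (8.98) to the closed form of Eq. (8.99), and then to the Curie limit, can be verified numerically. The sketch below works with the dimensionless quantity d ln f/dx (so that M = N g (eħ/2m) d ln f/dx); the function names are illustrative.

```python
import math

def dlnf_dx(x, j):
    """d/dx ln f(x) for f(x) of Eq. (8.99), written with hyperbolic cotangents."""
    if abs(x) < 1e-4:
        return j * (j + 1) * x / 3.0   # leading term of the small-x expansion
    return (j + 0.5) / math.tanh((j + 0.5) * x) - 0.5 / math.tanh(0.5 * x)

def direct_average(x, j):
    """The same quantity from the Boltzmann sums of Eq. (8.98), with x = a/kT."""
    n_levels = int(round(2 * j)) + 1
    m_values = [-j + k for k in range(n_levels)]
    weights = [math.exp(-x * m) for m in m_values]
    return -sum(m * w for m, w in zip(m_values, weights)) / sum(weights)

j = 2.5
print(dlnf_dx(0.7, j), direct_average(0.7, j))   # the two forms agree
print(dlnf_dx(0.01, j), j * (j + 1) * 0.01 / 3)  # Curie (small-field) limit
print(dlnf_dx(50.0, j))                          # saturation, approaches J
```

In the small-x regime the result is linear in x = a/kT, which is exactly the 1/T dependence of Curie's law; at large x the moment saturates at J per atom.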
At T ≈ 300 K, molar susceptibility χm is of the order of 5 × 10⁻⁷/kg mol in
MKS units (4 × 10⁻⁵/g mol in Gaussian units), which is rather small, but it
becomes much larger at low temperatures. The predictions of Eq. (8.100) with
the J values given by Hund’s rules (the ground state has the largest S allowed by
the Pauli principle, the maximum L consistent with this S, and J = L + S when the
shell is more than half full and J = |L – S| otherwise) are generally in good
agreement with the experimental observations for many paramagnetic crystals,
e.g., rare earth ions, where in some cases the effect of the nearby states has to be
included.
The predictions of Eq. (8.100) are not in good agreement with experimental
observations for the ions of the iron group. The reason for this is that the partially
filled 3d shell for these ions is the outermost shell and is exposed to the strong
field due to the neighbouring ions in the crystal. This field, called the crystal
field, breaks the rotational symmetry, and the total angular momentum is no
longer a ‘good’ quantum number. Furthermore, the average value of Lz may
reduce to zero. This effect is known as the quenching of the orbital angular
momentum and implies that Eq. (8.97) should be replaced by
$$\Delta E = g\,\frac{e\hbar}{2m}\,M_S B \quad (g = 2 \text{ for spin}) \qquad (8.102)$$
This leads to
$$\chi = N\mu_0\left(\frac{e\hbar g}{2m}\right)^2\frac{S(S+1)}{3kT}, \qquad \frac{e\hbar B}{m} \ll kT \qquad (8.103)$$
for the ions of the iron group. As an example, in the case of χ for Mn³⁺ (⁵D₀),
Eq. (8.100) predicts that χ = 0, whereas the prediction of Eq. (8.103) with S = 2
is in very good agreement with experimental observations.
It may be noted that adiabatic demagnetization of a paramagnetic system can
be used for attaining low temperatures, T < 1 K. This is done as follows. A
magnetic field is applied to a paramagnetic substance in good thermal contact
with the surroundings at T1. The field aligns the magnetic moments along the
direction of the field. This increase in order is equivalent to a decrease in the
entropy and hence heat flows out of the system. If now the substance is insulated,
and the field removed adiabatically, the spins gradually get out of the alignment
by absorbing energy from the lattice vibrations, which leads to a lowering of the
temperature of the paramagnetic substance. Temperatures of the order of 10⁻³ K
have been reached by this method.

Ferromagnetism
Ferromagnetism is the phenomenon in which some materials like iron, cobalt,
nickel, and some of their alloys behave like ordinary paramagnets at high
temperatures but which below a critical temperature known as the Curie
temperature Tc, acquire a nonzero magnetic moment even in the absence of an
applied magnetic field. This is due to the interaction between the magnetic
ions, which is strong enough to align their magnetic moments against the disorder
introduced by thermal effects.
The interaction that aligns the magnetic moments is quantum mechanical
in origin and is due to the exchange properties of the electron wave functions.
When the wave functions of two atoms overlap, the electrons being
indistinguishable, belong to both the atoms. In such cases, the symmetry or the
antisymmetry of the wave functions will strongly influence the energy of the
system (as in the case of covalent bonding, see Chapter 5). In particular, it is the
exchange symmetry between the spins and the extent of the overlap of the wave
functions that determines the nature and the strength of the exchange interaction.
It is reasonable to represent the energy from the exchange interaction by
$$E = -\sum_{i,j} J_{ij}\,\mathbf{S}_i\cdot\mathbf{S}_j, \qquad i \neq j \qquad (8.104)$$

where Si is the spin of the i-th atom, and Jij are symmetric constants. If the
magnetic moment is assumed to be due to spin alone, Mi = b Si, as is the case
for the iron group, the interaction energy of the i-th atom can be written as
Ei = – Mi ⋅ Bint (8.105)
where Bint is given by
$$\mathbf{B}_{int} = \frac{1}{b^2}\sum_j J_{ij}\,\mathbf{M}_j = \lambda \mathbf{M}, \qquad i \neq j \qquad (8.106)$$
i.e., the effective internal field is proportional to an average magnetic moment M.
It is this field, known as the Weiss field, which is responsible for the alignment
of the spins. In the case of ferromagnetic substances, Jij are quite large and λ is
positive, which gives rise to ferromagnetism.
Consider the behaviour of ferromagnets above the Curie temperature Tc.
Writing
B = B0 + λM (8.107)
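where B0 is the applied field, one gets from Eq. (8.99),
$$M = N\left(\frac{e\hbar g}{2m}\right)^2\frac{J(J+1)}{3kT}(B_0+\lambda M), \qquad \frac{e\hbar}{2m}B \ll kT \qquad (8.108)$$
This leads to
$$M = \frac{C B_0/\mu_0}{T - T_c} \qquad (8.109)$$
$$\chi = \frac{C}{T - T_c} \qquad (8.110)$$
with
$$T_c = N\left(\frac{e\hbar g}{2m}\right)^2\frac{J(J+1)}{3k}\lambda, \qquad C = \frac{\mu_0 T_c}{\lambda} \qquad (8.111)$$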
The expression in Eq. (8.110) is known as the Curie-Weiss law and Tc is
known as the Curie temperature. The behaviour of χ given in Eq. (8.110) is
valid for T > Tc. At T = Tc, χ becomes infinite. Since M is finite, this implies that
M is nonzero even when B0 = 0, i.e., spontaneous magnetization exists. The
Curie temperature is about 1043 K for Fe, 1400 K for Co, and 631 K for Ni.
Below T = Tc, the spontaneous magnetization is to be obtained by using the
complete expression in Eq. (8.99) for M, with
$$a = \frac{e\hbar g}{2m}(B_0+\lambda M) \qquad (8.112)$$
The solutions are obtained by plotting M in Eq. (8.99) as a function of x,
and also
$$M = \left(\frac{2mkT}{e\hbar g\lambda}\right)x, \qquad (8.113)$$
obtained from Eq. (8.112) with a = xkT, B0 = 0, and looking for the intersection
of the two curves [see Fig. 8.16(a), it can be shown that the intersection at the
origin gives an unstable solution]. At T = Tc the curve given by Eq. (8.113) is
tangential to the curve given by Eq. (8.99), at the origin, and there is no
spontaneous magnetization for T > Tc. When T < Tc, there are two equal and
opposite solutions for each T, for example corresponding to points A and A′ in
Fig. 8.16(a). One set of these spontaneous magnetizations is plotted as a function
of T (T < Tc) in Fig. 8.16(b).
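
The graphical construction described above can also be carried out numerically. The sketch below works in reduced units, M/M_sat with M_sat = N g (eħ/2m) J; in these units, using the expression for Tc in Eq. (8.111), the straight line of Eq. (8.113) becomes (T/Tc)(J + 1)x/3. The function names and the use of bisection are illustrative choices, not part of the text.

```python
import math

def reduced_moment(x, j):
    """M / M_sat from Eq. (8.99): (1/J) d/dx ln f(x)."""
    if x < 1e-4:
        return (j + 1) * x / 3.0   # leading small-x term
    return ((j + 0.5) / math.tanh((j + 0.5) * x)
            - 0.5 / math.tanh(0.5 * x)) / j

def spontaneous_magnetization(t_over_tc, j):
    """Nonzero intersection of Eq. (8.99) with the line of Eq. (8.113)."""
    if t_over_tc <= 0.0:
        raise ValueError("t_over_tc must be positive")
    if t_over_tc >= 1.0:
        return 0.0                 # only the intersection at the origin remains
    slope = t_over_tc * (j + 1) / 3.0
    # The curve lies above the line near the origin and below it once the
    # line has reached saturation (x = 1/slope), so a root is bracketed.
    lo, hi = 1e-9, 1.0 / slope
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reduced_moment(mid, j) > slope * mid:
            lo = mid
        else:
            hi = mid
    return slope * 0.5 * (lo + hi)

for t in (0.2, 0.5, 0.8, 0.95):
    print(t, spontaneous_magnetization(t, 0.5))
```

For J = 1/2 the condition reduces to m = tanh(m Tc/T), which gives a convenient independent check of the routine.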
For B0 ≠ 0, the magnetization is obtained from the intersection of the curve
given by Eq. (8.99) and
$$M = \left(\frac{2mkT}{e\hbar g\lambda}\right)x - \frac{1}{\lambda}B_0 \qquad (8.114)$$
obtained from Eq. (8.112) with a = xkT. There are two solutions for
M corresponding to intersections at D and D′ in Fig. 8.16(a), for each B0 (T <
Tc). These solutions trace the boundary of the hysteresis curve [Fig. 8.16(c)]. It
can be shown that the third solution corresponding to intersection F in Fig.
8.16(a) is unstable. This solution is the extension of the unstable solution at the
origin for B0 = 0.
At T = 0, all the spins and the magnetic moments of the atoms are aligned
corresponding to the ground state of the system. The direction of the alignment
is introduced by the arbitrarily assumed direction of the internal field Bint, and
obviously the ground state is infinitely degenerate.
For T > 0 K, some of the spins go out of alignment. As in the case of lattice
vibrations, the disturbances are correlated, and misalignments travel as waves
known as spin waves. In analogy with photons and phonons, the excitations of
spin waves are quantized into quanta known as magnons. Magnons, which obey
Bose-Einstein statistics, play a significant role in determining the behaviour of
M(T ) at low temperatures, and contribute to the specific heat and thermal
conductivity of ferromagnets.

Fig. 8.16 Ferromagnetism, (a) determination of M for B0 = 0 and B0 ≠ 0, (b) spontaneous magnetization as a function of T for T < Tc, (c) M as a function of H for T < Tc.

An important property of ferromagnets is that for T < Tc, they are usually
not magnetized. They become magnetized under the influence of an external
magnetic field. This behaviour is explained by postulating that a ferromagnet is
usually subdivided into what are known as domains. Each of these domains is
spontaneously magnetized but the direction of magnetization may be different
in different domains. This situation is energetically favourable since in addition
to the energy of the atoms, energy is stored in the magnetic field also,
E = ½ H ⋅ B, and this energy is reduced if the alignment changes from domain
to domain. The division into smaller and smaller domains is ultimately restrained
by the fact that the formation of domain walls requires additional energy. If the
ferromagnet is subjected to an external field, the domain walls move in such a
way that the domains with magnetization nearly parallel to the external field
grow. For small fields, this movement is reversible. However, if the external
field is sufficiently strong, the walls may move irreversibly over potential barriers
and finally all the domains will have magnetization along a preferred or an
‘easy’ direction (determined by the crystal structure) nearly parallel to the
direction of the external field. A further increase in the field brings the alignment
closer to the direction of the external field. The progress of magnetization as
the external field increases is described by the broken line in Fig. 8.16(c). If the
external field is now removed, some of the magnetization is retained [point A in
Fig. 8.16(c)]. The variation of magnetization with the external magnetic field
now produces the well-known hysteresis curve, similar to that in Fig. 8.16(c)
except that usually B is plotted against H. The magnetization can be destroyed
either by heating or by mechanical shocks. The reality of the domains, which
are so successful in describing ferromagnets, can be demonstrated by scattering
finely divided iron on the surface of a ferromagnet, which collects along the
domain boundaries where the field is the strongest.

Antiferromagnetism and Ferrimagnetism


In most substances other than ferromagnets, the exchange energy in Eq. (8.104)
is positive when the neighbouring spins are parallel. This leads to a ground
state in which the neighbouring spins are antiparallel and hence to a cooperative
alternating alignment below temperature TN known as the Néel temperature.
The problem can be analysed in terms of alternating lattice points occupied by
atoms A and B with antiparallel spins.
Let the average magnetic moments of A and B sets of atoms be MA and MB
respectively. The total magnetic fields at sites A and B (for antiparallel alignment)
are:
BA = B0 – λAMB,
BB = B0 – λBMA (8.115)
Substituting these expressions in Eq. (8.99), for temperature above the Néel
temperature, the magnetic moments are:
$$M_A = \frac{C_A}{T}(B_0 - \lambda_A M_B), \qquad M_B = \frac{C_B}{T}(B_0 - \lambda_B M_A) \qquad (8.116)$$
where
$$C_{A,B} = N_{A,B}\left(\frac{e\hbar g_{A,B}}{2m}\right)^2\frac{J_{A,B}(J_{A,B}+1)}{3k} \qquad (8.117)$$
If the atoms A and B are magnetically equivalent, then
$$\chi = (M_A + M_B)/H = \frac{2\mu_0 C_A}{T + T_c}, \qquad T > T_c \qquad (8.118)$$
with Tc = λACA, λA = λB, CA = CB, where Tc is the Néel temperature TN. At T = Tc,
an additional solution to Eq. (8.116) exists, MA = – MB even for B0 = 0. Thus, the
sets of atoms A and B are spontaneously magnetized for T < Tc though the net
magnetization is zero. Such materials are known as antiferromagnets. In
antiferromagnets, the crystal field aligns the magnetic moments along a preferred
direction below Tc. It can be shown that a field applied perpendicular to this
direction is associated with a susceptibility χ⊥ which is essentially independent
of T (T < Tc) whereas a field applied parallel to the preferred direction, is associated
with a susceptibility χ∥ which is equal to χ⊥ at T = Tc but decreases to zero as
T → 0 K.
If the atoms A and B are magnetically inequivalent, Eq. (8.116) can be
solved for MA and MB and they lead to a susceptibility

$$\chi = \frac{\mu_0\,[T(C_A + C_B) - C_A C_B(\lambda_A + \lambda_B)]}{(T - T_c)(T + T_c)} \qquad (8.119)$$
where Tc = (CA CB λA λB)^{1/2}. Thus, χ tends to infinity as T → Tc, which implies that
there is a net spontaneous magnetization for T < Tc (MA and MB are opposite in
sign but MA + MB ≠ 0). Such materials are called ferrimagnets, Fe₃O₄ being a
well known example. An important class of ferrimagnets is the ferrites which
have the formula M²⁺Fe₂³⁺O₄²⁻ where M is a member of the first transition
group. They are of great technical importance since they may have large
magnetization at room temperature, and high resistivity. They are therefore more
suitable than ferromagnets for use at high frequencies when eddy current losses
are a serious problem. They are also used for memory storage in computers.
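
The algebra leading from Eq. (8.116) to Eqs. (8.118) and (8.119) can be checked by solving the two coupled linear equations directly. The sketch below does this with Cramer's rule; the numerical values of the Curie constants and coupling constants are arbitrary illustrative inputs in consistent units, not material data.

```python
MU_0 = 1.25663706e-6   # vacuum permeability, SI units

def sublattice_moments(t, b0, c_a, c_b, lam_a, lam_b):
    """Solve the two linear equations (8.116) for (M_A, M_B) by Cramer's rule."""
    det = t * t - c_a * c_b * lam_a * lam_b
    m_a = c_a * (t - c_b * lam_a) * b0 / det
    m_b = c_b * (t - c_a * lam_b) * b0 / det
    return m_a, m_b

def chi_closed_form(t, c_a, c_b, lam_a, lam_b):
    """Eq. (8.119), with Tc = (C_A C_B lam_A lam_B)^(1/2)."""
    t_c = (c_a * c_b * lam_a * lam_b) ** 0.5
    numerator = t * (c_a + c_b) - c_a * c_b * (lam_a + lam_b)
    return MU_0 * numerator / ((t - t_c) * (t + t_c))

# Illustrative inequivalent sublattices (Tc = 120 in these units).
t, b0 = 500.0, 1.0
m_a, m_b = sublattice_moments(t, b0, 2.0, 3.0, 40.0, 60.0)
print(MU_0 * (m_a + m_b) / b0, chi_closed_form(t, 2.0, 3.0, 40.0, 60.0))
```

Setting the two sublattices equal in the same routine reproduces the antiferromagnetic form 2µ0CA/(T + Tc) of Eq. (8.118).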

8.7 DIELECTRIC PROPERTIES


In this section, the properties of solids in the presence of an external electric
field are discussed. These are important in the propagation of electromagnetic
waves in material media, and in the development of many devices such as
capacitors, microphones, etc.
An electric field E causes a relative displacement of the positive and negative
charges of a material. This induces an electric dipole moment which is expressed
in terms of the polarization P defined as the dipole moment per unit volume.
For ordinary electric fields, the polarization is linear in E (it is nonlinear for
strong laser fields),
P = ε0 χe E (8.120)
where ε0 is the permittivity of the vacuum and χe is the electric susceptibility. It
is convenient to define an atomic polarizability α by
P = N α Eloc (8.121)
where N is the number of atoms per unit volume and Eloc is the effective field at
the atom, not including the field due to the atom itself. A displacement vector
D can also be defined as
D ≡ ε ε0 E
≡ ε0 E + P (8.122)
where ε is the dielectric constant. It can be shown that for a dielectric material
with an isotropic or cubic (including simple cubic, bcc, fcc) structure, filling a
parallel plate capacitor,
$$\mathbf{E}_{loc} = \mathbf{E} + \frac{1}{3\varepsilon_0}\mathbf{P} \qquad (8.123)$$
This allows us to eliminate Eloc in Eq. (8.121). Solving for P from
Eqs. (8.121) and (8.122), and equating the two expressions gives
$$\frac{N\alpha}{3\varepsilon_0} = \frac{\varepsilon - 1}{\varepsilon + 2} \qquad (8.124)$$
This is the Clausius-Mossotti formula, relating the atomic polarizability α
to the macroscopic dielectric constant ε.
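
As a small illustration, Eq. (8.124) can be used in both directions: from a measured dielectric constant to the combination Nα/3ε0, and back. The sketch below is a minimal version of this; the function names and the sample values are illustrative.

```python
def reduced_polarizability(eps):
    """Eq. (8.124) as written: N*alpha / (3*eps0) = (eps - 1) / (eps + 2)."""
    return (eps - 1.0) / (eps + 2.0)

def dielectric_constant(s):
    """Eq. (8.124) solved for eps, with s = N*alpha / (3*eps0) and s < 1."""
    return (1.0 + 2.0 * s) / (1.0 - s)

s = reduced_polarizability(4.0)
print(s, dielectric_constant(s))         # round trip back to eps = 4
for s in (0.9, 0.99, 0.999):
    print(s, dielectric_constant(s))     # eps grows without bound as s -> 1
```

The second loop shows the feature of this formula that matters most physically: the dielectric constant diverges as Nα/3ε0 approaches unity.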
For a nonmagnetic material, ε = n2, n being the refractive index, so that

$$\frac{n^2 - 1}{n^2 + 2} = \frac{1}{3\varepsilon_0}\sum_i N_i\alpha_i \qquad (8.125)$$
where a summation over i has been introduced to include the possibility that
several mechanisms contribute to the polarizability. For the polarizability of an
atom, contributions are electronic, ionic and orientational. It is possible to
experimentally separate the different contributions by observing the polarizability
as a function of frequency and by noting that the different contributions are
generally significant in different ranges. This is illustrated in Fig. 8.17(a). The
rapid changes in the polarizability are also accompanied by large absorption of
radiation [Fig. 8.17(b)].

Electronic Polarizability
The electronic contribution to polarizability arises from the displacement of
electrons in an atom, relative to the nucleus.

Fig. 8.17 Schematic diagram illustrating the variation of (a) polarizability (orientational, ionic and electronic contributions) and (b) absorption as a function of ν.

Consider the behaviour of a bound electron in the presence of an
electromagnetic field E cos (ωt) in the z-direction. It was pointed out in Sec. 6.3,
that this introduces a potential
V = e E z cos ωt (8.126)
as a result of which the state changes to the expression in Eq. (6.41) with am (t)
given by Eq. (6.48). The resulting dipole moment is given by
p = – e〈ψ(t) |z| ψ(t)〉 (8.127)
which on using Eqs. (6.41) and (6.48), after some analysis, gives, to leading
order in E
$$p \approx \frac{e^2}{\hbar} E\cos\omega t\sum_j |z_{j0}|^2\left[\frac{1}{\omega_{j0}+\omega}+\frac{1}{\omega_{j0}-\omega}\right] \qquad (8.128)$$
where ωj0 = (Ej – E0)/ħ. From this, the polarizability is
$$\alpha = \frac{e^2}{\hbar}\sum_j |z_{j0}|^2\left[\frac{1}{\omega_{j0}+\omega}+\frac{1}{\omega_{j0}-\omega}\right] \qquad (8.129)$$
It is observed that α takes sudden jumps whenever ω = (Ej – E0)/ħ. Also at
ω = (Ej – E0)/ħ, there are transitions to the state j, resulting in an absorption of
radiation. Actually the singularity at ω = ωj0 is displaced by the fact that the
state j is an unstable state which essentially requires a replacement of
(ωj0 – ω)⁻¹ by the real part of (ωj0 – ω – i/2τj)⁻¹, i.e.,
$$\frac{1}{\omega_{j0}-\omega} \to \frac{\omega_{j0}-\omega}{(\omega_{j0}-\omega)^2+(2\tau_j)^{-2}} \qquad (8.130)$$
where τj is the lifetime of state j [see Eq. (6.76)]. With this modification,
Eq. (8.129) provides a qualitative explanation of the polarizability illustrated in
Fig. 8.17(a). It may also be noted that when ωj0 ≈ ω, there is a significant
probability for transition to state j, as seen from the expression in Eq. (6.55) for
aj, which leads to absorption of radiation [Fig. 8.17(b)]. While Eq. (8.129) is
valid for a general charged system, it is practically useful mainly for electronic
polarizability for which zj0 can be calculated with some reliability. A rough
order of magnitude estimation for this electronic polarizability gives

$$\alpha = \frac{(1.6\times10^{-19})^2\,(10^{-10})^2}{(1.6\times10^{-19})} \approx 1.6\times10^{-39}\ \mathrm{F\,m^2} \qquad (8.131)$$

Ionic Polarizability
The ionic polarizability is due to the displacement of ions with respect to each
other. If it is assumed that the forces near equilibrium are simple harmonic, the
displacement in the presence of an electric field is given by
k ∆ x ≈ eE (8.132)
where k is the force constant. This leads to a polarizability
α ≈ e2/k (8.133)
Since k ≈ 20 N/m, αionic ≈ 10⁻³⁹ F·m².
The ionic contribution is important at low frequencies (ωj0 in Eq. (8.129) is
small). This explains the fact that NaCl has ε ≈ 5.6 at low frequencies whereas
at optical frequencies ε ≈ 2.25. The difference may be ascribed to the ionic
contribution to polarizability (see Fig. 8.17).
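
The two order-of-magnitude estimates above, Eq. (8.131) for the electronic part and Eq. (8.133) for the ionic part, amount to the following arithmetic. In this sketch the factors in Eq. (8.131) are read as e² |z|² / (Ej − E0) with |z| ~ 10⁻¹⁰ m and an excitation energy of about 1.6 × 10⁻¹⁹ J (1 eV); that reading, and the function names, are assumptions made for illustration.

```python
E_CHARGE = 1.6e-19   # electron charge in C, rounded as in the text

def electronic_alpha(z_sq=1e-20, excitation_energy=1.6e-19):
    """alpha ~ e^2 |z|^2 / (E_j - E_0), the estimate behind Eq. (8.131)."""
    return E_CHARGE**2 * z_sq / excitation_energy

def ionic_alpha(force_constant=20.0):
    """alpha ~ e^2 / k from Eq. (8.133), with k in N/m."""
    return E_CHARGE**2 / force_constant

print(f"{electronic_alpha():.2e}")   # electronic estimate, F m^2
print(f"{ionic_alpha():.2e}")        # ionic estimate, F m^2
```

Both numbers come out near 10⁻³⁹ F·m², which is why the ionic contribution can change the low-frequency dielectric constant by an amount comparable to the electronic one.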

Orientational Polarizability
Molecules with permanent electric dipole moment align themselves in the
presence of an external electric field giving rise to an orientational polarizability.
The energy of a dipole p in an electric field E is
V = – p E cos θ (8.134)
where θ is the angle between the dipole and the field. Therefore, the average
dipole moment (using Boltzmann distribution) is
$$\langle p\rangle = p\,\frac{\int \cos\theta\,\exp\left(\frac{pE\cos\theta}{kT}\right)d\cos\theta}{\int \exp\left(\frac{pE\cos\theta}{kT}\right)d\cos\theta} = p\left[\frac{d}{da}\ln\left(\frac{e^{a}-e^{-a}}{a}\right)\right]_{a=pE/kT} \qquad (8.135)$$
For pE << kT, ⟨p⟩ ≈ p²E/3kT which leads to
α = p²/3kT (8.136)
which at room temperatures is of the order of 10–39 F.m2, comparable to the
electronic polarizability. It is distinguished by its temperature dependence, and
suggests a relation
αtot = α0 + p2/3kT (8.137)
This expression is quite useful in determining dipole moments of dipolar
substances, e.g., HCl (p = 1.1 debyes, 1 debye = 10–39 C.m), by looking at the
temperature dependence of αtot. It is tempting to substitute Eq. (8.136) in
Eq. (8.124) to obtain
$$\varepsilon = 1 + \frac{3T_c}{T - T_c}, \qquad T_c = \frac{Np^2}{9\varepsilon_0 k} \tag{8.138}$$
which would imply that spontaneous polarization (E = 0) sets in at T = Tc. It has
however been shown by Onsager that Eqs. (8.123) and (8.124) are not valid for
permanent electric dipoles. The theory of Onsager for permanent dipoles does
not imply the existence of a critical temperature for such dipoles.
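The weak-field limit used in Eq. (8.136) can be checked numerically. The sketch below evaluates the bracket in Eq. (8.135), which reduces to coth a − 1/a, and compares it with a/3. It also evaluates p²/3kT for the HCl dipole moment quoted above. The temperature of 300 K, the test value a = 0.01 and the function names are assumptions made for illustration only.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
DEBYE = 3.33564e-30     # 1 debye expressed in C.m

def langevin(a):
    """Return <p>/p = coth(a) - 1/a, the bracket of Eq. (8.135)."""
    if abs(a) < 1e-4:
        return a / 3.0              # leading series term, avoids 0/0 at a = 0
    return 1.0 / math.tanh(a) - 1.0 / a

def orientational_polarizability(p, temperature):
    """Weak-field polarizability p^2 / 3kT of Eq. (8.136), in F.m^2."""
    return p * p / (3.0 * K_B * temperature)

p_hcl = 1.1 * DEBYE                                        # value quoted for HCl
alpha_hcl = orientational_polarizability(p_hcl, 300.0)     # T = 300 K is assumed
a_small = 0.01                                             # illustrative pE/kT
ratio = langevin(a_small) / (a_small / 3.0)                # close to 1 for pE << kT

print("alpha(HCl) in F.m^2:", alpha_hcl)
print("Langevin / (a/3) at a = 0.01:", ratio)
```

With these inputs the polarizability comes out close to 10⁻³⁹ F·m², in line with the order of magnitude quoted in the text.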

Ferroelectric Crystals
The phenomenon of spontaneous polarization (E = 0), known as ferroelectricity
is observed in (i) Rochelle salt and some of the associated salts, (ii) some crystals
with hydrogen bonds, in which the motion of protons gives rise to ferroelectric
behaviour (e.g., KH2PO4, RbH2PO4, etc.) and (iii) ionic crystals with perovskite
(CaTiO3) and ilmenite (FeTiO3) structures. The perovskite structure illustrated
by BaTiO3 is the simplest structure which exhibits ferroelectricity—it has a
cubic structure with Ba at the corners, oxygen at the face centres and Ti at the
body centre. Ferroelectricity in barium titanate (BaTiO3) is briefly described
here.
Barium titanate becomes ferroelectric at 380 K and exhibits hysteresis curves
for T < Tc, in the plot of D against E. The ferroelectricity in BaTiO3 is due to
induced electronic and ionic dipole moments. From Eq. (8.124)
$$\varepsilon = \frac{1 + \frac{2}{3\varepsilon_0}\sum_i N_i \alpha_i}{1 - \frac{1}{3\varepsilon_0}\sum_i N_i \alpha_i} \tag{8.139}$$
Now, the contribution of electronic polarizabilities to (1/3ε₀) Σᵢ Nᵢαᵢ is about
0.61. Assuming that the ionic contribution is about 0.39 (estimations show that
this is not unreasonable), the dielectric constant tends to infinity. Expanding
the denominator as a function of temperature gives

$$\varepsilon = \frac{3/\beta}{T - T_c}, \qquad \beta = -\frac{1}{3\varepsilon_0}\left[\frac{\partial}{\partial T}\sum_i N_i \alpha_i\right]_{T = T_c} \tag{8.140}$$
Estimates of β agree well with experimental observations, i.e., 3/β ~ 10⁵ K.
It may be observed that the (incorrect) expression in Eq. (8.138) for dipolar
atoms would have given a value of 3Tc ~ 1140 K for the residue, considerably
smaller than the observed residue, which again rules out the explanation in
terms of dipolar atoms.
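The way Eq. (8.139) diverges as the sum approaches unity can be seen in a few lines. In this sketch the symbol x stands for (1/3ε₀) Σᵢ Nᵢαᵢ; the function name and the sample values of x other than 0.61 are illustrative assumptions.

```python
def dielectric_constant(x):
    """Eq. (8.139) written in terms of x = (1 / 3 eps0) * sum_i N_i alpha_i."""
    return (1.0 + 2.0 * x) / (1.0 - x)

# Electronic contribution alone (x = 0.61, as quoted in the text).
eps_electronic = dielectric_constant(0.61)

# As the ionic part pushes x towards 1 the denominator vanishes.
for x in (0.61, 0.90, 0.99, 0.999):
    print(x, dielectric_constant(x))
```

The numerator tends to 3 as x tends to 1, which is where the factor 3 in Eq. (8.140) comes from.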
It may also be noted that (i) the description of the hysteresis, etc. in terms
of domains is valid for ferroelectricity, and (ii) antiferroelectricity is observed
e.g., in WO3, PbZrO3, etc.

Piezo-electricity
Some crystals when deformed by an external stress develop a net dipole moment
which produces surface polarization charges. This is known as piezo-electricity.
Piezo-electric materials exhibit the converse effect as well, i.e., they are distorted
when placed in an electric field. The strain produced however is very small. For
example in quartz which is the most common piezo-electric substance, an electric
field of 10⁴ V/m produces a strain of only 1 part in 10⁸. Of course, this also
means that even a small strain can produce enormous electric fields.
When a crystal is subjected to a strain, there is a displacement of the ions in
the crystal. If the charge distribution in the crystal does not have inversion
symmetry about a centre, a net polarization of charges may develop giving rise
to piezo-electricity. For example, an equilateral triangle with + 3 charge at the
centre and – 1 charge each at the vertices will have zero dipole moment. Under
strain, the bond lengths may remain the same but make unequal angles with
each other giving rise to nonzero dipole moment.
Apart from quartz, other examples of piezo-electric materials are Rochelle
salt, barium titanate (BaTiO3), etc. In fact, all ferroelectric materials are piezo-
electric though the converse is not true. Piezo-electric materials are used to
convert electrical energy into mechanical energy and conversely, i.e., as
transducers. In particular, they are used in devices such as gramophone pickups,
microphones, strain gauges, etc. while the converse effect is used in ultrasonic
generators.

8.8 EXAMPLES
Here, some examples that illustrate and extend the ideas about the solid state
are considered.

Example 1
The bulk modulus of a crystalline solid can be estimated from Eq. (8.5).
A pressure P produces a decrease in length, ∆l,
P = C l₀ Δl (8.141)
where C is a constant, so that the work done is
$$W = 3CV_0 \int_0^{\Delta l} x\,dx = \frac{3}{2}\,C V_0 (\Delta l)^2 \tag{8.142}$$
where V₀ is the volume. This causes a change in energy given by
$$\Delta E = N[E(R_0 - \Delta R) - E(R_0)] \approx \frac{N}{2}\left.\frac{\partial^2 E}{\partial R^2}\right|_{R = R_0} (\Delta R)^2 \tag{8.143}$$
N being the number of ion pairs, and where the fact that E is a minimum
at R₀ has been used. Equating W and ΔE, with ΔR/R₀ = Δl/l₀, gives
$$C = \frac{N}{3V_0}\left(\frac{R_0}{l_0}\right)^2 \left.\frac{\partial^2 E}{\partial R^2}\right|_{R = R_0} \tag{8.144}$$
Therefore, the bulk modulus is
$$K = -\frac{P}{\Delta V/V_0} = \frac{1}{9}\left(\frac{N}{V_0}\right) R_0^2 \left.\frac{\partial^2 E}{\partial R^2}\right|_{R = R_0} \tag{8.145}$$
where N/V₀ is the number of ion pairs/unit volume, and E is given in Eq. (8.3)
with n ≈ 10. For NaCl, the above expression gives an estimate of about
3.5 × 10¹⁰ J/m³ (experimentally it is about 3 × 10¹⁰ J/m³).
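A rough numerical version of Eq. (8.145) is sketched below. Several inputs are assumptions and are not taken from this example: the energy per ion pair is taken to have the Born form E(R) = −A/R + B/Rⁿ with A = αe²/4πε₀, the Madelung constant is taken as 1.7476, R₀ = 2.82 Å is borrowed from Problem 1, and the rock-salt structure is assumed to give one ion pair per volume 2R₀³. The result should therefore be read as an order-of-magnitude check only.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m

def born_bulk_modulus(r0, n, madelung):
    """Bulk modulus from Eq. (8.145) for an assumed Born-type pair energy.

    Assumed form: E(R) = -A/R + B/R**n with A = madelung * e^2 / (4 pi eps0).
    The condition dE/dR = 0 at R0 fixes B, which gives
    d2E/dR2 at R0 equal to (n - 1) * A / R0**3.
    """
    a_coulomb = madelung * E_CHARGE**2 / (4.0 * math.pi * EPS0)
    second_derivative = (n - 1) * a_coulomb / r0**3
    pairs_per_volume = 1.0 / (2.0 * r0**3)   # rock salt: one ion per R0^3
    return pairs_per_volume * r0**2 * second_derivative / 9.0

# Assumed inputs for NaCl: R0 = 2.82 angstrom, n = 10, Madelung constant 1.7476.
k_nacl = born_bulk_modulus(2.82e-10, 10, 1.7476)
print("Estimated bulk modulus of NaCl in J/m^3:", k_nacl)
```

Under these assumptions the estimate is a few times 10¹⁰ J/m³, the same order as the value quoted in the text.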

Example 2
It can be shown that when a, b and c are mutually orthogonal, the distance
between the planes with Miller indices (h, k, l) is given by
$$d = \left(\frac{h^2}{a^2} + \frac{k^2}{b^2} + \frac{l^2}{c^2}\right)^{-1/2} \tag{8.146}$$
Let one of the planes have intercepts n₁a, n₂b, n₃c along the three axes
(where n₁, n₂, n₃ are integers). A translation by an integral multiple of a, or b, or
c, along the first, or the second, or the third axis, respectively, gives an equivalent
plane. It is found that the number of equivalent planes between the origin and
this plane is equal to the l.c.m. of n₁, n₂, and n₃, say N. On the other hand, the
Miller indices are
$$h = \frac{N}{n_1}, \quad k = \frac{N}{n_2}, \quad l = \frac{N}{n_3} \tag{8.147}$$
If D is the perpendicular distance of the plane with intercepts n₁a, n₂b, n₃c
from the origin, then
D = n₁a cos α = n₂b cos β = n₃c cos γ (8.148)
where α, β and γ are the angles made by the perpendicular line with the three
axes. Since
cos² α + cos² β + cos² γ = 1, (8.149)
$$D = \left(\frac{1}{n_1^2 a^2} + \frac{1}{n_2^2 b^2} + \frac{1}{n_3^2 c^2}\right)^{-1/2} \tag{8.150}$$
Using Eq. (8.147), the separation between two adjacent planes comes out
to be
$$d = D/N = \left(\frac{h^2}{a^2} + \frac{k^2}{b^2} + \frac{l^2}{c^2}\right)^{-1/2} \tag{8.151}$$
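The two routes to the spacing, through the intercepts as in Eqs. (8.147) and (8.150) and directly through the Miller indices as in Eq. (8.146), can be compared numerically. The sketch below does this for one arbitrary choice of intercepts and cell edges; the function names and the sample numbers are illustrative assumptions, and planes parallel to an axis (infinite intercept) are not handled.

```python
import math
from functools import reduce

def lcm(x, y):
    """Least common multiple of two positive integers."""
    return x * y // math.gcd(x, y)

def miller_indices(n1, n2, n3):
    """Miller indices (h, k, l) and N from integer intercepts, Eq. (8.147)."""
    big_n = reduce(lcm, (n1, n2, n3))
    return (big_n // n1, big_n // n2, big_n // n3), big_n

def spacing_from_intercepts(n1, n2, n3, a, b, c):
    """d = D / N with D taken from Eq. (8.150)."""
    _, big_n = miller_indices(n1, n2, n3)
    big_d = (1.0 / (n1 * a) ** 2 + 1.0 / (n2 * b) ** 2 + 1.0 / (n3 * c) ** 2) ** -0.5
    return big_d / big_n

def spacing_from_indices(h, k, l, a, b, c):
    """Interplanar spacing for orthogonal axes, Eq. (8.146)."""
    return ((h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2) ** -0.5

# Arbitrary illustrative cell (edges 3, 4, 5 in any length unit), intercepts 1a, 2b, 3c.
hkl, big_n = miller_indices(1, 2, 3)
d_intercepts = spacing_from_intercepts(1, 2, 3, 3.0, 4.0, 5.0)
d_indices = spacing_from_indices(*hkl, 3.0, 4.0, 5.0)
print(hkl, big_n, d_intercepts, d_indices)
```

Both routes give the same spacing, which is the content of Eq. (8.151).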

Example 3
The Hall effect provides a convenient method of determining the nature of charge
carriers and their mobility.
Consider a current flowing in the x-direction, through a thin sheet in the xy
plane. If a magnetic field Bz is applied to the current, the charge carriers are
deflected by the v × B force and build an electric field Ey in the y direction. In
the equilibrium condition
Ey + (v × B)y = 0 (8.152)
or
Ey = vx Bz (8.153)
Now the current density in the x-direction is Jx = nqvx, n being the carrier density and q
their charge, so that the Hall coefficient is
RH ≡ Ey/JxBz = 1/nq (8.154)
Thus, a measurement of Ey for a given Jx and Bz allows us to determine the
concentration of the carriers as well as the sign of their charge. From
Eq. (8.153) vx and hence the mobility of the carriers can also be determined as:
µ = vx/Ex (8.155)
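The chain of steps in Eqs. (8.153) to (8.155) can be sketched as a small routine. All numerical inputs below are hypothetical and chosen only so that the outcome is easy to follow: they correspond to electrons with a density of 10²² m⁻³ and a mobility of 0.1 m²/Vs. The routine also assumes that the carriers have charge of magnitude e.

```python
E_CHARGE = 1.602176634e-19   # elementary charge, C

def hall_analysis(e_y, j_x, b_z, e_x):
    """Return (R_H, carrier density, sign of charge, mobility).

    R_H = E_y / (J_x B_z) = 1/(n q)   Eq. (8.154)
    v_x = E_y / B_z                   Eq. (8.153)
    mobility = |v_x / E_x|            Eq. (8.155)
    Carriers are assumed to have charge of magnitude e.
    """
    r_h = e_y / (j_x * b_z)
    density = 1.0 / (abs(r_h) * E_CHARGE)
    sign = 1 if r_h > 0 else -1
    v_x = e_y / b_z
    mobility = abs(v_x / e_x)
    return r_h, density, sign, mobility

# Hypothetical measurement: E_x = 100 V/m, B_z = 0.5 T, E_y = -5 V/m,
# J_x = 1.602176634e4 A/m^2 (consistent with n = 1e22 m^-3 electrons).
r_h, density, sign, mobility = hall_analysis(-5.0, 1.602176634e4, 0.5, 100.0)
print(r_h, density, sign, mobility)
```

A negative Hall coefficient identifies the carriers as electrons, while a positive one would indicate holes.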

Example 4
A silicon crystal contains an arsenic concentration of 1.2 × 10²²/m³ and a boron
concentration of 6 × 10²¹/m³. What is the density of majority and minority carriers
at room temperature?
The electrons in the conduction band and in the acceptor levels, are from
the donor levels and the valence band:
nc + na = hd + hv (8.156)
This leads to
$$c_0 (m_e^* T)^{3/2} \exp[(\varepsilon_f - \varepsilon_c)/kT] + \frac{N_a}{\exp[(\varepsilon_a - \varepsilon_f)/kT] + 1} = c_0 (m_h^* T)^{3/2} \exp[(\varepsilon_v - \varepsilon_f)/kT] + \frac{N_d}{\exp[(\varepsilon_f - \varepsilon_d)/kT] + 1} \tag{8.157}$$
where c₀ = 2(2πk)^{3/2}/h³, Nd = 1.2 × 10²² m⁻³ and Na = 6 × 10²¹ m⁻³. Assuming
εf – εa >> kT and εd – εf >> kT, and neglecting the first term on the rhs, gives
$$\varepsilon_f \approx \varepsilon_c + kT \ln\left[\frac{N_d - N_a}{c_0 (m_e^* T)^{3/2}}\right] \tag{8.158}$$
Therefore nc ≈ Nd – Na and
$$h_v \approx \frac{1}{N_d - N_a}\, c_0^2 (m_e^* m_h^* T^2)^{3/2} \exp[-(\varepsilon_c - \varepsilon_v)/kT] \tag{8.159}$$
where at room temperature, me* ≈ 0.25 m, mh* ≈ 0.3 m, and for the given
concentrations,
εf – εc ≈ – 0.24 eV (8.160)
This result shows that the neglect of the first term on the rhs of Eq. (8.157)
is justified.

Example 5
The effective masses of the carriers are determined from cyclotron resonance
experiments. A resonance is observed in a Si crystal at 3 × 10¹⁰ Hz and a field of
0.4 T. What is the value of m*?
The resonance condition is
q v B = m* ω v (8.161)
which gives
m* ≈ 0.37 m (8.162)
where m is the electron mass.
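The arithmetic behind Eq. (8.162) is short enough to reproduce. The sketch assumes that the quoted 3 × 10¹⁰ Hz is the ordinary frequency, so that ω = 2πf, and that the carrier charge is e; the function name is illustrative.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
M_ELECTRON = 9.1093837e-31    # electron mass, kg

def cyclotron_effective_mass(frequency_hz, field_tesla):
    """m* = q B / omega from Eq. (8.161), with omega = 2 pi f and q = e."""
    omega = 2.0 * math.pi * frequency_hz
    return E_CHARGE * field_tesla / omega

m_star = cyclotron_effective_mass(3e10, 0.4)
mass_ratio = m_star / M_ELECTRON
print("m*/m =", mass_ratio)   # about 0.37
```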

Example 6
A germanium pn junction has 5 × 10²² phosphorus atoms/m³ in the n-side and
3 × 10²² gallium atoms/m³ in the p-side. What is the potential difference across
the junction at room temperature? If the current for a large reverse bias is
5 × 10⁻⁸ A, what is the current for a forward bias of 0.4 V?
Assuming complete ionization of donor atoms and occupation of acceptor
levels, one has the relations
c0 (me* T)^{3/2} exp [(εf – εc)/kT] = Nd (8.163)
c0 (mh* T)^{3/2} exp [(εv – εf′)/kT] = Na (8.164)
with Nd = 5 × 10²² m⁻³, Na = 3 × 10²² m⁻³, and me* ≈ mh* ≈ 0.1 m. This gives
εc – εf ≈ 0.072 eV and εf′ – εv ≈ 0.085 eV. Hence the potential difference across
the junction is
V0 ≈ 0.72 – 0.072 – 0.085 = 0.56 V (8.165)
Since a large reverse bias gives a current of 5 × 10⁻⁸ A, the forward bias
current is
I ≈ 5 × 10⁻⁸ (exp [eΔV/kT] – 1) A (8.166)
which for ΔV = 0.4 V gives I ≈ 0.24 A.
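The numbers in Eqs. (8.163) to (8.166) can be retraced with the sketch below. Two inputs are assumptions: room temperature is taken as T = 300 K in the prefactor c₀(m*T)^{3/2}, and kT is taken as 0.026 eV in the exponentials. The gap of 0.72 eV is the value appearing in Eq. (8.165). Function and variable names are illustrative.

```python
import math

H_PLANCK = 6.62607015e-34     # Planck constant, J.s
K_B = 1.380649e-23            # Boltzmann constant, J/K
M_ELECTRON = 9.1093837e-31    # electron mass, kg
KT_EV = 0.026                 # assumed thermal energy at room temperature, eV

def band_prefactor(m_eff, temperature):
    """c0 (m* T)^(3/2) = 2 (2 pi m* k T / h^2)^(3/2), in m^-3."""
    return 2.0 * (2.0 * math.pi * m_eff * K_B * temperature / H_PLANCK**2) ** 1.5

prefactor = band_prefactor(0.1 * M_ELECTRON, 300.0)   # T = 300 K assumed

n_donor = 5e22       # m^-3, n-side
n_acceptor = 3e22    # m^-3, p-side
gap_ev = 0.72        # value used in Eq. (8.165)

ec_minus_ef = KT_EV * math.log(prefactor / n_donor)       # Eq. (8.163)
ef_minus_ev = KT_EV * math.log(prefactor / n_acceptor)    # Eq. (8.164)
v0 = gap_ev - ec_minus_ef - ef_minus_ev                   # Eq. (8.165)

i_reverse = 5e-8                                          # A
i_forward = i_reverse * (math.exp(0.4 / KT_EV) - 1.0)     # Eq. (8.166)

print(ec_minus_ef, ef_minus_ev, v0, i_forward)
```

The forward current is very sensitive to the value taken for kT, so a slightly different room temperature shifts the last digit of the result.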

Example 7
The diamagnetic susceptibility of helium can be estimated from the approximate
helium wave function (see Example 5 of Sec. 5.8)
$$\psi = \left(\frac{1}{\pi a'^3}\right) \exp[-(r_1 + r_2)/a'] \tag{8.167}$$
where a′ = 4πε₀ħ²/me²Z′, Z′ = 27/16. The diamagnetic susceptibility is obtained
from Eq. (8.89) to be
$$\chi = -\frac{e^2 \mu_0 N}{m}\, a'^2 \tag{8.168}$$
which comes out to be 2.1 × 10⁻⁸/kg mole in magnitude (1.67 × 10⁻⁶/g mol in Gaussian units,
compared to the experimental value of 1.9 × 10⁻⁶/g mol).
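Eq. (8.168) can be evaluated directly, as in the sketch below. It takes N as the number of atoms per kg mole and converts to Gaussian units per g mol by dividing by 4π and rescaling m³ per kg mole to cm³ per g mol; this conversion and the variable names are assumptions spelled out here for illustration.

```python
import math

E_CHARGE = 1.602176634e-19     # elementary charge, C
M_ELECTRON = 9.1093837e-31     # electron mass, kg
HBAR = 1.054571817e-34         # reduced Planck constant, J.s
EPS0 = 8.8541878128e-12        # vacuum permittivity, F/m
MU0 = 1.25663706e-6            # vacuum permeability, H/m
N_PER_KMOL = 6.02214076e26     # atoms per kg mole

z_eff = 27.0 / 16.0
# a' = 4 pi eps0 hbar^2 / (m e^2 Z'), i.e. the Bohr radius divided by Z'.
a_prime = 4.0 * math.pi * EPS0 * HBAR**2 / (M_ELECTRON * E_CHARGE**2 * z_eff)

# Eq. (8.168): chi = -(e^2 mu0 N / m) a'^2, here in m^3 per kg mole.
chi_si = -E_CHARGE**2 * MU0 * N_PER_KMOL * a_prime**2 / M_ELECTRON

# Gaussian value per g mol: divide by 4 pi, then m^3/kmol -> cm^3/mol (factor 1e3).
chi_gaussian = chi_si / (4.0 * math.pi) * 1e3

print(a_prime, chi_si, chi_gaussian)
```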

Example 8
A ferromagnetic material with J = 3/2 and g = 2 has a transition temperature
Tc = 120 K. Calculate the internal field near 0 K. What is the ratio of magnetization
at 300 K for B = 5 × 10⁻³ T compared to that at 0 K?
From the expression for Tc in Eq. (8.111), and
$$B_{int} = \lambda N \left(\frac{e g \hbar}{2m}\right) J \quad \text{at 0 K,}$$
one has
$$B_{int} = \frac{6 m k T_c}{e \hbar g (J + 1)} \tag{8.169}$$
which is about 108 T. The ratio of magnetization at 300 K to that at 0 K, is
$$R = \frac{B_0 (J + 1)}{3k(T - T_c)} \left(\frac{e g \hbar}{2m}\right) = 3.1 \times 10^{-5} \tag{8.170}$$
which illustrates the fact that paramagnetic effects are, in general, much smaller
than ferromagnetic effects.
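The two numbers in this example follow from Eqs. (8.169) and (8.170) with standard constants. The sketch below evaluates them; the variable names are illustrative, and the formulas are taken with the factor ħ written explicitly in the magnetic moment egħ/2m.

```python
E_CHARGE = 1.602176634e-19     # elementary charge, C
M_ELECTRON = 9.1093837e-31     # electron mass, kg
HBAR = 1.054571817e-34         # reduced Planck constant, J.s
K_B = 1.380649e-23             # Boltzmann constant, J/K

j_value = 1.5       # J = 3/2
g_factor = 2.0
t_curie = 120.0     # K
t_room = 300.0      # K
b_applied = 5e-3    # T

# Eq. (8.169): internal field near 0 K.
b_internal = 6.0 * M_ELECTRON * K_B * t_curie / (
    E_CHARGE * HBAR * g_factor * (j_value + 1.0))

# Eq. (8.170): magnetization at 300 K in the applied field relative to 0 K.
moment = E_CHARGE * g_factor * HBAR / (2.0 * M_ELECTRON)    # e g hbar / 2m
ratio = b_applied * (j_value + 1.0) * moment / (3.0 * K_B * (t_room - t_curie))

print("internal field in T:", b_internal)      # a little over 100 T
print("magnetization ratio:", ratio)           # about 3.1e-5
```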

Example 9
When a photon is incident on a material with an energy gap ∆E, an electron in
the valence band may absorb this radiation and go to the conduction band if
hν > ΔE. The kinetic energy of the electron is given by
(1/2) mv² = hν – ΔE – (εv – ε) (8.171)
where ε is the initial energy of the valence electron, the maximum energy being
observed for ε = εv. Thus, a rapid increase is observed in the absorptivity of
radiation as ν increases through the value of ν = ΔE/h. This property is used to
determine the energy gap (the experiments are usually done at low temperatures
to reduce thermal effects). Since ∆E ~ 1 eV for semiconductors, they are
essentially transparent to infrared radiation but absorb most of the radiation in
the optical region.
The excited electrons are de-excited either immediately, in general emitting
radiation (fluorescence) of a different frequency than that of the original photon,
or wander around in the crystal until they are trapped at the luminescent centers
(usually produced by adding chemical impurities, e.g., zinc sulphide). The
trapped electrons then are de-excited, with the emission of radiation after a
time delay (phosphorescence). Fluorescence and phosphorescence are together
known as luminescence and are used in fluorescent lamps, television picture
tubes, etc.

PROBLEMS
1. It requires an energy of 5.14 eV to remove the valence electron of Na and
an energy of 3.80 eV is released when an electron is added to Cl. Assuming
a value of n = 10 in Eq. (8.3), and an interatomic spacing of 2.82 Å, obtain
the cohesive energy/ion pair, and the repulsive energy.
2. Show that the Madelung constant for an infinite array of alternating positive
and negative charges in 1-dimension is α1 = 2 ln 2. Show that the
expressions in 2- and 3-dimensions are:
$$\alpha_2 = 2\alpha_1 - 4 \sum_{n, m = 1}^{\infty} \frac{(-1)^{n + m}}{(n^2 + m^2)^{1/2}}$$
$$\alpha_3 = 3\alpha_2 - 3\alpha_1 - 8 \sum_{l, m, n = 1}^{\infty} \frac{(-1)^{l + m + n}}{(l^2 + m^2 + n^2)^{1/2}}.$$

3. If the repulsive force is of the form C exp(–r/a), determine C and a for NaCl
if the cohesive energy/ion pair is 6.61 eV, and the interatomic separation
is 2.82 Å.
4. It is observed that x-rays of wavelength 1.2 Å produce a first order
maximum at a Bragg angle of 12.3° when reflected by the (1, 0, 0) planes
of NaCl (which has fcc structure as in Fig. 8.1). If the density of NaCl is
2.165 g/cm³ and its molecular weight is 58.454, obtain the value of
Avogadro’s number.
5. For a cubic crystal of unit length 10⁻¹⁰ m, at what angles will the first
order maxima be observed for (1, 1, 1), (1, 1, 0) and (1, 0, 0) planes? The
incident x-ray has a wavelength of 1 Å. Will the second order maxima be
observed? Will the (2, 1, 0) planes produce maxima in this case?
6. Hard spheres of radius R are arranged in contact in simple cubic, bcc and
fcc structures. Find the radius of the largest sphere that can fit into the
largest interstices of these structures.
7. Iron undergoes a phase transition from bcc (at lower temperature) to fcc
at 1180 K. If there is no change in the density, show that the
nearest neighbour separation increases by a factor of about 1.029.

8. If an element contains both electron and hole carriers, show that the Hall
coefficient is given by
$$R_H = \frac{n_h \mu_h^2 - n_e \mu_e^2}{e\,(n_h \mu_h + n_e \mu_e)^2}$$
A material has 10²¹ electrons/m³ and 5 × 10²⁰ holes/m³. If µe = 0.05 and
µh = 0.07 in MKS units, evaluate the conductivity and the Hall coefficient
of the material.
9. Hall coefficient of Al is – 0.3 × 10⁻¹⁰ MKS units. How many conduction
electrons does each atom contribute?
10. The conductivity of germanium is 0.7 Ω⁻¹m⁻¹ at 0 °C and 2 Ω⁻¹m⁻¹ at 20 °C.
What is the energy gap for germanium?
11. A measurement of 0.1% change in resistivity is possible in a silicon crystal.
What is the sensitivity at room temperature of such a crystal used as a
thermistor?
12. What is the conductivity at room temperature of (i) pure silicon (ii) silicon
containing 10⁻⁵% of phosphorus? Mobility of electrons is 0.14 m²/Vs, that
of holes is 0.05 m²/Vs and the number of charge carriers in pure silicon is
about 2 × 10¹⁶ m⁻³, at room temperature.
13. A current of 10⁻⁴ A flows when a forward bias of 0.2 V is applied at room
temperature. Obtain the currents if (i) forward bias of 0.4 V (ii) reverse
bias of 1 V, are applied.
14. Show that the width of the depletion region at a pn junction, when a forward
bias of V is applied, is given by
$$x = \left[\frac{2\kappa\varepsilon_0 (n_e + n_h)(V_0 - V)}{e\, n_e n_h}\right]^{1/2}$$
where V0 is the equilibrium potential difference across the junction. What
is the width for a junction with ne = 10²² m⁻³, nh = 10²² m⁻³, κ = 14, V0 = 0.7 V,
if a reverse bias of 2 V is applied?
15. Show that the product of electron and hole carriers is
ne nh = c0² (mh* me* T²)^{3/2} exp [(εv – εc)/kT],
independent of the concentration of the impurity. Thus, if an n-type of
impurity is introduced, the number of holes decreases.
16. A silicon semiconductor is doped with 9 × 10²¹ donors/m³ and 4 × 10²¹
acceptors/m³. If the electron mobility is 0.15 m²V⁻¹s⁻¹, estimate the
resistivity at room temperature.
17. Neglecting the number of holes in the valence band of an n-type
semiconductor, show that
$$\varepsilon_f \approx \varepsilon_d + kT \ln\left[\left(c_1 + \frac{1}{4}\right)^{1/2} - \frac{1}{2}\right]$$
where
$$c_1 = \frac{N_d h^3}{2(2\pi k m_e^* T)^{3/2}} \exp[(\varepsilon_c - \varepsilon_d)/kT].$$
This expression gives the correct expression at T → 0 as well as at room temperature.
18. An electron does not tunnel across the depletion region if the width of the
layer is more than 10⁻⁸ m. What is the minimum doping needed for a
silicon tunnel-diode to operate? Take ne ≈ nh for the two impurities, κ ≈ 12
for Si.
19. A silicon pn junction has 10²³ gallium atoms/m³ and 10²² arsenic
atoms/m³. What is the approximate potential difference across the junction?
20. The main contribution to paramagnetism of copper sulphate comes from
the copper ions which have spin 1/2. Show that its magnetization is given
by
$$M = N \left(\frac{e\hbar}{2m}\right) \tanh\left(\frac{e\hbar B}{2mkT}\right).$$
21. For a substance containing paramagnetic ions with S = 1/2 and orbital
angular momentum quenched, derive an expression for the energy and
specific heat of the substance. Discuss its high- and low-temperature limits.
22. A magnetic field is applied to a salt containing Cu²⁺ ions. Given that
Cu²⁺ has nine 3d electrons, determine the field at which 90% of the ions
are in the ground state at 1 K.
23. Show that the magnetization in a ferromagnet tends to a value (use
Eq. (8.99) with a = egħλM/2m)
$$M \to \frac{N e g \hbar J}{2m}\left[1 - \frac{1}{J}\exp\left(-\left(\frac{e g \hbar}{2m}\right)^2 \frac{N g \lambda}{kT}\right)\right]$$
for T → 0 K. However, a more sophisticated calculation in terms of spin
waves and magnons shows that the second term vanishes as T^{3/2} rather
than as an exponential.
9
The Nucleus

Structures of the Chapter


9.1 Properties of the nucleus
9.2 Nuclear forces
9.3 Models of the nucleus
9.4 Weizsacker’s mass formula
9.5 Nuclear stability
9.6 Nuclear reactions
9.7 Fission reactors
9.8 Thermonuclear fusion
9.9 Examples
Problems

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 317
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_9
Rutherford’s experiment (1911) indicated the existence of a heavy, positively
charged nucleus of very small dimensions near the centre of the atoms. It was
found that the scattering of α-particles, i.e. ionized helium atoms, by a thin
metal foil could be described by assigning a charge of Ze to this nucleus.
However, it was also noted that when the distance of closest approach was less
than 10⁻¹⁴ m, the scattering showed deviations from Coulomb scattering,
indicating the extension of the nucleus over a distance of 10⁻¹⁴ m. Indeed, the
description of the details requires the existence of a new, short-range interaction
known as the strong interaction. The properties of the nuclei and their interactions
are quite different from those of an atom, and will be briefly discussed here.
Some of the important properties of the nucleus, such as its mass, size,
magnetic moment, etc. will be discussed first. This will be followed by an
analysis of nuclear forces and different models of the nucleus. Finally, stability
criteria, nuclear reactions and fission and fusion processes will be discussed. At
this stage, it may be mentioned that while modern experimental techniques
have provided quite detailed information about nuclear properties, an entirely
satisfactory framework for the quantitative prediction of these properties has
not been formulated. This is primarily because (i) nuclear forces appear to be
structurally much more complicated than electromagnetic forces, and (ii) the
strength of nuclear forces is quite large which means that perturbative methods
cannot be used for calculation.

9.1 PROPERTIES OF THE NUCLEUS


Nuclear properties are most simply described in terms of the nuclear constituents.

Nuclear Constituents
The nucleus is made up of protons and neutrons. The proton is the nucleus of
the simplest atom, the hydrogen atom. It has a rest energy of 938.256 MeV or a
mass of 1.0072766 m_u (1 atomic mass unit, m_u, is equal to 1/12 of the ¹²C mass
and corresponds to 931.478 MeV), a positive charge of e and spin ħ/2. The
neutron has a rest energy of 939.550 MeV or a mass of 1.0086654 m_u, zero net
charge and spin ħ/2. Since protons and neutrons have half-integral spin, they
are fermions and satisfy Fermi-Dirac statistics. The near-equality of the neutron
and proton masses is an important property and it has a bearing on nuclear
interactions.
Both the proton and the neutron have magnetic moments given by
$$\mu_p = g_p \frac{e}{m_p}\, s \tag{9.1}$$
$$\mu_n = -g_n \frac{e}{m_p}\, s$$
where gp = 2.793, gn = 1.913. A Dirac particle without any structure would be
expected to have gp = 1 and gn = 0. The observed values for the magnetic moments
suggest a complicated structure for the proton and the neutron. It may be noted
that these magnetic moments are smaller than the electronic magnetic moments
by a factor of about me/mp.
While the proton is stable (some recent theories however, predict that the
proton decays with a lifetime of about 10³¹ – 10³² years), the neutron decays,
n → p + e⁻ + ν̄ (9.2)
where ν̄ is the antineutrino (the neutrino ν and its antiparticle, the antineutrino ν̄, are
zero-mass, spin 1/2, neutral particles, see Sec. 10.1), with a half-life of about 11
minutes. The decay process is known as the neutron β-decay. Protons and
neutrons are together known as nucleons.

Binding Energies
Since nuclear forces are strong, nuclear binding energies are a significant fraction
of nuclear masses. Thus, nuclear binding energies can be obtained from the
masses.
The binding energy Eb of a nucleus of mass mA, containing Z protons and
(A – Z) neutrons, is
Eb = c2 [Zmp + (A – Z)mn – mA] (9.3)
This is the minimum energy required to separate the nucleus into its
constituent nucleons. A nucleus with Z protons and A nucleons is said to have a
charge or atomic number Z and mass number A, and is designated by $^{A}_{Z}\mathrm{X}$
where X stands for the chemical symbol of the nucleus (e.g. one has $^{1}_{1}\mathrm{H}$, $^{2}_{1}\mathrm{H}$, $^{3}_{1}\mathrm{H}$
nuclei with 0, 1, 2 neutrons respectively). Nuclei with the same number of
protons but different number of neutrons are known as isotopes, e.g. $^{1}_{1}\mathrm{H}$, $^{2}_{1}\mathrm{H}$ etc.
or $^{16}_{8}\mathrm{O}$, $^{17}_{8}\mathrm{O}$ etc. Nuclei with the same number of neutrons but different number
of protons are called isotones, e.g. $^{4}_{2}\mathrm{He}$, $^{5}_{3}\mathrm{Li}$, and nuclei with the same A but
different Z are called isobars, e.g. $^{5}_{2}\mathrm{He}$, $^{5}_{3}\mathrm{Li}$. One also has some nuclei which
are excited states of a stable nucleus, but which have a very long lifetime (say
τ > 0.1 s). These are called isomers.
A very useful concept in nuclear physics is the binding energy per nucleon,
Eb/A. It starts from a very low value of Eb/A ≈ 1.1 MeV for the deuteron
(Eb ≈ 2.225 MeV), rapidly increases to 7.1 MeV for the α-particle, i.e. $^{4}_{2}\mathrm{He}$, and
reaches a peak value of about 8.7 MeV for A ≈ 56. For nuclei with larger A, it

decreases slowly reaching a value of about 7.5 MeV for the heaviest natural
element, uranium. The general dependence of the binding energy per nucleon
on the mass number is shown in Fig. 9.1, for the stable nuclei. Two important
results follow from the general behaviour of Eb/A: Energy can be released (i) in
the fission of a heavy nucleus into lighter nuclei, and (ii) in the fusion of lighter
nuclei into a heavier nucleus. For example, a nucleus with A = 220 (Eb/A ≈ 7.5
MeV), breaking into two nuclei with A = 110 each (Eb/A ≈ 8.5 MeV) will liberate
an energy of about 220 × (8.5 – 7.5) = 220 MeV. Similarly two $^{2}_{1}\mathrm{H}$
nuclei (Eb/A ≈ 1.1 MeV) can combine into a $^{4}_{2}\mathrm{He}$ nucleus (Eb/A ≈ 7.1 MeV) to
liberate an energy of about 4 × (7.1 – 1.1) = 24 MeV. These energies are very
large compared to the few electron volts released in chemical reactions which
are governed by electromagnetic forces.
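Eq. (9.3) and the two energy-release estimates above can be retraced numerically. The proton and neutron masses and the conversion factor are the values quoted in the text. The nuclear masses of the deuteron (2.013553 m_u) and of the α-particle (4.001506 m_u) are not given in the text and are assumed here for illustration.

```python
M_PROTON = 1.0072766       # proton mass in atomic mass units (from the text)
M_NEUTRON = 1.0086654      # neutron mass in atomic mass units (from the text)
MEV_PER_AMU = 931.478      # energy equivalent of 1 atomic mass unit (from the text)

def binding_energy(z, a, nuclear_mass):
    """Eq. (9.3): E_b = c^2 [Z m_p + (A - Z) m_n - m_A], returned in MeV."""
    mass_defect = z * M_PROTON + (a - z) * M_NEUTRON - nuclear_mass
    return mass_defect * MEV_PER_AMU

# Assumed nuclear masses (not taken from the text), in atomic mass units.
eb_deuteron = binding_energy(1, 2, 2.013553)
eb_alpha = binding_energy(2, 4, 4.001506)

# Energy release estimated from the change in binding energy per nucleon.
fission_release = 220 * (8.5 - 7.5)    # A = 220 splitting into two A = 110 nuclei
fusion_release = 4 * (7.1 - 1.1)       # two deuterons fusing into an alpha particle

print(eb_deuteron, eb_alpha / 4, fission_release, fusion_release)
```

With these assumed masses the deuteron and α-particle values land close to the 2.225 MeV and 7.1 MeV per nucleon quoted in the text.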

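[Figure: binding energy per nucleon Eb/A (vertical axis, 0 to 10) plotted against the mass number A (horizontal axis, 0 to 240), with ²H, He, Ni, Mo, ¹⁴⁴Nd and ²⁰⁸Pb marked on the curve.]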

Fig. 9.1 The general behaviour of the binding energy per nucleon as a function of A.

Though the fission and fusion processes leading to nuclei with A ≈ 55 are
feasible, it is observed that most of the nuclei are stable. The reason for this is
that before a heavy nucleus breaks up, the components must go through an
intermediate state with higher energy than the ground state (this can be induced
by providing extra energy available in the capture of a neutron). Similarly, lighter
nuclei encounter a higher energy intermediate state with large Coulomb
repulsion, before they can combine (the fusion can take place at high temperature,
e.g. in stars).

Unstable Nuclei
If a lower energy state is available to a nucleus, it will, in general, be unstable
and decay with the emission of a photon, or an α-particle, or some leptons (i.e.
e, ν etc., see Sec. 10.1), provided the basic conservation laws, such as conservation
of energy, momentum, charge, etc. allow the decay. The neutron β-decay in
Eq. (9.2) is one such example. These decays are characterized by the lifetime τ
of the nucleus (the lifetime was discussed in Sec. 6.4), such that the number of
undecayed nuclei N(t) is given by
N(t) = N(0) exp (– t/τ) (9.4)
where N(0) is the number of nuclei at time t = 0. The lifetimes of the nuclei vary
from the unmeasurably small values of τ < 10⁻⁶ s (they may be indirectly
estimated), to the very large values of τ ~ 10100 years.
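The decay law of Eq. (9.4) is easy to explore numerically. The sketch below converts a half-life into the lifetime τ through τ = t½/ln 2, checks that half of the nuclei survive after one half-life, and expresses an activity N/τ in curies. The neutron half-life of 11 minutes is the value quoted earlier in the text; the sample of 10¹⁵ neutrons is hypothetical.

```python
import math

CURIE = 3.7e10    # disintegrations per second in one curie

def mean_life(half_life):
    """Lifetime tau from the half-life: N(t_half) = N(0)/2 gives tau = t_half / ln 2."""
    return half_life / math.log(2.0)

def surviving(n0, t, tau):
    """Eq. (9.4): number of undecayed nuclei after time t."""
    return n0 * math.exp(-t / tau)

def activity_in_curies(n, tau):
    """Decay rate N / tau expressed in curies."""
    return n / tau / CURIE

tau_neutron = mean_life(11 * 60.0)                       # seconds
half_fraction = surviving(1.0, 11 * 60.0, tau_neutron)   # fraction left after one half-life
sample_activity = activity_in_curies(1e15, tau_neutron)  # hypothetical sample of 1e15 neutrons

print(tau_neutron, half_fraction, sample_activity)
```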
The important classes of nuclear decays are the following:
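1. α-decay: It may be described by the process
$$^{A}_{Z}\mathrm{X} \to {}^{A-4}_{Z-2}\mathrm{Y} + {}^{4}_{2}\mathrm{He} \tag{9.5}$$
an example of which is
$$^{238}_{92}\mathrm{U} \to {}^{234}_{90}\mathrm{Th} + {}^{4}_{2}\mathrm{He} \tag{9.6}$$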
2. γ-decay: In a γ-decay, an excited nucleus undergoes transition to a lower
energy state by the emission of a photon. This may be represented by
X* → X + γ (9.7)
where X* is the excited state. The photon energies in nuclear transitions
are of the order of an MeV compared with the few eV in atomic transitions,
and the corresponding lifetimes are of the order of 10⁻¹⁴ s (compared to
t ~ 10⁻¹⁵/10⁸ = 10⁻²³ s required for a relativistic particle to traverse a
nucleus).
3. β-decay: These processes involve electrons and neutrinos, and are
exemplified by
$$^{A}_{Z}\mathrm{X} \to {}^{A}_{Z+1}\mathrm{Y} + e^{-} + \bar{\nu} \tag{9.8}$$
$$^{A}_{Z}\mathrm{X} \to {}^{A}_{Z-1}\mathrm{Y} + e^{+} + \nu \tag{9.9}$$
$$^{A}_{Z}\mathrm{X} + e^{-} \to {}^{A}_{Z-1}\mathrm{Y} + \nu \tag{9.10}$$
where e⁻ and e⁺ are the electron and the positron, and ν and ν̄ are the
neutrino and the antineutrino. In the electron-capture process [Eq. (9.10)],
the absorbed electron is usually from the atomic shells.
The activity of the unstable nuclei, known as radioactivity, is measured in
terms of the curie which corresponds to 3.7 × 10¹⁰ disintegrations/s.

Nuclear Radius
The wave-function description of a particle does not provide an unambiguous
description of the size of a particle. However, since nuclear forces are large
only within a distance of a few fermis (1 fermi = 10⁻¹⁵ m), it is useful to consider
the size of the nucleus. The nuclear radius may be estimated from the scattering
of neutrons and electrons by the nucleus, or by analysing the effect of the finite
size of the nucleus on nuclear and atomic binding energies.
Fast neutrons of about 100 MeV energy, whose wavelength is small
compared to the size of the nucleus, are scattered by nuclear targets. The fraction
of neutrons scattered at various angles can be used to deduce the nuclear size.
For example, in the scattering of a high energy particle by a hard sphere, V = ∞
for r < R, V = 0 for r > R, all the incident particles within a cross-sectional area
of 2πR² are scattered. The factor of 2 is due to the diffraction of the waves at the
edges. The results of these experiments indicate that the radius of a nucleus is
given by
R ≈ r0 A^{1/3} (9.11)
where A is the mass number and r0 ≈ 1.3 – 1.4 fm. The scattering can be done
with proton beams as well. In this case, however, the effects due to Coulomb
interaction have to be separated out. The observations are in agreement with
the result in Eq. (9.11) with r0 ≈ 1.3 – 1.4 fm.
The scattering of fast electrons of energy as high as 10⁴ MeV, with a
wavelength of about 0.1 fm, has the advantage that it can directly measure the
charge density inside a nucleus. The results of the experiment are in agreement
with Eq. (9.11) but with a somewhat smaller value of r0 ≈ 1.2 fm. The slight
difference in the value of r0 may be ascribed to the fact that the electron scattering
measures the charge density whereas the neutron and proton scattering
experiments measure the region of large nuclear potential, which may be
expected to be somewhat larger than the size of the nucleus.
The finite size of the nucleus modifies the atomic potential (– Z/r) at short
distances. This gives rise to a small separation between the spectral lines of
atoms with the same Z value but different A values—this is known as isotope
shift. The shifts can be used to deduce the nuclear radius. The isotope shift is
much larger in muonic atoms (which have a muon in place of an electron) since
the radii of the muonic orbits are smaller than the electronic orbits by a factor of
about 200 (mµ ≈ 200 me). However, the accuracy of measurements in muonic
atoms is lower since the muons have a short lifetime, about 2 × 10⁻⁶ s. Finally,
the measurement of differences in the binding energies of mirror nuclei can
give an estimation of the nuclear radius. The mirror nuclei are nuclei which are
identical except that one proton is replaced by a neutron. They may be
characterized by $^{2Z+1}_{Z+1}\mathrm{X}$, $^{2Z+1}_{Z}\mathrm{Y}$. The difference between their binding energies
may be ascribed to the two different charges. A model calculation with assumed
charge distribution then provides an estimation for the nuclear radius. All these
approaches are in essential agreement with Eq. (9.11) with r0 ≈ 1.2 fm.
An important consequence of Eq. (9.11) is that the volume per nucleon is
the same for all nuclei:

$$V_1 = \frac{4\pi}{3}\,\frac{(r_0 A^{1/3})^3}{A} = \frac{4\pi}{3}\, r_0^3 \tag{9.12}$$
Thus, the nuclear density is the same for all nuclei. The result is in agreement
with what might be expected from the strong, short-range forces in nuclei.
Furthermore, it implies that the nuclear forces are independent of the charge of
the nucleons. This is known as charge independence of nuclear forces.
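Eqs. (9.11) and (9.12) can be illustrated with a few lines of code. The sketch takes r0 = 1.2 fm and approximates the nuclear mass by A atomic mass units, both of which are simplifying assumptions; the function names are illustrative.

```python
import math

R0_FM = 1.2                   # radius parameter r0 in fm (assumed value)
AMU_KG = 1.66053907e-27       # atomic mass unit in kg

def nuclear_radius_fm(mass_number):
    """Eq. (9.11): R = r0 A^(1/3), in fermis."""
    return R0_FM * mass_number ** (1.0 / 3.0)

def nuclear_density(mass_number):
    """Mass density in kg/m^3, taking the nuclear mass as A atomic mass units."""
    radius_m = nuclear_radius_fm(mass_number) * 1e-15
    volume = 4.0 * math.pi / 3.0 * radius_m**3
    return mass_number * AMU_KG / volume

radius_pb = nuclear_radius_fm(208)     # about 7.1 fm
density_o = nuclear_density(16)
density_pb = nuclear_density(208)

print(radius_pb, density_o, density_pb)
```

The density comes out the same for both nuclei, of the order of 10¹⁷ kg/m³, as Eq. (9.12) requires.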

Angular Momentum and Magnetic Moment


The total angular momentum of nuclei is made up of the spins and orbital angular
momenta of the constituent nucleons. Associated with the angular momentum
is a magnetic moment.
The angular momentum of the nuclei can be deduced from the hyperfine
interaction between the magnetic moments of the nuclei and of the electrons.
This interaction is of the form
H = A I.J (9.13)
with I and J being the angular momenta of the nucleus and of the electrons
respectively. The atomic states are characterized by the total angular momentum
F=I+J (9.14)
and the corresponding quantum number takes on the values
F = J + I, J + I – 1, ..., | J – I | (9.15)
The shift in the energy of these states is given by
$$\Delta E = \frac{1}{2} A \hbar^2 \left[F(F + 1) - I(I + 1) - J(J + 1)\right] \tag{9.16}$$
which leads to a separation of
$$E_F - E_{F-1} = A\hbar^2 (I + J),\ A\hbar^2 (I + J - 1),\ \ldots,\ A\hbar^2 (|I - J| + 1) \tag{9.17}$$
between the successive states in Eq. (9.15). The analysis of the spectral lines
corresponding to these levels gives the value of I (and also of J).
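The interval rule of Eq. (9.17) follows from Eq. (9.16) by subtraction, and can be checked with exact fractions. The sketch below lists the allowed F values of Eq. (9.15) and the level shifts in units of Aħ². The choice I = 3/2, J = 3/2 is an arbitrary illustration.

```python
from fractions import Fraction

def f_values(i, j):
    """Allowed F from Eq. (9.15): I + J, I + J - 1, ..., |I - J|."""
    values = []
    f = i + j
    while f >= abs(i - j):
        values.append(f)
        f -= 1
    return values

def shift(f, i, j):
    """Eq. (9.16) in units of A hbar^2."""
    return Fraction(1, 2) * (f * (f + 1) - i * (i + 1) - j * (j + 1))

i_nuc = Fraction(3, 2)    # illustrative nuclear angular momentum
j_el = Fraction(3, 2)     # illustrative electronic angular momentum

levels = f_values(i_nuc, j_el)
# Separation between successive levels, E_F - E_(F-1), in units of A hbar^2.
intervals = [shift(f, i_nuc, j_el) - shift(f - 1, i_nuc, j_el) for f in levels[:-1]]

print(levels, intervals)
```

Each interval equals the larger F of the pair, so the largest interval gives I + J and the number of levels gives the smaller of 2I + 1 and 2J + 1.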
The spin of the nuclei can also be determined from the spectra of
homonuclear molecules. It was observed in Chapter 5 that the transitions in
para and ortho modifications of a homonuclear molecule, have intensities in
the ratio of I/(I + 1) so that the rotational band will show alternating intensity. A
measurement of these intensities allows the determination of the angular
momentum I of the nucleus especially for small I values. It is worth noting that
this method depends on the exchange symmetries of the wave functions and
not on the magnetic moment associated with the nucleus.
The magnetic moment of a nucleus is associated with its angular
momentum I, and may be expressed (at least for the purpose of calculating the
expectation values within multiplets) as
$$\mu = g\,\frac{e}{m_p}\, I \tag{9.18}$$
The gyromagnetic ratio g can be obtained from the nuclear magnetic
resonance experiments using atomic beams. From the resonance frequency
$$\omega = g\,\frac{e}{m_p}\, B \tag{9.19}$$
g and hence the magnetic moment is obtained.
Empirically, it is observed that even A and even Z nuclei have zero angular
momentum and zero magnetic moment, even A and odd Z nuclei have integral
angular momentum, and odd A nuclei have half-integral angular momentum.
The angular momenta of nuclei are generally found to be small. These
observations suggest that the angular momenta of protons, as also of neutrons,
separately compensate one another. This has a bearing on the validity of nuclear
models.

Electric Quadrupole Moment


A nucleus is usually non-spherical (though in a few cases it may be spherical).
The distortion which is along the axis of rotation is expressed in terms of the
electric quadrupole moment Q,

$$Q = \frac{1}{e} \int (3z^2 - r^2)\, \rho(\mathbf{r})\, dV \tag{9.20}$$

where ρ(r) is the charge density distribution in the nucleus. For a spherically
symmetric ρ(r), Q is zero whereas for an ellipsoidal (ellipsoid is obtained by
rotating an ellipse about one of its axes) distribution,

$$Q = \frac{2q}{5e}\,(a^2 - b^2) \tag{9.21}$$
where a is the semi-axis along the axis of rotation, b is the semi-axis
perpendicular to it, and q is the total charge
(Q > 0 implies an elongated or prolate nucleus and Q < 0 implies a flattened
or oblate nucleus).
The nuclear quadrupole moments are determined from their effect on the
hyperfine structure of the atomic spectra. The observed values of Q range from
Q = –10⁻²⁸ m² for ¹²³Sb to 8 × 10⁻²⁸ m² for ¹⁷⁶Lu, while the deuteron has a value of
Q = 2.74 × 10⁻³¹ m².

9.2 NUCLEAR FORCES


The forces that bind nucleons together into a nucleus are very strong forces as
indicated by the large binding energies, and have a very complicated structure.
These forces are described by what is known as strong interactions. Several
important characteristics of these forces follow from a general analysis of the
nuclear properties.
1. The nuclear forces are strong, their magnitude being roughly 100 times
that of electromagnetic forces. This follows from the large nuclear binding
energies.
2. The nuclear forces have a short range. They are dominant over a distance
of about 1 fm but vanish rapidly at distances greater than a few fermis.
This explains the approximate constancy of nuclear density as well as of
binding energy per nucleon. Roughly speaking, the short range of the
forces implies that each nucleon interacts with only a small number of
nearby nucleons.
3. Nuclear forces are independent of the nuclear charges. It is indeed a
striking property that the proton and the neutron have nearly the same
mass. It is convenient to ascribe to the nucleons an isotopic spin (or
isospin for short) τ = 1/2, the τ_z = +1/2 and –1/2 states corresponding to the
proton and the neutron respectively. The properties of the isospin are
similar to those of the ordinary spin, and the charge independence of the
nuclear forces is equivalently described by the statement that the nuclear
forces are independent of the orientation of the isotopic spin.
4. Nuclear forces are not central forces. In particular, they depend on the
orientation of the spin. This is forcibly demonstrated by the observation
of the deuteron as an S = 1 bound state of a proton and a neutron; no such
bound state is observed in the S = 0 state.
Some important aspects of the nuclear forces are described subsequently.

Yukawa Forces
One of the important modern ideas of forces is that forces between particles
arise from the exchange of particles. The form of the resulting potential can be
deduced from the following arguments.
The electromagnetic potential (only the scalar potential φ is considered
here) satisfies the equation
$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\phi = 0 \tag{9.22}$$
in free space. This equation may be regarded as arising from the relation
$\frac{1}{\hbar^2 c^2}(E^2 - c^2 p^2) = 0$ for the photon with zero mass, by the quantum
mechanical replacements in Eqs. (3.10) and (3.11). For a particle with nonzero
mass m, one may instead start with the relation

$$\frac{1}{\hbar^2 c^2}\,(E^2 - c^2 p^2 - m^2 c^4) = 0 \tag{9.23}$$
Implementing the quantum mechanical replacements in Eqs. (3.10) and
(3.11) and including a point source, the equation for the potential comes out to
be
$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{m^2 c^2}{\hbar^2}\right)\phi_m = g\,\delta(\mathbf{r}) \tag{9.24}$$
where g is a constant. The static solution to this equation is found to be
$$\phi_m = -\frac{g}{4\pi r}\, \exp[-r\,(mc/\hbar)] \tag{9.25}$$
which is the well known Yukawa potential. This potential has an approximate
range of r0 given by
$$r_0 = \hbar/mc \tag{9.26}$$
i.e. essentially the (reduced) Compton wavelength of the quantum of the field exchanged.
The potential decreases very rapidly for r >> r0. Yukawa argued that the nuclear
forces, which have a range of r0 ≈ 10⁻¹⁵ m, arise from the exchange of a particle
of mass m ≈ ħ/(r0 c), which for r0 ≈ 10⁻¹⁵ m comes out to be (expressed as rest
energy)
m ≈ 200 MeV (9.27)
The π-meson with a mass of about 140 MeV, would be a good candidate for
the quantum whose exchange gives rise to nuclear forces. Of course, there are
additional contributions to the interaction from the exchange of other, heavier
particles but with correspondingly shorter ranges [see Eq. (9.26)].
The short-range nature of the nuclear force is due to the rapidly-decreasing
exponential function. In contrast, the electromagnetic forces arising from the
exchange of zero-mass photons, have a long range.
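
The relation between range and exchanged mass in Eq. (9.26) is easy to evaluate. The following is a minimal sketch, assuming the standard value ħc ≈ 197.3 MeV fm; the function names are hypothetical and chosen only for this illustration.

```python
import math

HBAR_C = 197.327  # hbar*c in MeV fm (standard value, assumed here)


def exchange_rest_energy(range_fm):
    """Rest energy m*c^2 (MeV) of the quantum whose exchange gives range r0 (Eq. 9.26)."""
    return HBAR_C / range_fm


def yukawa_range(rest_energy_mev):
    """Range r0 = hbar/(m c) in fm for a quantum of the given rest energy."""
    return HBAR_C / rest_energy_mev


def yukawa_shape(r_fm, range_fm):
    """Dimensionless r-dependence exp(-r/r0)/r of the Yukawa potential (Eq. 9.25)."""
    return math.exp(-r_fm / range_fm) / r_fm


if __name__ == "__main__":
    print("rest energy for r0 = 1 fm:", exchange_rest_energy(1.0), "MeV")
    print("range for a 140 MeV pion :", yukawa_range(140.0), "fm")
```

For a massless quantum the exponential factor is absent and only the 1/r dependence remains, which is the long-range Coulomb case.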
Nucleon-Nucleon Interaction
The forces between two nucleons form the basis of nuclear interactions. An
important part of these forces is from the exchange of π-mesons.
A nucleon is continuously emitting and absorbing virtual π-mesons, and is
effectively surrounded by a cloud formed by them. These processes are regarded
as virtual since total energy and momentum conservation would forbid them,
but the uncertainty principle allows them to take place over short distances and
times. The mesons emitted by the nucleon may be absorbed by another nucleon
(Fig. 9.2). This gives rise to the forces between the nucleons.
The exchange of a charged meson [Fig. 9.2(b)] gives rise to charge-exchange
forces. They are observed, for example, when a beam of neutrons passes through
hydrogen. The exchange of charged mesons converts some of the neutrons into
protons which are then observed in the beam with almost the same energy and
momentum as the initial neutrons. Similar to the exchange of charges are
processes that lead to an exchange of spin, and exchange of charge and spin.
There are additional forces due to relativistic corrections, many-body forces, etc.
Clearly, the nuclear forces, even in a two-nucleon system, are very complicated.

Fig. 9.2 Diagrammatic representation of pion-exchange interaction:
(a) the exchanged π⁰ may travel in either direction,
(b) exchange of π⁺ and π⁻.

Though nuclear forces are complicated, their charge independence leads to
some simple relations. In particular,
$$V_{pp} \approx V_{nn}, \qquad V_{pp} \approx V_{np}\ \text{in the same state} \tag{9.28}$$
The second relation in Eq. (9.28) is valid only for the states which are allowed
by Fermi-Dirac statistics for the two protons, i.e. even l for S = 0 and odd l for
S = 1. The electromagnetic interaction will violate charge independence and
introduce small corrections to the above relations.
The model of a nucleon surrounded by a cloud of virtual π-mesons, provides
a qualitative explanation for the magnetic moments of the proton and the neutron.
In this picture, a neutron spends part of the time in the virtual (p + π⁻) state. In
this virtual state, the orbital motion of the π⁻ gives rise to a substantial negative
magnetic moment. Similarly, a proton spends part of the time in the virtual
(n + π⁺) state, in which the orbital motion of the π⁺ gives rise to a large
positive magnetic moment. These descriptions are in qualitative agreement
with the observed gyromagnetic ratios for the neutron and the proton [Eq. (9.1)].

Strength of Nuclear Interaction


The strength of nuclear interaction can be estimated from the binding energy of
the deuteron. The energy of the deuteron may be written as:
$$E = \frac{1}{2m_p}\, p_1^2 + \frac{1}{2m_n}\, p_2^2 + V \tag{9.29}$$
If the deuteron is regarded as a sphere of radius r0, the uncertainty principle
gives:

$$E \approx \frac{\hbar^2}{m_p r_0^2} + V \tag{9.30}$$
where V is the average potential energy. Taking r0 ≈ 1 fm and E ≈ – 2 MeV
(binding energy of the deuteron)
V ≈ – 40 MeV (9.31)
This may be compared with an electrostatic energy of about 1 MeV when
two protons are separated by a distance of about 1 fm. Nuclear interaction is
thus seen to be much stronger than electromagnetic interaction, which explains
the term strong interaction used to describe it.
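
The order-of-magnitude estimate of Eqs. (9.30) and (9.31) can be reproduced in a few lines. This is a rough sketch only, assuming ħc ≈ 197.3 MeV fm, a proton rest energy of 938.3 MeV and e²/4πε₀ ≈ 1.44 MeV fm; the function names are hypothetical.

```python
HBAR_C = 197.327      # hbar*c in MeV fm (assumed standard value)
M_P = 938.272         # proton rest energy in MeV (assumed standard value)
E2 = 1.44             # e^2/(4 pi eps0) in MeV fm (assumed standard value)


def kinetic_estimate(r0_fm):
    """Uncertainty-principle kinetic energy hbar^2/(m_p r0^2) in MeV (Eq. 9.30)."""
    return HBAR_C ** 2 / (M_P * r0_fm ** 2)


def average_potential(total_energy_mev, r0_fm):
    """Average potential energy V = E - hbar^2/(m_p r0^2) in MeV."""
    return total_energy_mev - kinetic_estimate(r0_fm)


def coulomb_energy(r_fm):
    """Electrostatic energy of two protons a distance r apart, in MeV."""
    return E2 / r_fm


if __name__ == "__main__":
    print("V       ~", average_potential(-2.0, 1.0), "MeV")
    print("Coulomb ~", coulomb_energy(1.0), "MeV")
```

With r0 = 1 fm the kinetic term alone is a little over 40 MeV, so V comes out close to the value quoted in Eq. (9.31) and is tens of times larger than the Coulomb energy at the same separation.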

9.3 MODELS OF THE NUCLEUS


An investigation of the properties of a nucleus with several nucleons is
immensely difficult both because of the complexity of two-nucleon forces and
the absence of a dominant central force. Therefore, model calculations, each of
which has a limited aim of investigating only certain aspects of the nuclear
properties, have to be done. Here, a few nuclear models which together provide
some understanding of the overall structure of the nucleus are considered.
Shell Model
In the shell model, nucleons are assumed to move independently of each other
in an average, centrally-symmetric potential. They occupy discrete energy levels
in this potential, taking the Pauli exclusion principle into account. The grouping
together of some of the energy levels gives rise to a shell structure of the
nucleus. The independent motion or equivalently the long, mean free path is
partially justified by the Pauli principle which forbids transitions to states that
are already occupied.
Experimentally, it is found that nuclei with the number of neutrons or protons
equal to,
Z or (A – Z) = 2, 8, 20, 28, 50, 82, 126 (9.32)
are especially stable. These numbers are known as magic numbers. The stability
is particularly pronounced for nuclei with both the number of protons and
neutrons equal to magic numbers, e.g. ⁴₂He, ¹⁶₈O, ⁴⁰₂₀Ca, ⁴⁸₂₀Ca, ²⁰⁸₈₂Pb. The stability
of ⁴₂He leads to its being the only composite nucleus emitted in radioactive
decays. It is also found that the 3rd, 9th, 21st, 29th, 51st, 83rd and 127th neutron
or proton is loosely bound (in fact ⁵₂He is unstable).
For describing the shell structure of nuclei, two of the potentials used are
the spherical-well potential and the simple harmonic oscillator potential. Since
the oscillator levels have already been deduced (Sec. 3.12, example 6), the
details for this case are given. The wave functions of the 3-dimensional oscillator
are products of the 1-dimensional wave functions and the energies are the sums
of the corresponding, 1-dimensional, equispaced energy levels:
$$E = \hbar\omega\,(n + 3/2), \qquad n = n_x + n_y + n_z \tag{9.33}$$
Now, corresponding to each n level, there are several degenerate states, e.g.
for n = 1, there are three states with nx or ny or nz equal to 1. These states
correspond to states with different angular momenta (n = 0, l = 0), (n = 1, l = 1),
(n = 2, l = 2, 0) etc. with the number of corresponding states (taking spin states
into account), being 2, 6, 12 etc. However, the actual potential cannot be
simulated by the oscillator potential at large distances. Since it goes to zero at
large distances, the larger angular momentum states are lowered with respect to
the smaller angular momentum states. The ordering of the energy levels with
the degeneracy removed is shown in Fig. 9.3. It should be noted that while
these levels explain the magic numbers 2, 8, 20 corresponding to closed shells,
they cannot explain the other magic numbers.
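
The counting of oscillator states described above can be made explicit. The sketch below lists the l values in each oscillator level n (l = n, n – 2, ..., down to 1 or 0), counts 2(2l + 1) states per l to include spin, and accumulates the totals at which successive levels are filled. The function names are hypothetical.

```python
def l_values(n):
    """Orbital angular momenta contained in oscillator level n: n, n-2, ..., 1 or 0."""
    return list(range(n, -1, -2))


def level_capacity(n):
    """Number of states in level n, counting 2 spin states for each (l, m)."""
    return sum(2 * (2 * l + 1) for l in l_values(n))


def closed_level_totals(n_max):
    """Cumulative nucleon numbers at which levels 0..n_max are completely filled."""
    totals = []
    running = 0
    for n in range(n_max + 1):
        running += level_capacity(n)
        totals.append(running)
    return totals


if __name__ == "__main__":
    for n in range(7):
        print(n, l_values(n), level_capacity(n))
    print(closed_level_totals(6))
```

The first three totals coincide with the magic numbers 2, 8 and 20; the later ones (40, 70, 112, 168) do not, which is the failure of the pure oscillator picture noted in the text.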
Fig. 9.3 Nuclear energy shells arising from the perturbed oscillator levels with
spin-orbit coupling. [Level diagram with four columns: oscillator levels n = 0 to 6;
perturbed oscillator levels (s, p, d, f, g, h, i); spin-orbit coupling (j = l ± 1/2);
shells closing at 2, 8, 20, 28, 50, 82 and 126.]
To account for the observed magic numbers, Mayer and Jensen (1949)
postulated a strong spin-orbit interaction. The postulated interaction is between
the spin and the orbital angular momentum of each nucleon, e.g. an interaction
of the form l · s, so that each l level splits into two levels j = l ± 1/2 except for
l = 0 for which j = 1/2, with the j = l + 1/2 state being lower. The resulting
energy levels clearly reproduce the shells corresponding to magic numbers
(Fig. 9.3).
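
A simple counting sketch shows how the spin-orbit splitting rearranges the shell closures. It rests on an assumption about the level ordering that is standard but is stated here explicitly: in each oscillator level n ≥ 3 the state with the largest j (j = n + 1/2, holding 2n + 2 nucleons) is lowered enough to leave its own level; for n = 3 it forms a shell by itself, and for n ≥ 4 it joins the remaining states of level n – 1. The function names are hypothetical.

```python
def level_capacity(n):
    """States in oscillator level n including spin: (n + 1)(n + 2)."""
    return (n + 1) * (n + 2)


def lowered_state_capacity(n):
    """Capacity 2j + 1 of the j = n + 1/2 state that spin-orbit coupling lowers."""
    return 2 * n + 2


def magic_numbers(n_max):
    """Shell closures under the assumed ordering described in the lead-in."""
    closures = []
    total = 0
    for n in range(n_max + 1):
        if n <= 2:
            # Light nuclei: whole oscillator levels form the shells.
            total += level_capacity(n)
        elif n == 3:
            # The lowered f7/2 state forms a shell on its own.
            total += lowered_state_capacity(3)
        else:
            # Rest of level n-1 plus the lowered state of level n.
            total += level_capacity(n - 1) - lowered_state_capacity(n - 1)
            total += lowered_state_capacity(n)
        closures.append(total)
    return closures


if __name__ == "__main__":
    print(magic_numbers(7))
```

Carried one step further (n_max = 7), the same counting gives 184, the neutron number often discussed for superheavy nuclei.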
Superheavy nuclei: The energy levels based on the average simple harmonic
or spherical-well potential provide only a general description of the shell
properties. For the finer properties, more detailed calculations have to be carried
out. Such calculations produce reordering of some of the energy levels shown
in Fig. 9.3, especially when the number of particles is large. In particular, there
are indications that 114 is a magic number for protons. There are also indications
that 184 is a magic number for neutrons. This has given rise to speculations
regarding the existence of long-lived superheavy nuclei with Z near 114 and
A – Z near 184. For example, it has been predicted that a nucleus with Z = 110
and A = 294 has a half life of about 10⁸ years.
Angular momenta: The shell model allows us to predict the angular
momenta of the nuclei. For example, the angular momentum of a closed shell is
predicted by the Pauli exclusion principle to be zero. If, further, it is postulated
that like nucleons in a shell pair off in such a way that their total angular
momentum is zero, it follows that (i) all even A, even Z nuclei have zero angular
momentum, (ii) the angular momentum of odd A nuclei is due to the odd nucleon.
These results are generally observed to be true with a few exceptions. Some of
the exceptions are ²³₁₁Na with j = 3/2 instead of 5/2, ⁵⁵₂₅Mn with j = 5/2 instead
of 7/2, and ⁷⁹₃₄Se with j = 7/2 instead of j = 9/2. In most other cases the predictions
are consistent with experimental observations, e.g. ²⁰⁹₈₃Bi has j = 9/2.
Magnetic moments: The magnetic moments of odd-A nuclei can be
estimated under the assumption that they are due to the odd nucleon (the magnetic
moments of even Z, even A nuclei are zero, while those of odd Z, even A nuclei
are difficult to analyse). The magnetic moment of the nucleon is both due to its
spin as well as its orbital angular momentum and is given by
$$\boldsymbol{\mu} = \frac{e}{2m_p}\,(2 g_s \mathbf{s} + g_l \mathbf{l}) \tag{9.34}$$
where gs = 2.793 for the proton, gs = – 1.913 for the neutron, and gl = 1 for the
proton, gl = 0 for the neutron. As in the case of atoms (see Chapter 6), µ can be
expressed in terms of the total angular momentum j as
$$\boldsymbol{\mu} = \frac{e}{2m_p}\,(2 g_s a_s + g_l a_l)\, \mathbf{j} \tag{9.35}$$

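where
$$a_s = \frac{\mathbf{j}\cdot\mathbf{s}}{\mathbf{j}\cdot\mathbf{j}} = \frac{j(j+1) + s(s+1) - l(l+1)}{2j(j+1)} \tag{9.36}$$
$$a_l = \frac{\mathbf{j}\cdot\mathbf{l}}{\mathbf{j}\cdot\mathbf{j}} = \frac{j(j+1) + l(l+1) - s(s+1)}{2j(j+1)} \tag{9.37}$$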

Now, for a given j, the allowed values of l are j ∓ 1/2, and the corresponding
magnetic moments in units of eħ/2m_p, called the nuclear magneton, are
$$\mu = g_s + (j - \tfrac{1}{2})\, g_l, \qquad l = j - \tfrac{1}{2},$$
$$\mu = -\frac{j}{j+1}\, g_s + \frac{j\,(j + 3/2)}{j+1}\, g_l, \qquad l = j + \tfrac{1}{2} \tag{9.38}$$
The plots of these moments as functions of j give what are known as Schmidt
lines. The experimental values of the magnetic moments are not in good
agreement with predictions of Eq. (9.38) but do lie between the two values.
This suggests the need for a more detailed analysis including a mixing of states,
e.g. the states may contain components in which the pairs of nucleons do not
pair off to give zero angular momentum states.
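
The Schmidt values can be generated directly from the projection formula of Eq. (9.35) and compared with the closed forms of Eq. (9.38). The sketch below assumes s = 1/2 and takes the projection factors as a_s = [j(j+1) + s(s+1) – l(l+1)]/2j(j+1) and a_l = [j(j+1) + l(l+1) – s(s+1)]/2j(j+1), which follow from j = l + s. The g values are those quoted in the text; the function names are hypothetical.

```python
G_S = {"proton": 2.793, "neutron": -1.913}
G_L = {"proton": 1.0, "neutron": 0.0}


def schmidt_from_projection(j, l, nucleon):
    """Moment in nuclear magnetons from (2 g_s a_s + g_l a_l) j, with s = 1/2."""
    s = 0.5
    a_s = (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))
    a_l = (j * (j + 1) + l * (l + 1) - s * (s + 1)) / (2 * j * (j + 1))
    return (2 * G_S[nucleon] * a_s + G_L[nucleon] * a_l) * j


def schmidt_closed_form(j, l, nucleon):
    """Moment in nuclear magnetons from the two branches of Eq. (9.38)."""
    g_s, g_l = G_S[nucleon], G_L[nucleon]
    if abs(l - (j - 0.5)) < 1e-9:
        return g_s + (j - 0.5) * g_l
    return -j / (j + 1) * g_s + j * (j + 1.5) / (j + 1) * g_l


if __name__ == "__main__":
    for nucleon in ("proton", "neutron"):
        for j in (0.5, 1.5, 2.5, 3.5, 4.5):
            lower = schmidt_closed_form(j, j - 0.5, nucleon)
            upper = schmidt_closed_form(j, j + 0.5, nucleon)
            print(nucleon, j, round(lower, 3), round(upper, 3))
```

Plotting the two branches against j for each kind of odd nucleon gives the pair of Schmidt lines between which the measured moments fall.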
Quadrupole moments: The predictions of the shell model for electric
quadrupole moments are not in good agreement with the experimental values.
If the quadrupole moment of an odd Z, odd A nucleus is due to the last proton,
it should be approximately of the order
$$Q \approx R^2 \tag{9.39}$$
where R is the radius of the nucleus. While this is the case for small nuclei,
some of the nuclei with large A have Q as large as 10R². Similarly large quadrupole
moments are observed for even Z, odd A nuclei as well. Many of these effects
are due to collective motions in nuclei, which are considered in the collective
model.
The shell model can be generalized by taking the average potential to be an
asymmetric harmonic oscillator potential. For example, in the Nilsson model
the force constant in the z-direction is taken to be different from those in the
x- and y-directions. This model retains the rotational symmetry about the z-axis
while being able to describe the observed large quadrupole moments of nuclei.

Collective Model
For nuclei with a closed shell plus one or a few nucleons, the elementary shell
model is quite successful in describing the nuclear properties. However, when
there are several nucleons outside the closed shell, the nucleus is significantly
deformed. The motion of the deformed nucleus gives rise to collective rotational
and vibrational levels of the nucleus.
In the deformed nucleus which is assumed to be ellipsoidal in shape, the
rotation can be of two types:
(i) it may be irrotational as in the case of tidal waves with no part of the
nucleus actually going around the nucleus,
(ii) the whole nucleus may rotate as a rigid body. Both these motions may
contribute to the rotational motion of a nucleus.
In even Z, even A nuclei, the angular momenta of the nucleons pair off to a
zero value, so that the total angular momentum is also the angular momentum
due to collective rotation. Accordingly, the rotational energy levels are given
by
$$E_I = \frac{\hbar^2}{2\mathcal{I}}\, I(I+1) \tag{9.40}$$
where $\mathcal{I}$ is the moment of inertia and I is the total angular momentum quantum
number. However, since the remaining wave function (other than the rotational
part) satisfies the required exchange symmetry, the rotational wave function
must be even under r → – r which effects an interchange of particles. Because
of the relation $Y_l^m(\pi - \theta, \pi + \phi) = (-1)^l\, Y_l^m(\theta, \phi)$, this implies that only I = 0,
2, 4, ... are allowed. The observed energy levels for ²³⁸Pu shown in Fig. 9.4(a),
are in very good agreement with levels predicted by Eq. (9.40) with I even and
a moment of inertia
$$\mathcal{I} \approx 4.7 \times 10^{-54}\ \text{kg m}^2 \tag{9.41}$$
Fig. 9.4 Energy levels for collective rotation for (a) ²³⁸Pu, even
Z, even A nucleus, (b) ²⁵Al, odd A nucleus.
[Panel (a): levels 0⁺, 2⁺, 4⁺, 6⁺, 8⁺ at E = 0, 0.0441, 0.1460, 0.3036, 0.514 MeV;
panel (b): levels 5/2⁺, 7/2⁺, 9/2⁺ at E = 0, 1.61, 3.44 MeV.]
This is quite large compared to the value of m_p R² ≈ 10⁻⁵⁵ kg m² expected
for the motion of a single nucleon, thus justifying the interpretation in terms of
collective motion.
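
The I(I + 1) pattern of Eq. (9.40) can be tested without fixing the moment of inertia, by scaling everything to the first excited level. The sketch below takes the ²³⁸Pu level energies listed in Fig. 9.4(a) and predicts the higher levels from the 2⁺ energy alone; the function name is hypothetical.

```python
# Level energies of 238Pu in MeV, as listed in Fig. 9.4(a)
OBSERVED = {2: 0.0441, 4: 0.1460, 6: 0.3036, 8: 0.514}


def rotational_energy(I, e2):
    """Energy of level I if E_I is proportional to I(I+1), scaled to the I = 2 level."""
    return e2 * I * (I + 1) / 6.0


if __name__ == "__main__":
    e2 = OBSERVED[2]
    for I, measured in OBSERVED.items():
        predicted = rotational_energy(I, e2)
        print(I, round(predicted, 4), measured)
```

The predicted values stay within a few per cent of the measured ones, with the small shortfall of the observed energies at higher I suggesting that the nucleus stretches slightly as it rotates faster.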
For odd A nuclei, the angular momentum is due both to the angular
momentum of the odd nucleon and to collective rotational motion. Since the
nucleon moves in a non-spherical (axially symmetric) potential, only the component of its angular
momentum along the axis of symmetry is a constant of motion. This component,
designated by Ω, adds vectorially to the collective angular momentum R which
is perpendicular to the axis of symmetry, to give the total angular momentum I.
When Ω ≠ 1/2, the rotational levels are given by
$$E_I = \frac{\hbar^2}{2\mathcal{I}}\,[I(I+1) - I_0(I_0+1)] \tag{9.42}$$
where I₀ corresponds to the ground state angular momentum, and I takes on
values I₀, I₀ + 1, I₀ + 2, ... . The observed values of the positive parity Ω = 5/2
levels for ²⁵Al are in good agreement [see Fig. 9.4(b)] with these predictions
with $\mathcal{I}$ ≈ 1.36 × 10⁻⁵⁵ kg m². The analysis of Ω = 1/2 states is more complicated
and may be found in specialized books.
In addition to rotational states, collective motion leads to vibrational states
also. The vibrational levels may be analysed in terms of quanta of vibrational
energy, called phonons. These phonons carry two units of angular momentum
and an energy of about 1 MeV. In multi-phonon excitations, the angular momenta
of the phonons add vectorially and produce closely-spaced levels with the same
number of phonons but different angular momenta. Such levels have been
identified experimentally.
The deformation of the core, by the nucleons outside the closed shell, can
lead to large quadrupole moments. That this is the correct explanation of the
large observed quadrupole moments, is indicated by the fact that large
deformations may be expected in odd A and odd Z nuclei as well as in odd A and
odd (A – Z) nuclei. This would lead to large quadrupole moments for odd Z as
well as odd (A – Z) nuclei as indeed it is observed experimentally.
All in all, collective motion is an important component of nuclear structure
and energy levels.

Degenerate Gas Model


The degenerate gas model is similar in spirit to the free-electron theory of metals
(Sec. 7.5). Here it is assumed that noninteracting nucleons are confined to the
volume of the nucleus. The predictions of the model are discussed here briefly
so as to emphasize the importance of the Pauli exclusion principle in determining
the properties of the nucleus.
As shown in Sec. 7.5, the number of fermions occupying the lowest energy
levels in a box of volume V is given by [Eq. (7.89)]
$$n = \frac{8\pi V}{3h^3}\, p^3 \tag{9.43}$$
where p is the maximum momentum of the occupied states, p² = 2mE.
Substituting $V = \frac{4}{3}\pi r_0^3 A$, the maximum kinetic energy E_m is
$$E_m(n) = C \left(\frac{n}{A}\right)^{2/3} \tag{9.44}$$
where
$$C = \left(\frac{9}{32\pi^2}\right)^{2/3} \frac{h^2}{2 m r_0^2} \approx 52\ \text{MeV} \tag{9.45}$$
for r0 ≈ 1.2 fm. Similarly the total kinetic energy is given by
$$E_T = \int \frac{p^2}{2m}\, dn = \frac{3}{5}\, n E_m \tag{9.46}$$
For a nucleus with Z protons and (A – Z) neutrons, the total kinetic energy
is given by
$$E = \frac{3C}{5A^{2/3}}\,[Z^{5/3} + (A - Z)^{5/3}] \tag{9.47}$$
If Z = A/2, the kinetic energy of the last nucleon is obtained from Eq. (9.44)
to be E ≈ 33 MeV. Since the binding energy of the last nucleon is about 8 MeV,
the average depth of the potential is of the order of 41 MeV. This is in agreement
with the earlier estimate in Eq. (9.31) based on the uncertainty principle. It is
also seen that for a given A,

$$\frac{\partial E}{\partial Z} = 0 \quad \text{for } Z = \frac{A}{2} \tag{9.48}$$
i.e. the most stable nucleus has equal numbers of protons and neutrons. Expanding
E about Z = A/2,
$$E = \frac{3C}{5(2)^{2/3}}\, A + \frac{2(2^{1/3})\, C}{3A} \left(Z - \frac{A}{2}\right)^2 + \ldots \tag{9.49}$$
The second term, which gives the increase in the energy because of the
imbalance of the protons and neutrons, is
$$\delta E \approx 43.7\, \frac{\left(Z - \frac{1}{2}A\right)^2}{A}\ \text{MeV} \tag{9.50}$$
In this analysis, Coulomb interaction which increases the potential for the
protons has been ignored. Because of the Coulomb interaction, in stable, heavy
nuclei, one finds more neutrons than protons. An important point brought out
by the model is that the Pauli principle would push up the energy of a nucleus
with Z very different from A/2, and implies that the least-energy state is the
one with nearly equal numbers of protons and neutrons.
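
The numbers quoted for the degenerate gas model can be reproduced from Eqs. (9.44) to (9.50). The following is a sketch under stated assumptions: hc = 2π × 197.327 MeV fm, an average nucleon rest energy of 938.9 MeV, and r0 = 1.2 fm. The function names are hypothetical.

```python
import math

H_C = 2 * math.pi * 197.327   # h*c in MeV fm (assumed standard value)
M_NUCLEON = 938.9             # average nucleon rest energy in MeV (assumption)


def fermi_constant(r0_fm=1.2):
    """Constant C of Eq. (9.45) in MeV."""
    return (9 / (32 * math.pi ** 2)) ** (2 / 3) * H_C ** 2 / (2 * M_NUCLEON * r0_fm ** 2)


def max_kinetic_energy(n, A, C):
    """Maximum kinetic energy E_m(n) of Eq. (9.44) in MeV."""
    return C * (n / A) ** (2 / 3)


def total_kinetic_energy(Z, A, C):
    """Total kinetic energy of Z protons and A - Z neutrons, Eq. (9.47), in MeV."""
    return 3 * C / (5 * A ** (2 / 3)) * (Z ** (5 / 3) + (A - Z) ** (5 / 3))


def asymmetry_coefficient(C):
    """Coefficient of (Z - A/2)^2 / A in the expansion of Eq. (9.49), in MeV."""
    return 2 * 2 ** (1 / 3) * C / 3


if __name__ == "__main__":
    C = fermi_constant()
    print("C                    :", C)
    print("E_m at n/A = 1/2     :", max_kinetic_energy(1, 2, C))
    print("asymmetry coefficient:", asymmetry_coefficient(52.0))
```

With the rounded value C = 52 MeV the asymmetry coefficient reproduces the 43.7 MeV of Eq. (9.50); comparing `total_kinetic_energy` at Z = A/2 and Z = A/2 + 1 shows that the quadratic term of the expansion already accounts for nearly all of the increase.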

9.4 WEIZSACKER’S MASS FORMULA


Many of the nuclear properties are critically controlled by the masses or
equivalently the binding energies of the nuclei. For example, a nucleus can
decay only into a set of particles with lower masses, with the difference in the
masses appearing as kinetic energy of the final particles. Here, a semi-empirical
formula for the masses of the nuclei is briefly discussed. This formula is not
only useful in the discussion of the stability of the nuclei but also provides a
useful insight into the physical properties that determine their masses.
The total energy of a nucleus is largely controlled by the short-range nature
of the forces leading to saturation in binding, the Pauli exclusion principle, and
Coulombic repulsion between protons with some finer effects due to pairing of
like-nucleons.
To start with, the major contribution to the mass of the nucleus comes from
the masses of the nucleons.
M1 = Zmp + (A – Z)mn (9.51)
Since nuclear forces are of short range, the binding energy of the nucleus is
proportional to the number of nucleons A, so that the mass is reduced by
M2 = – a1A (9.52)
However, nucleons near the surface are less tightly bound which may be
taken into account by a term
$$M_3 = a_2 A^{2/3} \tag{9.53}$$
proportional to the surface area. The electrostatic, repulsive interaction may be
incorporated by noting that the electrostatic energy of a sphere of charge Ze is
proportional to Z2/R. This brings in a contribution
$$M_4 = a_3 Z^2 A^{-1/3} \tag{9.54}$$
The Pauli exclusion principle brings in a term
$$M_5 = a_4\, \frac{\left(Z - \frac{1}{2}A\right)^2}{A} \tag{9.55}$$
because of the imbalance of protons and neutrons. Finally, it is noted that the
Pauli principle allows pairs of protons and neutrons with spin 1/2 to occupy the
same energy state whereas an odd proton or neutron is forced to go into a
higher energy state. This effect is included by a pairing term
$$\delta = \begin{cases} \delta(A) & \text{for odd } Z \text{ and odd } (A - Z) \\ 0 & \text{for odd } A \\ -\delta(A) & \text{for even } Z \text{ and even } (A - Z) \end{cases} \tag{9.56}$$
The final formula for the mass of a nucleus is
$$M(Z, A) = Z m_p + (A - Z) m_n - a_1 A + a_2 A^{2/3} + a_3 Z^2 A^{-1/3} + a_4\, \frac{\left(Z - \frac{1}{2}A\right)^2}{A} + \delta \tag{9.57}$$
The constants in Eq. (9.57) are determined empirically, from a fit to the
observed masses. The best fits are obtained for
$$a_1 = 15.7,\quad a_2 = 17.8,\quad a_3 = 0.710,\quad a_4 = 94.8,\quad \delta(A) = 33.6\, A^{-3/4} \tag{9.58}$$
all in MeV. The expression in Eq. (9.57) is known as the Weizsacker mass
formula and the values in Eq. (9.58) give the best fit to the binding energy plot
in Fig. 9.1. This formula is of considerable use in the analysis of the stability
of nuclei.
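
As an illustration of how the formula is used, the sketch below evaluates the binding energy implied by Eq. (9.57), i.e. Z m_p + (A – Z) m_n – M(Z, A), with the constants of Eq. (9.58). The choice of ⁵⁶Fe as the example and the function names are assumptions made only for this illustration.

```python
# Constants of Eq. (9.58), all in MeV
A1, A2, A3, A4 = 15.7, 17.8, 0.710, 94.8


def pairing_term(Z, A):
    """Pairing term delta of Eq. (9.56) in MeV."""
    if A % 2 == 1:
        return 0.0
    magnitude = 33.6 * A ** (-0.75)
    # For even A, odd Z means odd (A - Z) as well.
    return magnitude if Z % 2 == 1 else -magnitude


def binding_energy(Z, A):
    """Binding energy Z*m_p + (A-Z)*m_n - M(Z, A) from Eq. (9.57), in MeV."""
    return (A1 * A
            - A2 * A ** (2 / 3)
            - A3 * Z ** 2 * A ** (-1 / 3)
            - A4 * (Z - A / 2) ** 2 / A
            - pairing_term(Z, A))


def binding_energy_per_nucleon(Z, A):
    """Binding energy per nucleon in MeV."""
    return binding_energy(Z, A) / A


if __name__ == "__main__":
    print("56Fe:", binding_energy_per_nucleon(26, 56), "MeV per nucleon")
```

Evaluating the same function over a range of A along the line of stability reproduces the general shape of the binding energy curve: a rise for light nuclei, a broad maximum, and a slow decline for heavy nuclei.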

9.5 NUCLEAR STABILITY


A nucleus can decay by emitting or absorbing electrons (β-decay), emitting
α particles (α-decay), emitting protons or neutrons, emitting γ rays (γ-decay) or
breaking into smaller nuclei (fission). Each decay mode is characterized by a
decay probability λ defined by
dN(t) = – λN(t) dt (9.59)
in terms of which
$$N(t) = N(0)\, e^{-\lambda t} \tag{9.60}$$
λ⁻¹ is called the lifetime τ of the nucleus. If a nucleus can decay via several
modes, then its lifetime is the inverse of the sum of decay probabilities λ_i,
$$\tau = \left(\sum_i \lambda_i\right)^{-1} \tag{9.61}$$
This quantity τ is the average lifetime of the nucleus (see Sec. 6.4). For
example ²³⁸U has a lifetime of about 6.5 × 10⁹ years, comparable with the age of
the universe.
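
The decay law and the combination of several decay modes can be sketched as follows. The half-life, defined as the time in which half of the nuclei decay, is related to the lifetime by t_half = τ ln 2; this relation follows from Eq. (9.60) but is not written out in the text. The function names are hypothetical.

```python
import math


def surviving(n0, decay_probability, t):
    """Number of undecayed nuclei N(t) = N(0) exp(-lambda t), Eq. (9.60)."""
    return n0 * math.exp(-decay_probability * t)


def lifetime(decay_probabilities):
    """Lifetime tau as the inverse of the summed decay probabilities, Eq. (9.61)."""
    return 1.0 / sum(decay_probabilities)


def half_life(tau):
    """Time after which half of the nuclei remain: tau * ln 2."""
    return tau * math.log(2)


if __name__ == "__main__":
    tau_u238 = 6.5e9  # years, the lifetime quoted in the text
    print("half-life of 238U:", half_life(tau_u238), "years")
    # Two hypothetical modes with equal decay probabilities of 0.5 per unit time
    print("combined lifetime:", lifetime([0.5, 0.5]))
```

The lifetime of about 6.5 × 10⁹ years therefore corresponds to a half-life of about 4.5 × 10⁹ years.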
The allowed decays in general must satisfy certain conservation laws such
as charge conservation, energy-momentum conservation, etc. γ-decays involve
changes only in the energy levels, the components of the nucleus remaining the
same. Here, we concentrate on β-decay, α-decay, and fission, which alter the
composition of the nucleus.
Conservation of energy allows only those decays which satisfy the following
rules:
1. A nucleus is unstable against the emission of an electron [Eq. (9.8)], i.e.
β⁻-decay, if
M(Z, A) > M(Z + 1, A) + m_e (9.62)
It is unstable against the absorption of an electron [Eq. (9.10)] if
M(Z, A) + m_e > M(Z – 1, A) (9.63)
It is unstable against the emission of a positron [Eq. (9.9)], β⁺-decay, if
M(Z, A) > M(Z – 1, A) + m_e (9.64)
2. A nucleus is unstable against breakup into two fragments if
M(Z, A) > M(Z′, A′) + M(Z – Z′, A – A′) (9.65)
In particular, it may decay by emitting a proton if
M(Z, A) > M(Z – 1, A – 1) + mp (9.66)
or by emitting an α particle if
M(Z, A) > M(Z – 2, A – 4) + mα (9.67)
It may be noted that because of the large binding energy per nucleon, the
decays via the emission of a heavy particle are important mainly in heavy nuclei.
For determining the stability pattern of lighter nuclei, it is necessary to consider
only electron emission or absorption.

Beta Decay
Consider first an odd A nucleus. The Z value for the most stable nucleus is
given by the condition

$$\left.\frac{\partial M(Z, A)}{\partial Z}\right|_{Z = Z_A} = 0 \tag{9.68}$$
which implies that [using Eq. (9.57)]
$$(m_p - m_n) + 2a_3 Z_A A^{-1/3} + 2a_4 \left(Z_A - \frac{1}{2}A\right) A^{-1} = 0 \tag{9.69}$$
Solving for Z_A,
$$Z_A = \left(\frac{m_n - m_p + a_4}{a_4 + a_3 A^{2/3}}\right) \left(\frac{A}{2}\right) \tag{9.70}$$
For a3 = 0.710, it is seen that Z_A is smaller than A/2 for A ≥ 3, and most of the
heavy nuclei will have more neutrons than protons.
Expansion of M(Z, A) about the point Z = ZA leads to
$$M(Z, A) = M(Z_A, A) + (a_3 A^{-1/3} + a_4 A^{-1})\,(Z - Z_A)^2 + \ldots \tag{9.71}$$
This relation is shown schematically in Fig. 9.5(a), for a given A. If the
mass of the electron m_e is ignored in Eqs. (9.62) and (9.64), it follows that the
integral Z value nearest to Z_A corresponds to the stable isobar while the
neighbouring isobars will be unstable against either β⁻-decay or β⁺-decay. In
some rare cases, it may happen that Z_A is essentially in between the two nearby
integral Z values, in which case there may be two stable isobars. Experimentally,
it is found that there is one stable isobar for each odd A nucleus, the only
exceptions being (¹¹³Cd, ¹¹³In), (¹¹⁵In, ¹¹⁵Sn) and (¹²³Sb, ¹²³Te).
Going over to the case of even A nuclei, the same procedure can be adopted
as in the case of odd A nuclei, i.e. expand M(Z, A) about the minimum at Z = Z_A.
However, in this case, δ is not only nonzero but also takes on two values
± 33.6 A^(–3/4) corresponding to odd Z and even Z values. This leads to two plots for M(Z,
A) as a function of Z. In the typical case shown in Fig. 9.5(b), two stable isobars
(Z = 28, 30) are obtained. There can be a situation in which Z_A may be close to
an even integer, say 2n, and the masses of the nuclei with Z = 2n ± 2 may be
higher than those with Z = 2n ± 1, in which case only one stable nucleus (with
Z = 2n) is obtained, e.g. ¹⁹⁴Pt. It may also happen that the masses of all the three
nuclei with Z = 2n, 2n ± 2 may be lower than those with Z = 2n ± 1. In this case
three stable nuclei are obtained, with Z = 2n, 2n ± 2. Thus, in the case of even A
nuclei, one, two or three stable isobars may exist.
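
The stability pattern for a given A can be explored with a short scan over Z. The sketch below uses Eq. (9.70) for Z_A, the parabola of Eq. (9.71) and the pairing term of Eq. (9.56), and, as in the text, ignores the electron mass, so that an isobar is counted as stable when its mass is lower than that of both neighbours. The nucleon rest energies (938.272 and 939.565 MeV) are standard values assumed here, and the function names are hypothetical.

```python
A3, A4 = 0.710, 94.8            # MeV, from Eq. (9.58)
M_P, M_N = 938.272, 939.565     # nucleon rest energies in MeV (assumed standard values)


def z_min(A):
    """Position Z_A of the minimum of the mass parabola, Eq. (9.70)."""
    return (M_N - M_P + A4) / (A4 + A3 * A ** (2 / 3)) * A / 2


def pairing_term(Z, A):
    """Pairing term delta of Eq. (9.56) in MeV."""
    if A % 2 == 1:
        return 0.0
    magnitude = 33.6 * A ** (-0.75)
    return magnitude if Z % 2 == 1 else -magnitude


def relative_mass(Z, A):
    """M(Z, A) - M(Z_A, A) from Eq. (9.71), plus the pairing term, in MeV."""
    curvature = A3 * A ** (-1 / 3) + A4 / A
    return curvature * (Z - z_min(A)) ** 2 + pairing_term(Z, A)


def stable_isobars(A):
    """Values of Z whose mass is lower than that of both neighbouring isobars."""
    return [Z for Z in range(1, A)
            if relative_mass(Z, A) < min(relative_mass(Z - 1, A),
                                         relative_mass(Z + 1, A))]


if __name__ == "__main__":
    for A in (101, 64):
        print(A, round(z_min(A), 2), stable_isobars(A))
```

For A = 101 the scan picks out Z = 44 (Ru), and for A = 64 it picks out Z = 28 and 30 (Ni and Zn), in line with the two cases of Fig. 9.5.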

Fig. 9.5 (a) Stability of nuclei with A = 101, (b) stability of nuclei with A = 64.
[Plots of M(Z) – M(Z_A) in MeV against Z: (a) Z = 42 to 46, with ¹⁰¹Ru at the minimum;
(b) Z = 27 to 31, showing ⁶⁴Ni, ⁶⁴Cu and ⁶⁴Zn.]
A point of caution: the usual masses of nuclei quoted are the masses of the
corresponding neutral atoms and hence include an additional mass of Z electrons.
For the sake of clarity, we refer only to the masses of the nuclei.

Alpha Decay
It is observed (Fig. 9.1) that the binding energy per nucleon decreases as nuclear
mass increases (A > 56). Therefore, a heavy nucleus would, in some
circumstances prefer to decay into lighter nuclei. However, a decay by emission
of only a proton or neutron is not observed since each nucleon in a nucleus has
a binding energy of about 8 MeV whereas the binding energy of a free nucleon
is zero. On the other hand, a decay by emission of an α particle (⁴He), which
has a binding energy of about 7.1 MeV per nucleon, is quite likely and is observed
in many nuclei.
The properties of α-decay, represented in Eq. (9.5), may be illustrated by
taking a specific example. Consider the α-decay of ²¹²Bi,
$$^{212}\text{Bi} \rightarrow\ ^{208}\text{Tl} +\ ^{4}\text{He} \tag{9.72}$$
If Tl is in its ground state, the initial mass exceeds the final mass by 6.203
MeV. This appears in the form of kinetic energy shared by the final particles.
Since momentum conservation implies that Tl and He have equal and opposite
momenta, their kinetic energies are inversely proportional to their masses. This
means that the α particle is ejected with a kinetic energy
$$E_\alpha \approx 6.203 \left(\frac{208}{212}\right) \approx 6.086\ \text{MeV} \tag{9.73}$$
Alternatively, Tl may be in one of its excited states [see Fig. 9.6 (a)] in
which case the kinetic energy of the α particle is
$$E_\alpha \approx (6.203 - \varepsilon) \left(\frac{208}{212}\right) \tag{9.74}$$
where ε is the excitation energy of the state. Consequently, the α particle is
observed with essentially discrete kinetic energies, 27% of the times with 6.086
MeV, 70% of the times with 6.047 MeV, and the remaining 3% of the times
with the smaller allowed energies given by Eq. (9.74). The total lifetime
corresponding to these transitions is about 87.7 minutes. It is important to note
that subsequent to the α-decay, typically in about 10⁻⁸ to 10⁻¹⁵ s, the excited
Tl nucleus undergoes transitions to a lower energy state by emitting a γ ray. In
the general case, the excited nucleus may also lose its energy by emitting an
electron, a proton, a neutron or another α particle. Alternatively, the excess
energy may knock out one of the electrons in the atomic orbits. This is known
as internal conversion and is usually characterized by the emission of an x-ray
photon when the vacancy created by the ejection of the electron is filled by an
electron from the higher energy levels.
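
The discrete α-particle energies follow from Eq. (9.74) once the excitation energies of the daughter nucleus are known. The sketch below uses the ²⁰⁸Tl level energies listed in Fig. 9.6(a); the function name is hypothetical.

```python
Q_GROUND = 6.203  # MeV, mass difference when 208Tl is left in its ground state

# Excitation energies of 208Tl in MeV, as listed in Fig. 9.6(a)
TL_LEVELS = [0.0, 0.04, 0.327, 0.473, 0.492, 0.617]


def alpha_kinetic_energy(excitation, a_daughter=208, a_parent=212):
    """Kinetic energy of the alpha particle from Eq. (9.74), in MeV.

    The factor a_daughter / a_parent is the share of the released energy
    carried by the alpha particle when the two fragments recoil with equal
    and opposite momenta.
    """
    return (Q_GROUND - excitation) * a_daughter / a_parent


if __name__ == "__main__":
    for level in TL_LEVELS:
        print(level, round(alpha_kinetic_energy(level), 3))
```

The first two entries reproduce the 6.086 MeV and 6.047 MeV groups mentioned above; the remaining levels give the weaker, lower-energy groups.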
Fig. 9.6 (a) Alpha decay of ²¹²Bi and the subsequent gamma decay of ²⁰⁸Tl,
(b) tunnelling of α-particle wave function leading to alpha decay.
[Panel (a): levels of ²⁰⁸Tl at E = 0, 0.04, 0.327, 0.473, 0.492 and 0.617 MeV;
panel (b): potential against r, falling as 1/r beyond the nuclear radius R.]
An important question which arises is the following: Why does the nucleus
not undergo an instantaneous decay to an energetically allowed state? The
reason for this is that when the nucleus decays, it goes through some states
which have higher energy and the decay products must overcome the potential
barrier. For example, when Tl and He are just touching each other before
separating, they have an additional Coulomb energy
$E_c \approx \frac{e^2\,(81)(2)}{4\pi\varepsilon_0\,(r_{\mathrm{Tl}} + r_{\mathrm{He}})}$ (9.75)
Using the values for radii given by Eq. (9.11) with r0 ≈ 1.2 fm,
Ec ≈ 26 MeV (9.76)
Classically, this is a forbidden domain. Quantum mechanically, the α particle
can penetrate and tunnel through a potential [see Fig. 9.6 (b)] with some
probability. It is this property which permits the decay, with a finite lifetime.
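
As a rough numerical check of Eq. (9.76), the barrier height can be evaluated in a few lines of Python. This is a sketch under stated assumptions: the value $e^2/4\pi\varepsilon_0 \approx 1.44$ MeV fm is a standard constant that the text does not quote explicitly, and the function names are illustrative.

```python
# Coulomb energy of two touching nuclei, Eq. (9.75), with r = r0 * A^(1/3).
E2_OVER_4PI_EPS0 = 1.44   # MeV fm; standard value, assumed here (not quoted in the text)
R0_FM = 1.2               # fm, as used with Eq. (9.11)


def radius_fm(mass_number):
    """Nuclear radius in fm from the empirical r0 * A^(1/3) rule."""
    return R0_FM * mass_number ** (1 / 3)


def coulomb_barrier_mev(z1, a1, z2, a2):
    """Coulomb energy in MeV when the two nuclei are just touching."""
    return E2_OVER_4PI_EPS0 * z1 * z2 / (radius_fm(a1) + radius_fm(a2))


barrier_tl_he = coulomb_barrier_mev(81, 208, 2, 4)
print(f"E_c for 208Tl + 4He: {barrier_tl_he:.1f} MeV")
```

The result is close to 26 MeV, well above the roughly 6 MeV available to the α particle, which is why the decay can only proceed by tunnelling.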

Nuclear Fission
In some cases, it may be energetically favourable for a heavy nucleus to break up
into fragments of nearly equal masses, accompanied by the release of a large
amount of energy. Since heavy nuclei have an overabundance of neutrons, the
fission process is usually followed by the emission of a few neutrons. An example
of this is the fission of an excited 236U* nucleus:
$^{236}\mathrm{U}^{*} \to {}^{144}\mathrm{Ba} + {}^{89}\mathrm{Kr} + 3n$ (9.77)
An insight into the fission process is obtained by considering a simple model
in which a nucleus (A, Z) decays into only two fragments, (αA, βZ) and
[(1 – α)A, (1– β)Z]. The energy released in this process is given by the decrease
in the masses, which may be estimated from the empirical formula in Eq. (9.57):
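$$\begin{aligned} \Delta E ={}& 17.8\,A^{2/3}\left[1 - \alpha^{2/3} - (1-\alpha)^{2/3}\right] \\ &+ 0.71\,Z^2 A^{-1/3}\left[1 - \beta^2\alpha^{-1/3} - (1-\beta)^2(1-\alpha)^{-1/3}\right] \\ &+ 95\,Z^2 A^{-1}\left[1 - \beta^2\alpha^{-1} - (1-\beta)^2(1-\alpha)^{-1}\right] \end{aligned}$$ (9.78)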
It is easy to show that
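$\frac{\partial}{\partial\alpha}(\Delta E) = \frac{\partial}{\partial\beta}(\Delta E) = 0$ at $\alpha = \beta = \frac{1}{2}$ (9.79)
so that the maximum energy released is
$(\Delta E)_{\max} \approx -4.6\,A^{2/3} + 0.26\,Z^2 A^{-1/3}$ (9.80)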
For A = 236, Z = 92, this has a value of about 180 MeV. However the
fission products, when just in contact, have an additional Coulombic energy
given approximately by
$E_c \approx \frac{e^2\,(Z/2)^2}{4\pi\varepsilon_0\,(2R)}$ (9.81)
where $R \approx 1.2\,(A/2)^{1/3}$ fm. For A = 236, Z = 92, Ec has a value of about 259 MeV
so that the fission products have to escape by tunnelling through this potential
barrier [Ec > (∆E)max]. The tunnelling is not necessary only if
(∆E)max > Ec (9.82)


On using Eqs. (9.80) and (9.81) with $R \approx 1.2\,(A/2)^{1/3}$ fm, the inequality reduces
to
$Z^2 > 65\,A$ (9.83)
This condition is not satisfied even in the case of the heaviest known nucleus
$^{262}_{105}\mathrm{Ha}$ (hahnium) for which $Z^2/A \approx 42$. Thus all the observed fission processes
are expected to proceed by tunnelling through the potential barrier.
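
The numbers in this argument can be checked with a small Python sketch. It evaluates the full expression of Eq. (9.78) at the symmetric point α = β = 1/2 and compares it with the contact Coulomb energy of Eq. (9.81). The constant 1.44 MeV fm for $e^2/4\pi\varepsilon_0$ is a standard value assumed here, and the function names are illustrative.

```python
# Energy release in symmetric fission versus the Coulomb barrier at contact.
E2_OVER_4PI_EPS0 = 1.44  # MeV fm; standard value, assumed here (not quoted in the text)


def delta_e(a, z, alpha, beta):
    """Energy release in MeV from Eq. (9.78) for fragments (alpha*A, beta*Z) and the rest."""
    surface = 17.8 * a ** (2 / 3) * (1 - alpha ** (2 / 3) - (1 - alpha) ** (2 / 3))
    coulomb = 0.71 * z ** 2 * a ** (-1 / 3) * (
        1 - beta ** 2 * alpha ** (-1 / 3) - (1 - beta) ** 2 * (1 - alpha) ** (-1 / 3)
    )
    asymmetry = 95 * z ** 2 / a * (1 - beta ** 2 / alpha - (1 - beta) ** 2 / (1 - alpha))
    return surface + coulomb + asymmetry


def contact_barrier(a, z):
    """Coulomb energy in MeV of two equal fragments just touching, Eq. (9.81)."""
    radius_fm = 1.2 * (a / 2) ** (1 / 3)
    return E2_OVER_4PI_EPS0 * (z / 2) ** 2 / (2 * radius_fm)


release_u236 = delta_e(236, 92, 0.5, 0.5)
barrier_u236 = contact_barrier(236, 92)
z2_over_a_hahnium = 105 ** 2 / 262

print(f"energy release  : {release_u236:.0f} MeV")
print(f"contact barrier : {barrier_u236:.0f} MeV")
print(f"Z^2/A for Z=105, A=262: {z2_over_a_hahnium:.1f} (tunnelling-free fission needs > 65)")
```

With unrounded coefficients the release comes out a few MeV above the "about 180 MeV" quoted in the text, which uses the rounded form of Eq. (9.80); either way it lies well below the barrier of about 259 MeV.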
A fission process is initiated by bombarding a nucleus with neutrons, giving
an excited initial state such as 236U*, and forms the basis of nuclear fission
reactors used for tapping large nuclear energies.

Gamma Decay
A nucleus may be found in an excited state if it is one of the products in an
alpha or a beta decay. The excitation energy here is provided by the higher mass
of the decaying particle. Alternatively, the excitation may take place as a result
of a collision in which the projectile excites the nucleus by transferring to it
some of its kinetic energy. In either case, the excited nucleus usually undergoes
transition to a lower state by emitting a photon,
Z* → Z + γ (9.84)
An example of γ-decay is the de-excitation of 208Tl discussed earlier. The
energy of the photon emitted is
hν ≈ E* – E (9.85)
i.e. the difference in the energies of the two nuclear states.
The lifetimes for γ-decay are usually of the order of $10^{-14}$ s. However,
selection rules may inhibit some of these decays, leading to much longer lifetimes.
If the lifetimes of the excited states are sufficiently long so as to be directly measurable, the
nuclei are known as isomers. An example of extreme isomerism is 91Nb* which
has a lifetime of about 87 days.

Radioactive Series
In nature one observes several decays of radioactive elements some of which
were created in the early stages of the evolution of the universe, others which
are continuously formed by the bombardment of cosmic rays (see Sec. 10.6).
Of special interest are the ones which form what are known as the radioactive
series.
All nuclei with A > 209 are unstable and normally undergo either α-decay,
β-decay, or γ-decay. Since the changes in the number of nucleons A are due
only to α-decay, the mass numbers A of the nuclei produced in a series
are related by
A = a + 4n (9.86)
Thus, four series exist corresponding to a = 0, 1, 2, 3. These are summarized
below along with the half-life in years (τ1/2 ≈ 0.69 τav) of the longest-lived
member:

Series      Mass number   Longest-lived nucleus (half-life, years)   Final stable nucleus
Thorium     4n            232Th ($1.39 \times 10^{10}$)               208Pb
Neptunium   4n + 1        237Np ($2.25 \times 10^{6}$)                209Bi
Uranium     4n + 2        238U ($4.51 \times 10^{9}$)                 206Pb
Actinium    4n + 3        235U ($7.07 \times 10^{8}$)                 207Pb

Of these, the thorium, uranium and actinium series are observed in nature.
The neptunium series is not observed in nature since its longest-lived member,
237Np, has a half-life of only about $2.25 \times 10^{6}$ years, and whatever amount was created
at the early stages of the universe would have decayed by now. However, 237Np
can be artificially produced, e.g. from 236U by the capture of a neutron followed
by β– decay. All these elements undergo a series of α and β– decays till they are
reduced to the final stable nucleus. The details of the 238U series are shown in
Fig. 9.7 where the steps with decrease in Z of two correspond to α-decays and
steps with unit increase in Z correspond to β-decays.
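
The bookkeeping behind Eq. (9.86) can be sketched in Python: the series of a nucleus follows from its mass number modulo 4, and the number of α and β– steps in a chain follows from the changes in A and Z (each α-decay lowers A by 4 and Z by 2, each β– decay raises Z by 1). The function names are illustrative, and the atomic numbers used in the example (U: 92, Th: 90, Pb: 82) are standard values not listed in the table above.

```python
# Radioactive series bookkeeping based on A = a + 4n, Eq. (9.86).
SERIES = {0: "thorium", 1: "neptunium", 2: "uranium", 3: "actinium"}


def series_name(mass_number):
    """Name of the radioactive series a nucleus of given A belongs to."""
    return SERIES[mass_number % 4]


def decay_counts(a_start, z_start, a_end, z_end):
    """Return (number of alpha decays, number of beta-minus decays) along a chain."""
    n_alpha, remainder = divmod(a_start - a_end, 4)
    if remainder != 0:
        raise ValueError("start and end nuclei are not in the same series")
    # Alpha decays alone would lower Z by 2 * n_alpha; beta-minus decays make up the rest.
    n_beta = 2 * n_alpha - (z_start - z_end)
    return n_alpha, n_beta


uranium_chain = decay_counts(238, 92, 206, 82)   # 238U -> 206Pb
thorium_chain = decay_counts(232, 90, 208, 82)   # 232Th -> 208Pb

print(series_name(238), uranium_chain)
print(series_name(232), thorium_chain)
```

For the 238U series this gives eight α-decays and six β– decays, which is the number of two-unit drops and one-unit rises in Z that the staircase of Fig. 9.7 should show.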

[Figure: Z (80 to 95) plotted against A – Z (120 to 150), tracing the decay chain from 238U down to 206Pb.]

Fig. 9.7 Uranium radioactive series.


The occurrence of radioactive elements in nature provides a means of
determining the age of the elements in our planetary system. Assuming that
235U and 238U were initially created in about the same quantities (this is
approximately true for many stable nuclei), the ratio of these elements found in
nature today is
$\frac{N_{235}(t)}{N_{238}(t)} \approx \exp\left[-t\left(\frac{1}{\tau_{235}} - \frac{1}{\tau_{238}}\right)\right]$ (9.87)
Using the observed value of about 1/140 for the relative abundance,
$t \approx \frac{\tau_{235}\,\tau_{238}}{\tau_{238} - \tau_{235}} \ln 140$ (9.88)

Since $\tau_{235} \approx 1.02 \times 10^{9}$ years and $\tau_{238} \approx 6.51 \times 10^{9}$ years, the age of the
elements is
$t \approx 5.98 \times 10^{9}$ years (9.89)
This is in reasonable agreement with the age determined from the abundances
of other nuclei. It is also of the same order as the age of the universe deduced
from the rate of expansion of the universe (see Sec. 11.5), and hence is an
important element in the understanding of the universe.
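
The estimate of Eq. (9.89) is easy to reproduce. The following Python sketch evaluates Eq. (9.88) with the mean lives quoted in the text and then substitutes the result back into Eq. (9.87) as a consistency check; the variable names are illustrative.

```python
import math

# Age of the elements from the 235U / 238U abundance ratio, Eqs. (9.87)-(9.89).
TAU_235_YEARS = 1.02e9     # mean life of 235U (from the text)
TAU_238_YEARS = 6.51e9     # mean life of 238U (from the text)
OBSERVED_RATIO = 1 / 140   # N235 / N238 today (from the text)

# Eq. (9.88): solve the exponential law for t, assuming equal initial abundances.
age_years = (
    TAU_235_YEARS * TAU_238_YEARS / (TAU_238_YEARS - TAU_235_YEARS)
    * math.log(1 / OBSERVED_RATIO)
)

# Eq. (9.87): substituting the age back should return the observed ratio.
ratio_check = math.exp(-age_years * (1 / TAU_235_YEARS - 1 / TAU_238_YEARS))

print(f"age of the elements: {age_years:.2e} years")
print(f"ratio recovered    : 1/{1 / ratio_check:.0f}")
```

The result depends only logarithmically on the observed ratio, so a moderately different assumption about the initial abundances shifts the age by a fraction of $10^{9}$ years rather than by an order of magnitude.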

9.6 NUCLEAR REACTIONS


The properties of a nucleus can be studied by bombarding it with energetic
particles, and analysing the consequences of the collisions. If the collision leads
to a nuclear interaction, it gives what is known as a nuclear reaction.
Usually, the projectile is a light particle; it may be a neutron, a proton, a
deuteron, an alpha particle or a photon (recently there has been considerable
interest in heavy ions as projectiles). A typical collision between a light particle
a and a nucleus X, may produce a light particle b and a nucleus Y:
X+a→Y+b (9.90)
Such a two-body process is customarily written in the form X (a, b) Y. If the
particle b is the same as particle a, one has a scattering process. If the total
kinetic energy is unaltered in the collision, the scattering is elastic, whereas if
the total kinetic energy changes (generally decreases), the scattering is inelastic.
If b is different from a, we have a special case of nuclear reactions. All these
processes must satisfy certain conservation laws, such as charge conservation,
energy-momentum conservation, etc.
A nuclear reaction is usually accompanied by either a release or absorption
of kinetic energy. This is given by the difference between the total mass of the
particles before and after the collision. It is called the reaction energy or Q
value and is expressed as
$Q = \sum_i m_i - \sum_f m_f$ (9.91)
where mi and mf are the masses of the initial and final particles respectively,
normally expressed as rest energy in MeV. If Q is positive, energy is released in
the reaction (exothermic) and if Q is negative, energy is absorbed in the reaction
(endothermic). A reaction cannot proceed unless the energy of the bombarding
particle is greater than a minimum value known as the threshold energy, equal
to | Q | for negative Q reactions and zero for positive Q reactions.
Two specific examples of historic importance are
$^{14}\mathrm{N} + \alpha \to {}^{17}\mathrm{O} + p$, 14N (α, p) 17O (9.92)
$^{7}\mathrm{Li} + p \to \alpha + \alpha$, 7Li (p, α) 4He (9.93)
The first reaction was observed by Rutherford (1919), using α particles of
7.68 MeV energy from a natural radioactive source. It is an endothermic reaction
with Q ≈ – 1.2 MeV, with a threshold energy of about 1.2 MeV. The process in
Eq. (9.93) was the first nuclear reaction produced by artificially accelerated
particles. Cockcroft and Walton (1932) observed the reaction using protons
accelerated through a potential difference of about 0.5 MV. This is an
exothermic reaction with Q ≈ 17.3 MeV and illustrates the possibility of
extracting energy from a nuclear reaction.
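
The two Q values quoted above can be recovered from Eq. (9.91) using tabulated atomic masses. The Python sketch below does this; the mass values (in atomic mass units, rounded to six decimals) and the conversion factor 931.494 MeV per u are standard tabulated numbers that do not appear in the text, and they are included here only for illustration. Atomic rather than nuclear masses can be used because the electron masses cancel between the two sides.

```python
# Q values from Eq. (9.91), using atomic masses in u (approximate tabulated values,
# supplied here as an assumption; they are not given in the text).
U_TO_MEV = 931.494
ATOMIC_MASS_U = {
    "1H": 1.007825,
    "4He": 4.002603,
    "7Li": 7.016003,
    "14N": 14.003074,
    "17O": 16.999132,
}


def q_value_mev(initial, final):
    """Q = (sum of initial masses - sum of final masses), converted to MeV."""
    mass_in = sum(ATOMIC_MASS_U[name] for name in initial)
    mass_out = sum(ATOMIC_MASS_U[name] for name in final)
    return (mass_in - mass_out) * U_TO_MEV


q_rutherford = q_value_mev(["14N", "4He"], ["17O", "1H"])        # Eq. (9.92)
q_cockcroft_walton = q_value_mev(["7Li", "1H"], ["4He", "4He"])  # Eq. (9.93)

for label, q in [("14N(alpha,p)17O", q_rutherford), ("7Li(p,alpha)4He", q_cockcroft_walton)]:
    kind = "exothermic" if q > 0 else "endothermic"
    print(f"{label}: Q = {q:+.2f} MeV ({kind})")
```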

Cross-Section
The probability for a nuclear reaction to take place is expressed in terms of the
effective cross-section σ.
Consider a beam of projectiles incident on a collection of targets in the
form of a thin plate (sufficiently thin so that the nuclei do not overlap). An
effective cross-sectional area σ [Fig. 9.8 (a)] is associated with each target such
that every projectile incident within that area produces a reaction. If there are n
targets per unit volume and t is the thickness of the plate, the probability of a
single projectile producing a reaction is
P = σnt (9.94)
(it is the fraction of the effective target area). Therefore, the fraction of projectiles
∆N/N producing reactions, is
$\frac{\Delta N}{N} = \sigma n t$ (9.95)
which leads to
$\sigma = \frac{\Delta N}{N n t}$ (9.96)
This relation allows us to determine the effective cross-section for a nuclear
reaction. It is usually expressed in terms of barns, 1 barn = $10^{-28}$ m$^2$.
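
To make Eq. (9.96) concrete, the following Python sketch converts a set of counting data into a cross-section in barns. All the numbers (target density, foil thickness, projectile and reaction counts) are hypothetical values chosen purely for illustration; none of them comes from the text.

```python
# Cross-section from counting data, Eq. (9.96): sigma = dN / (N * n * t).
BARN_M2 = 1e-28


def cross_section_barns(reactions, projectiles, targets_per_m3, thickness_m):
    """Effective cross-section in barns for a thin target."""
    sigma_m2 = reactions / (projectiles * targets_per_m3 * thickness_m)
    return sigma_m2 / BARN_M2


# Hypothetical experiment (illustrative values only):
sigma_example = cross_section_barns(
    reactions=600,          # reactions observed
    projectiles=1.0e8,      # projectiles incident on the foil
    targets_per_m3=6.0e28,  # number density of target nuclei
    thickness_m=1.0e-6,     # foil thickness
)
print(f"sigma = {sigma_example:.2f} barn")
```

The thin-target assumption of the text is visible in the numbers: the reaction probability σnt is only $6 \times 10^{-6}$ here, so shadowing of one nucleus by another is negligible.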
Nuclear reaction cross-sections vary over large ranges. Of particular interest
are the cross-sections for neutron projectiles. In this case, the neutrons, being
neutral, can enter a nucleus with ease even at low energies. In fact, since the
time spent by the neutron inside the nucleus is proportional to 1/v, v being the
neutron velocity, it is expected that the effective cross-section increases as 1/v at
low energies,
$\sigma \sim \frac{1}{v} \sim E^{-1/2}$ for small E (9.97)
This is observed experimentally. However, it is also observed that the
reaction cross-section has a sharp maximum for neutrons of some specific
energies E. This corresponds to what is known as resonance reaction [Fig. 9.8
(b)]. For example, the neutron capture cross-section, with Rh as a target, increases
by a factor of about 100 at an energy of about 1.4 eV. The resonance reaction
takes place when the energy of the incoming neutron equals the energy required
to transfer the combined system, of the target plus the neutron, to an excited
energy level. Similar resonant absorptions of photons are observed when the
photon energy is equal to the excitation energy of a nuclear level of the target.

[Figure: (a) a target of effective cross-sectional area σ; (b) log-log plot of σ in barns (1 to 10000) against E in eV (0.1 to 100).]

Fig. 9.8 (a) Illustration of cross-section, (b) resonance scattering
of neutrons by Rh, using log-log scale.

Compound Nucleus
A very useful concept in the understanding of nuclear reactions is that of
compound-nucleus formation proposed by Bohr (1936). It was suggested that
for incident energies less than about 50 MeV, a nuclear reaction proceeds in
two stages. In the first stage, the projectile a is captured by the nucleus, forming
a compound nucleus C* which is usually an excited state of a nucleus. Such a
compound nucleus has a lifetime of about $10^{-18}$ s, considerably longer than the
time taken by the projectile to cross the nucleus ($t_{\mathrm{cross}} \sim R/v \approx 10^{-21}$ s). During
this time, the compound nucleus will have ‘forgotten’ the way it was formed. In
the second stage, this compound nucleus will decay independently of the mode
of its formation. The two stage process may be indicated by
X + a → C* → Y + b (9.98)
The corresponding cross-section is factorizable, i.e.
σ [X (a,b) Y] = F (X, a) G (Y, b) (9.99)
where F and G depend on the energy, spin, etc. of X, a and Y, b respectively. An
important result which follows immediately is that the ratio of the reaction
rates at a given energy, is independent of the initial state, i.e.
$\frac{\sigma[X(a,b)Y]}{\sigma[X(a,b')Y']} = \frac{G(Y,b)}{G(Y',b')}$ (9.100)
An example of the reactions proceeding via a compound nucleus is given
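by the following processes:
$$\left.\begin{array}{l} p + {}^{26}\mathrm{Mg} \\ d + {}^{25}\mathrm{Mg} \\ \alpha + {}^{23}\mathrm{Na} \end{array}\right\} \to {}^{27}\mathrm{Al}^{*} \to \left\{\begin{array}{l} {}^{27}\mathrm{Al} + \gamma \\ {}^{23}\mathrm{Na} + \alpha \\ {}^{26}\mathrm{Al} + n \\ {}^{26}\mathrm{Mg} + p \\ {}^{25}\mathrm{Mg} + n + p \end{array}\right.$$ (9.101)
In these reactions, the relative cross-sections for the five final states are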
independent of the initial state, but depend on the energy of the system.

Direct Processes
Reactions produced by fast projectiles proceed without the formation of a
compound nucleus. They are known as direct processes. As a typical example,
consider a deuteron incident on a nucleus, say 50V. When it approaches the
target, one of the nucleons may be captured by the nucleus while the other may
continue essentially undisturbed along its path (it should be remembered that
the deuteron is a loosely bound object). The net reaction is represented by
$^{50}\mathrm{V} + d \to {}^{51}\mathrm{V} + p$ (9.102)
$^{50}\mathrm{V} + d \to {}^{51}\mathrm{Cr} + n$ (9.103)

Such reactions are called stripping reactions and are observed for other
projectiles, such as α particles, as well. Conversely, a fast-moving neutron (or
proton) may collect one or more nucleons and move off as a deuteron or an
α particle, e.g.
$^{27}\mathrm{Al} + n \to {}^{24}\mathrm{Na} + \alpha$ (9.104)
Such processes are known as pick-up reactions.

Applications
Nuclear reactions provide a powerful tool for the investigation of nuclear
structure and properties. From the conservation of energy, the masses of the
nuclei can be deduced, while conservation of angular momentum and the angular
distribution of particles help us to determine their angular momenta. Furthermore,
the cross-sections give important information on the interaction between nuclei.
Nuclear reactions allow us to produce trans-uranic elements as well as
elementary particles. For example, plutonium (Z = 94) is produced by
bombarding uranium with 40 MeV α particles
$^{238}\mathrm{U} + \alpha \to {}^{241}\mathrm{Pu} + n$ (9.105)
Similarly, other elements such as berkelium (Z = 97), californium (Z = 98),
einsteinium (Z = 99), fermium (Z = 100), nobelium (Z = 102), etc. have been
produced from nuclear reactions, the heaviest of them being hahnium $^{262}_{105}\mathrm{Ha}$.
Many elementary particles also have been produced and investigated in nuclear
reactions. Two examples are:
$p + p \to p + n + \pi^{+}$ (9.106)
$p + p \to p + \Sigma^{+} + K^{0}$ (9.107)
π+ and K° being the positively charged pi-meson and the neutral K-meson,
respectively, and Σ+ the positively charged sigma baryon (see Sec. 10.1).
From the point of view of practical applications, products of nuclear reactions
are of considerable utility in industry, medicine, agriculture, etc.
Their uses may be grouped in the following broad categories:
1. Radioactive isotopes, which have the same chemical properties as a given
element but decay with the emission of photons, are of general use as
tracers. These isotopes are easily detected by their characteristic radiation
and half-life. For example, the phosphorus isotope 32P (decays via beta
decay with half-life of 14.2 days) may be used to determine the proper
application of fertilizers to plants. Radioactive tracers are also used for
analysing blood circulation, flow rates of fluids, etc.
2. Neutron activation analysis is used for detecting small amounts of
impurities. The impurities are activated by exposing them to neutron
beams. The radioactive isotope formed as a result is estimated by its
characteristic radiation. For example, traces of cobalt may be detected


by observing the radiation from cobalt isotope 60Co formed on absorption
of neutrons.
3. In place of x-rays, the more energetic gamma rays from radioactive
isotopes may be used to detect flaws in metals, in medicine for the
treatment of some diseases such as cancer, for preservation of food
materials, for crop mutations in agriculture and in various industries.
The commonly used γ rays are from the radioactive 60Co.
A particularly interesting application is radioactive 14C dating based on a
nuclear reaction taking place in the atmosphere. Neutrons produced in the
atmosphere by cosmic rays, collide with nitrogen nuclei, producing 14C and
protons
$^{14}\mathrm{N} + n \to {}^{14}\mathrm{C} + p$ (9.108)
14C is radioactive and decays by β-emission,
$^{14}\mathrm{C} \to {}^{14}\mathrm{N} + e^{-} + \bar{\nu}$ (9.109)
with a half-life of 5730 years. This radioactive carbon is assimilated by plants
in photosynthesis, so that living organisms contain a small fraction of 14C. When
the organism dies, the intake of 14C stops and, because of the decay of 14C,
its concentration relative to 12C begins to decrease. Hence, a measurement of
the concentration of 14C in the remains of the organism (e.g. wood, bones, etc.),
gives the date when it died. This is known as radioactive dating and permits us
to determine the age of organic relics which may be thousands of years old.
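
The dating procedure amounts to inverting the exponential decay law. A minimal Python sketch follows, assuming that the 14C to 12C ratio in living matter has stayed constant over the period concerned; the function name and the sample fractions are illustrative.

```python
import math

# Radiocarbon age from the surviving fraction of 14C.
HALF_LIFE_14C_YEARS = 5730.0   # from the text


def radiocarbon_age_years(fraction_remaining):
    """Time since death, given N(t)/N(0) for 14C relative to a living sample."""
    if not 0.0 < fraction_remaining <= 1.0:
        raise ValueError("fraction_remaining must lie in (0, 1]")
    mean_life = HALF_LIFE_14C_YEARS / math.log(2)
    return mean_life * math.log(1.0 / fraction_remaining)


age_half = radiocarbon_age_years(0.5)      # one half-life
age_quarter = radiocarbon_age_years(0.25)  # two half-lives

for fraction in (0.5, 0.25, 0.1):
    print(f"{fraction:4.2f} of the 14C left -> {radiocarbon_age_years(fraction):7.0f} years")
```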
Two important applications of fission and fusion reactions, which lead to
the extraction of energy from nuclear processes, are considered in the next two
sections.

9.7 FISSION REACTORS


It was noted in Sec. 9.5, that it is energetically favourable [see Eq. (9.80)] for a
heavy nucleus to break into two nearly equal parts. Qualitatively, this is due to
the fact that the binding energy per nucleon is about 7.6 MeV for nuclei with
A ~ 230, whereas it is about 8.5 MeV for nuclei with A ~ 115. This means that
the fission of a nucleus with A ~ 230 into two equal parts would release an
energy of about 0.9 × 230 ≈ 207 MeV. However, the fission products must
pass through intermediate states of high energy provided by the Coulomb barrier,
similar to the situation in Fig. 9.6(b). The fission can take place by tunnelling
through the barrier, but with a rather long lifetime (for 238U it is about $10^{16}$
years).
The fission of a nucleus can be induced by bombarding the nucleus with
neutrons. Consider, for example, the capture of a slow neutron by a nucleus of
235U. This leads to the formation of the compound nucleus 236U in which the
small kinetic energy and the binding energy of the neutron are released. The
nucleus therefore is in an excited state and can decay by fission, generally into
parts with mass numbers of the order of 95 and 140. The neutron numbers in
these fragments are close to the magic numbers 50 and 82 which lead to stable
nuclear structures.
Since the heavy nuclei have an excess of neutrons, the fission process is
usually accompanied by the emission of neutrons. There is a further reduction
of the neutrons in the nuclei, due to either β-decay or emission of delayed
neutrons. A typical process would be
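$^{235}\mathrm{U} + n \to {}^{236}\mathrm{U}^{*} \to {}^{137}\mathrm{I} + {}^{97}\mathrm{Y} + 2n$ (9.110)
followed by
$$\begin{aligned} {}^{97}\mathrm{Y} &\to {}^{97}\mathrm{Zr} + e^{-} + \bar{\nu} \\ {}^{97}\mathrm{Zr} &\xrightarrow{17\ \mathrm{h}} {}^{97}\mathrm{Nb} + e^{-} + \bar{\nu} \\ {}^{97}\mathrm{Nb} &\xrightarrow{74\ \mathrm{min}} {}^{97}\mathrm{Mo} + e^{-} + \bar{\nu} \end{aligned}$$ (9.111)
$$\begin{aligned} {}^{137}\mathrm{I} &\to {}^{137}\mathrm{Xe} + e^{-} + \bar{\nu} \\ {}^{137}\mathrm{Xe} &\xrightarrow{22\ \mathrm{s}} {}^{136}\mathrm{Xe} + n \end{aligned}$$ (9.112)
137I emits a neutron after its β-decay into 137Xe, and since β decays proceed
slowly, there is a large time delay between the fission and the emission of this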
neutron. Such neutrons are called delayed neutrons.
The most important feature of neutron-induced fission is that the fission
itself provides additional neutrons which can produce additional reactions. Thus,
the process can build into a self-sustaining, possibly a growing, chain reaction
which is the basis of fission reactors. In the following, the main elements of a
fission reactor are discussed briefly.

Reactor Fuel
The essential process in a reactor is the fission of nuclei, accompanied by
neutrons and a release of energy. From a practical point of view, it would be
required that sufficient quantities of these nuclei, which form the reactor fuel,
occur naturally or can be produced. Five such nuclei are 235U, 239Pu, 233U which
contain odd number of neutrons, and 238U and 232Th which contain even number
of neutrons. The first three of these can undergo fission with the capture of a
thermal neutron, whereas the last two undergo fission mainly through the capture
of fast neutrons (the captured neutron in these two cases would be an odd neutron
and hence would release less energy). It is, however, important to note that the
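capture of neutrons by 238U and 232Th leads to fissionable fuel:
$$\begin{aligned} {}^{238}\mathrm{U} + n &\to {}^{239}\mathrm{U} + \gamma \\ {}^{239}\mathrm{U} &\xrightarrow{23\ \mathrm{min}} {}^{239}\mathrm{Np} + e^{-} + \bar{\nu} \\ {}^{239}\mathrm{Np} &\xrightarrow{2.3\ \mathrm{days}} {}^{239}\mathrm{Pu} + e^{-} + \bar{\nu} \end{aligned}$$ (9.113)
$$\begin{aligned} {}^{232}\mathrm{Th} + n &\to {}^{233}\mathrm{Th} + \gamma \\ {}^{233}\mathrm{Th} &\xrightarrow{24\ \mathrm{min}} {}^{233}\mathrm{Pa} + e^{-} + \bar{\nu} \\ {}^{233}\mathrm{Pa} &\xrightarrow{27\ \mathrm{days}} {}^{233}\mathrm{U} + e^{-} + \bar{\nu} \end{aligned}$$ (9.114)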
In most cases, the fuel is in the form of rods or plates which are placed in a
regular array within a moderator which serves the purpose of slowing down the
neutrons to thermal energies, i.e. energies of the order of 0.1 eV. The fission
reaction is triggered either by secondary cosmic ray neutrons i.e. neutrons
produced by the cosmic rays, or neutrons from a small neutron source (usually
containing a source of α particles which react with beryllium to produce
neutrons). The neutrons emitted, if suitably controlled, can then produce chain
reactions.
As a specific example, the source may be 235U which occurs in nature
(0.7% 235U along with 99.3%238U). It may be used in the natural form or after
concentration. One of the many known reactions produced was indicated in
Eq. (9.110). The fission of 235U produces, on the average 2.5 neutrons per nucleus
of which about 0.7% are delayed neutrons which play an important role in the
control of reactor rates. The energy released in each fission is about 200 MeV
which is distributed among the main fission fragments (about 165 MeV),
neutrons (about 5 MeV), electrons and photons (about 20 MeV), and neutrinos
(about 10 MeV).

Neutron Economy
In order that the fission process be self-sustaining, the neutrons produced in the
fission reactions should not all be lost.
Neutrons may be lost by being captured by 238U. The resulting 239U does
not lead to fission, but decays by emitting a photon. The capture cross-section
for 238U decreases to small values, about 3 barns, for thermal neutrons (note that
the cross-section goes through a large resonant value, $2.3 \times 10^{4}$ barns at 7 eV).
The capture cross-section for 235U, on the other hand, increases as $1/E^{1/2}$ for
small energies [see Eq. (9.97)], and has a value of about 580 barns for thermal
neutrons. Thus, the fission-effectiveness of neutrons is increased by the
thermalization of neutrons, achieved by the moderators surrounding the fuel.
Successive scattering of neutrons by the moderator transfers the neutron energy
to the moderator and hence slows down the neutrons.
Some neutrons may be lost from the surface of the active zone. In this
connection, it is noted that the fission-effectiveness of the neutrons is proportional
to the volume (i.e. $l^3$) whereas the surface losses are proportional to the surface
area (i.e. $l^2$). Hence, the relative surface losses can be decreased by increasing
the active volume, and the size at which the chain reaction is just self-sustaining
is known as the critical volume, and the corresponding mass of the active material
is known as the critical mass.
In addition, there are other sources of neutron losses, such as absorption by
the impurity in the moderator. These losses must be carefully controlled, for
example, by using moderators which are nearly free of impurities. The essential
requirement of continued chain reactions is that the number of fission neutrons
remaining after taking into account all the losses, must be greater than the initial
neutrons which induced the fission.

Moderators
The role of a moderator is to slow down the neutrons without absorbing them.
Elementary considerations show that maximum energy is transferred to the
target if the target mass is equal to the projectile mass.
The ideal moderator would have been hydrogen. Unfortunately, hydrogen
can capture a neutron according to the reaction p (n, γ) d. More suitable
moderators are the deuteron d (nucleus of deuterium), graphite (C) or beryllium
(Be). About 25 collisions are adequate to thermalize 2 MeV neutrons in heavy
water (D2O), and about 100 collisions in C or Be.
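
Both statements in this subsection can be illustrated with a short Python sketch. The maximum fractional energy transfer in a head-on elastic collision, $4A/(1+A)^2$ for a target of mass number A, is the elementary result the text alludes to. The collision count uses the standard mean logarithmic energy decrement per collision, which is an assumption brought in here and not derived in the text, as is the thermal energy of 0.025 eV; the function names are illustrative.

```python
import math


def max_fraction_transferred(a):
    """Largest fraction of neutron energy lost in one elastic collision with mass number a."""
    return 4 * a / (1 + a) ** 2


def mean_log_decrement(a):
    """Average decrease of ln(E) per elastic collision (standard result, assumed here)."""
    if a == 1:
        return 1.0
    return 1 + (a - 1) ** 2 / (2 * a) * math.log((a - 1) / (a + 1))


def collisions_to_thermalize(a, e_start_ev=2.0e6, e_thermal_ev=0.025):
    """Approximate number of collisions needed to slow a neutron to thermal energy."""
    return math.log(e_start_ev / e_thermal_ev) / mean_log_decrement(a)


collisions = {
    "deuterium": collisions_to_thermalize(2),
    "beryllium": collisions_to_thermalize(9),
    "carbon": collisions_to_thermalize(12),
}

for name, count in collisions.items():
    print(f"{name:9s}: about {count:.0f} collisions")
```

Under these assumptions deuterium needs roughly 25 collisions and beryllium or carbon roughly 100, in line with the figures quoted above. The energy transfer is complete in a single head-on collision only for A = 1, which is why hydrogen would be the ideal moderator were it not for neutron capture.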

Control Rods
The number of fission neutrons available for the controlled chain reaction, after
taking into account the various losses, must be slightly greater than the neutrons
which caused the initial fission. To produce sustained, stable chain reactions,
the excess neutrons must be removed and controlled. This is usually done by
inserting what are known as control rods into the core of the reactor. These rods
are made of an element with a large cross-section for neutron capture.
The element often used in control rods is cadmium which has a very large
capture cross-section for thermal neutrons, 113Cd (n, γ) 114Cd. The insertion of
these rods decreases the reactivity of the reactor whereas withdrawal increases
the reactivity. It is important to observe that the response of the chain reaction
to fluctuations in neutrons, is slow, because of the delayed neutrons produced in
fission (see Example 6, Sec. 9.9). This permits the use of the control mechanism
with a time delay of about 1 min.

Coolant
The heat generated in the active region of the reactor is carried away by a heat-
carrying agent, usually water or an alkali metal such as sodium (the agent should
have a large thermal capacity). This agent or the coolant gives this energy to
water (see Fig. 9.9) transforming it into steam which operates the steam turbines.

Breeder Reactors
So far, mainly moderator-operated reactors based on thermal neutrons have been
discussed. It is possible to run a reactor on fast neutrons by using a fuel with
enriched 235U or 239Pu. The fast neutrons released in the fission are used to
transform 238U into 239Pu (or 232Th into 233U) which can undergo fission by
capturing thermal neutrons. The core of such a reactor will contain two materials,
say, fissionable fuel 239Pu and the potential fuel 238U. If the conditions are such
as to produce more fuel than the amount burnt, the reactor is known as breeder
reactor (it breeds fissionable fuel).

Uncontrolled Chain Reactions


If two or more pieces of almost pure 235U (or 239Pu), each of sub-critical mass,
are brought together, the total mass may be over critical. In such a case, the
neutrons rapidly multiply and the energy release will be explosive, as in the
case of an atomic bomb.
[Figure: reactor core with control rods; a pump circulates the coolant, which heats incoming water into steam for the steam turbine.]

Fig. 9.9 Schematic diagram of a power reactor.


In an atomic bomb, the chain reaction can be triggered by secondary cosmic
ray neutrons. Here, the pieces must be kept together under pressure so that the
chain reaction can build up to explosive proportions. This is done by using
ordinary explosives to shoot one piece into another. Nevertheless, only a part of
the fuel has time to react before the final explosion.
Nuclear reactors are used for producing power, producing fissionable
material, and obtaining strong neutron sources. The neutron sources may be
used for conducting scientific experiments and for producing radioactive isotopes
which are of enormous use in medicine, and in industry.

9.8 THERMONUCLEAR FUSION


It may be observed (Fig. 9.1) that the binding energy per nucleon is small for
light nuclei, and increases to a maximum value for A ≈ 60. It is therefore
energetically preferable for lighter nuclei to fuse into larger nuclei. Such a process
would be accompanied by a release of energy, e.g. in the fusion of deuterium
and tritium,
$^{2}\mathrm{H} + {}^{2}\mathrm{H} \to p + {}^{3}\mathrm{H}$ (9.115)
$^{2}\mathrm{H} + {}^{3}\mathrm{H} \to n + {}^{4}\mathrm{He}$ (9.116)
energies of 4.0 MeV and 17.6 MeV respectively, are released.
For a fusion reaction to take place, the lighter nuclei must overcome
Coulomb repulsion between them [see Fig. 9.6(b)], i.e. they must have an energy
$E_c \approx \frac{Z_1 Z_2 e^2}{4\pi\varepsilon_0\,(r_1 + r_2)}$ (9.117)
which for Z1 = Z2 = 1 and r1 + r2 = 2 fm has a value of about 0.7 MeV. Thus, each
of the two nuclei must have an energy of about 0.35 MeV. A temperature of
about $2 \times 10^{9}$ K would provide an average thermal energy of about 0.3 MeV,
and hence promote the fusion reaction. However, the fusion reaction can proceed
even at lower temperatures. This is due to the fact that (i) the energies of the
nuclei at a given temperature have Maxwell-Boltzmann distribution so that
there are always some nuclei with enough energy to overcome the Coulomb
barrier, (ii) nuclei can tunnel through the potential barrier. Therefore, an
appreciable amount of fusion takes place at temperatures of about $10^{7}$ K. Since
the reaction is induced by high temperatures, it is known as thermonuclear
fusion.
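
The orders of magnitude in this argument can be checked with a brief Python sketch. It evaluates Eq. (9.117) and compares it with the mean thermal energy, taken here as (3/2)kT. The constants 1.44 MeV fm for $e^2/4\pi\varepsilon_0$ and $8.617 \times 10^{-11}$ MeV/K for the Boltzmann constant are standard values that the text does not quote, and the function names are illustrative.

```python
# Coulomb barrier for hydrogen isotopes versus mean thermal energy.
E2_OVER_4PI_EPS0 = 1.44        # MeV fm; standard value, assumed here
K_BOLTZMANN_MEV_PER_K = 8.617e-11   # standard value, assumed here


def coulomb_energy_mev(z1, z2, separation_fm):
    """Coulomb energy of two point charges at the given separation, Eq. (9.117)."""
    return E2_OVER_4PI_EPS0 * z1 * z2 / separation_fm


def mean_thermal_energy_mev(temperature_k):
    """Average kinetic energy (3/2) k T of a particle in thermal equilibrium."""
    return 1.5 * K_BOLTZMANN_MEV_PER_K * temperature_k


barrier = coulomb_energy_mev(1, 1, 2.0)
thermal_hot = mean_thermal_energy_mev(2.0e9)
thermal_stellar = mean_thermal_energy_mev(1.0e7)

print(f"Coulomb barrier           : {barrier:.2f} MeV")
print(f"(3/2)kT at 2e9 K          : {thermal_hot:.2f} MeV")
print(f"(3/2)kT at 1e7 K          : {thermal_stellar * 1e3:.2f} keV")
```

At $10^{7}$ K the mean thermal energy is only about a keV, several hundred times below the barrier, which shows how heavily fusion at such temperatures relies on the high-energy tail of the distribution and on tunnelling.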

Controlled Fusion
To have controlled fusion reactions, it is necessary to maintain nuclei at a
temperature of about $10^{7}$ – $10^{8}$ K in a confined region, so that nuclear reactions
can take place. At such high temperatures, the atoms are ionized into positively
charged ions and electrons, forming what is known as a plasma state. The two
main problems in achieving controlled fusion are the containment of the plasma
within a suitable volume, and the heating of the plasma to the required high
temperatures.
For the confinement of the plasma, one cannot use the walls of any vessel.
Any contact with the wall will not only quickly cool the plasma but also cause
the wall to evaporate. What is usually done is to confine the plasma in a suitable
magnetic field. The nuclei spiral along the magnetic field lines. By a suitable
arrangement of the field, the nuclei are reflected back and forth between bottlenecks
provided by the converging lines (the lines tend to converge in regions
where the magnetic field is stronger). Such an arrangement is called a mirror
machine. Alternatively, the plasma may be confined in a toroidal region formed
by a solenoid bent in the form of a torus. In this case, the nuclei spiral along the
closed field lines inside the torus. However, there are as yet serious difficulties
in controlling the instabilities of confinement over appreciable time periods.
There are two important methods of heating a plasma. In one method, fast
neutral atoms are injected into the magnetically-confined system and are ionized
by collisions with the plasma. The energetic ions are now trapped by the magnetic
field for long enough to transfer their energy to the plasma by collisions. For
example, a plasma of H+ may be heated by a beam of energetic H, or a plasma
of D+ (nucleus of deuterium) and T+ (nucleus of tritium) by a beam of energetic
D (Deuterium). The beam energies are generally of the order of a few tens of
keV to several hundreds of keV. The energetic beams are usually produced by
accelerating low energy ions in an electrostatic field and then passing the ions
through a target gas where the ions capture electrons and are neutralized. The
other method of heating a plasma is by radio-frequency electromagnetic waves.
When the waves are incident on a plasma, under suitable conditions, their energy
is converted into ordered particle energy which is then thermalized by collisions.
An alternative approach to controlled fusion is through what is known as
inertial confinement. Here, the fusion fuel, e.g. mixture of deuterium and tritium,
in the form of a pellet, is imploded from all sides by energy sources such as
laser beams, high energy electron or ion beams. The intense compression
pressures and the high temperatures produced in the pellet may produce
conditions conducive to fusion (it is the particle inertia which provides the
basis for confinement over the required period and hence the term inertial
confinement). The difficulties in this approach are the low efficiencies of laser
or other sources, and the need to produce stable symmetrical implosion.
For controlled fusion to be a meaningful source of energy, the output energy
must be more than the input energy. There are several technical difficulties
which remain in achieving the break-even point, such as instabilities in
confinement, inefficient heating, etc. As such, controlled fusion has not yet
been realized. When realized, it will provide a virtually inexhaustible source
of energy. Deuterium, which is suitable for a fusion reaction (ordinary hydrogen
has a very small cross-section for fusion, and hence is not suitable), is readily
available, 0.03% by mass of hydrogen in water being in the form of deuterium.
Furthermore, the fusion reactions have important advantages over other sources
of energy, in that they have hardly any radioactivity problem, produce negligible
pollution, and their sources are widely distributed.

Uncontrolled Fusion
Uncontrolled fusion can be achieved by using an atom bomb whose explosion
produces temperatures of the order of $10^{7}$ K. For example, such an atom bomb
can ignite a fuel of deuterium and tritium, leading to the fusion reaction in
Eq. (9.116). This is the source of energy in what is known as the hydrogen or
thermonuclear bomb.
Fusion reactions are the source of energy in the sun and the stars, inside
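which temperatures are of the order of $10^{7}$ to $10^{8}$ K. The energy there is produced
in two ways. In the proton-proton cycle which is dominant at lower temperatures
(T ~ $10^{7}$ K), the fusion of hydrogen takes place in the following steps:
\[
\begin{aligned}
{}^{1}\mathrm{H} + {}^{1}\mathrm{H} &\rightarrow {}^{2}\mathrm{H} + e^{+} + \nu \\
{}^{2}\mathrm{H} + {}^{1}\mathrm{H} &\rightarrow {}^{3}\mathrm{He} + \gamma \\
{}^{3}\mathrm{He} + {}^{3}\mathrm{He} &\rightarrow {}^{4}\mathrm{He} + 2\,{}^{1}\mathrm{H}
\end{aligned}
\tag{9.118}
\]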

The net release of energy in this sequence is about 25 MeV. Alternatively,
the fusion may take place through the carbon cycle, which becomes dominant at
higher temperatures:
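\[
\begin{aligned}
{}^{1}\mathrm{H} + {}^{12}\mathrm{C} &\rightarrow {}^{13}\mathrm{N} + \gamma \\
{}^{13}\mathrm{N} &\rightarrow {}^{13}\mathrm{C} + e^{+} + \nu \\
{}^{1}\mathrm{H} + {}^{13}\mathrm{C} &\rightarrow {}^{14}\mathrm{N} + \gamma \\
{}^{1}\mathrm{H} + {}^{14}\mathrm{N} &\rightarrow {}^{15}\mathrm{O} + \gamma \\
{}^{15}\mathrm{O} &\rightarrow {}^{15}\mathrm{N} + e^{+} + \nu \\
{}^{1}\mathrm{H} + {}^{15}\mathrm{N} &\rightarrow {}^{12}\mathrm{C} + {}^{4}\mathrm{He}
\end{aligned}
\tag{9.119}
\]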
The net result is the fusion of four hydrogen nuclei into one helium
nucleus, with 12C serving only as a catalyst.
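The energy released when four hydrogen nuclei are converted into one helium nucleus can be estimated from the mass difference. The following is a minimal sketch; the atomic masses and the conversion factor are standard tabulated values supplied here as assumptions, not taken from the text. Because atomic masses are used, the result includes the energy from the annihilation of the two positrons, and a small part of it is carried away by the neutrinos, so it should be read as an upper estimate of the same size as the figure quoted above.

```python
# Energy released in 4 (1H) -> 4He, estimated from the mass difference.
# Assumed inputs (standard tabulated values, not taken from the text):
U_TO_MEV = 931.494      # rest energy of one atomic mass unit, in MeV
MASS_H1_U = 1.007825    # atomic mass of 1H, in u
MASS_HE4_U = 4.002602   # atomic mass of 4He, in u


def fusion_energy_mev(n_hydrogen=4):
    """Mass difference between n hydrogen atoms and one 4He atom, in MeV."""
    mass_defect_u = n_hydrogen * MASS_H1_U - MASS_HE4_U
    return mass_defect_u * U_TO_MEV


q_value = fusion_energy_mev()
print(f"Energy released per helium nucleus formed: {q_value:.1f} MeV")
```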
It may be noted that the Coulomb potential barrier (Eq. 9.117) is larger for
nuclei with higher Z values so that it is more difficult for the fusion of heavier
nuclei to take place. However, when the temperatures of stellar interiors rise,
fusion of heavier nuclei begins to take place. In particular, there is helium burning,
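\[
\begin{aligned}
{}^{4}\mathrm{He} + {}^{4}\mathrm{He} &\rightarrow {}^{8}\mathrm{Be} \\
{}^{4}\mathrm{He} + {}^{8}\mathrm{Be} &\rightarrow {}^{12}\mathrm{C} + \gamma
\end{aligned}
\tag{9.120}
\]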
producing carbon. At higher densities and temperatures, fusion of heavier
elements also takes place, ultimately leading to elements in the iron mass region
(A = 56) where the binding energy per nucleon has a maximum. Here, exothermic
fusion reactions cease. Elements heavier than 56Fe may be produced by capture
of neutrons produced in some reactions, and subsequent β-decays. Thus,
elements up to and just beyond uranium are produced. Still heavier elements
have short lifetimes and if created would quickly decay either by α-emission or
fission.

9.9 EXAMPLES
Some examples to illustrate the properties and interactions of the nucleus are
considered here.

Example 1
In the early stages of the development of nuclear physics before the neutron
was discovered, one of the models of the nucleus considered was that it was
made up of A protons and (A – Z) electrons. There are several arguments against
this model.
An electron confined to a volume of nuclear dimensions would be highly
relativistic, and its energy would be estimated by the uncertainty principle to be
\[
T \approx pc \approx \frac{\hbar c}{\Delta x} \approx 100\ \text{MeV for } \Delta x \approx 2\ \text{fm}
\tag{9.121}
\]
The confinement of such energetic electrons would require the existence of
very strong forces for which there is no evidence (such potentials would also
create many electron-positron pairs which are not observed). Conclusive
evidence against the proton-electron model of the nucleus is that even-A, odd-Z
nuclei have integral spin. In the proton-electron picture, such a nucleus would
have A protons and A-Z electrons, i.e. the nucleus has an odd number of fermions,
and hence would be expected to have a half-integral spin. This is contrary to the
experimental observations, e.g. 14N has I = 1. Finally, the proton-electron picture
would imply the existence of nuclear magnetic moments of the order of $e\hbar/2m_e$,
whereas the observed moments are much smaller, of the order of $e\hbar/2m_p$.
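The two numerical arguments of this example can be checked with a few lines of arithmetic. The sketch below assumes standard rounded values for ħc and for the electron and proton rest energies, none of which are quoted in the text; the function and variable names are illustrative.

```python
# Order-of-magnitude checks for the proton-electron model of the nucleus.
# Assumed constants (standard rounded values, not taken from the text):
HBAR_C_MEV_FM = 197.327   # hbar * c in MeV fm
M_E_MEV = 0.511           # electron rest energy in MeV
M_P_MEV = 938.272         # proton rest energy in MeV


def confinement_energy_mev(delta_x_fm):
    """Relativistic estimate T ~ pc ~ hbar c / delta_x, as in Eq. (9.121)."""
    return HBAR_C_MEV_FM / delta_x_fm


energy = confinement_energy_mev(2.0)

# Ratio of the electron magneton (e hbar / 2 m_e) to the nuclear
# magneton (e hbar / 2 m_p); it equals the mass ratio m_p / m_e.
magneton_ratio = M_P_MEV / M_E_MEV

print(f"Kinetic energy of an electron confined to 2 fm: about {energy:.0f} MeV")
print(f"Electron magneton / nuclear magneton: about {magneton_ratio:.0f}")
```

An electron inside the nucleus would therefore be expected to give moments larger than the observed ones by three orders of magnitude.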

Example 2
One can estimate the strength of the deuteron potential, by assuming the potential
to be a square well of depth V0 and radius a.
The radial Schrödinger equation for the ground state with l = 0, is
\[
-\frac{\hbar^{2}}{2m}\left(\frac{1}{r^{2}}\frac{\partial}{\partial r}\, r^{2}\frac{\partial \psi}{\partial r}\right) = (E - V)\psi
\tag{9.122}
\]
where m is the reduced mass ($m \approx \tfrac{1}{2} m_p$), V = 0 for r > a and V = – V0 for r ≤ a.
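It is easy to show that
\[
\psi = \frac{1}{r}
\begin{cases}
A_{1} e^{-kr} & \text{for } r > a \\
A_{2} \sin \alpha r & \text{for } r < a
\end{cases}
\tag{9.123}
\]
where
\[
k = \left(\frac{2m|E|}{\hbar^{2}}\right)^{1/2}, \qquad
\alpha = \left(\frac{2m(V_{0} + E)}{\hbar^{2}}\right)^{1/2}
\]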
Continuity of the wave function and the derivative of the wave function, at
r = a, give the condition
α cot αa = – k (9.124)
Since | E | is known to be small compared to V0, one has an approximate
relation
αa ≈ π/2 (9.125)
which implies V0 ≈ 23 MeV for a ≈ 2 fm. A better approximation would be
\[
\alpha a \approx \frac{\pi}{2} + \frac{k}{\alpha}
\tag{9.126}
\]
which yields a value of V0 ≈ 33 MeV for | E | ~ 2 MeV. In any case, it is seen that
the interaction is quite strong but the deuteron is a shallow bound state, i.e.
(| E |/V0) << 1.
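The matching condition α cot αa = – k can also be solved numerically instead of being approximated. The following is a minimal sketch using bisection; the values of ħc, the reduced mass, the binding energy of 2.2 MeV and the search interval are assumptions of the sketch. The results are of the same size as the rounded estimates quoted above, though not identical to them, since they depend on the constants and on the exact value taken for a.

```python
import math

# Assumed inputs (standard rounded values, not taken from the text):
HBAR_C = 197.327             # hbar * c in MeV fm
REDUCED_MASS = 938.9 / 2.0   # reduced mass times c^2 in MeV, about m_p / 2


def first_guess_depth(a_fm):
    """Depth from alpha * a = pi / 2 with |E| neglected."""
    return (math.pi * HBAR_C) ** 2 / (8.0 * REDUCED_MASS * a_fm ** 2)


def matching_residual(v0, a_fm, binding):
    """alpha cos(alpha a) + k sin(alpha a); zero when alpha cot(alpha a) = -k."""
    k = math.sqrt(2.0 * REDUCED_MASS * binding) / HBAR_C
    alpha = math.sqrt(2.0 * REDUCED_MASS * (v0 - binding)) / HBAR_C
    return alpha * math.cos(alpha * a_fm) + k * math.sin(alpha * a_fm)


def solve_depth(a_fm=2.0, binding=2.2, lo=5.0, hi=60.0, steps=60):
    """Bisection for the well depth V0 (MeV) that gives the stated binding."""
    f_lo = matching_residual(lo, a_fm, binding)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = matching_residual(mid, a_fm, binding)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


v0_rough = first_guess_depth(2.0)
v0_exact = solve_depth()
print(f"V0 from alpha a = pi/2:        {v0_rough:.1f} MeV")
print(f"V0 from the matching condition: {v0_exact:.1f} MeV")
```

The residual is written with sine and cosine rather than the cotangent so that it stays finite across the whole search interval.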

Example 3
As an application of the shell model, the spin and magnetic moments of 17O and
127I are considered.
The odd nucleon in 17O is the ninth neutron which is in the d5/2 state.
Therefore, 17O has j = 5/2 and its magnetic moment [see Eq. (9.38)] is expected
to be about $-1.91\,(e\hbar/2m_p)$. Experimentally j is found to be 5/2 and the magnetic
moment to be $-1.89\,(e\hbar/2m_p)$.
For 127I, the odd nucleon is the 53rd proton. Shell model would predict that
it is in the g7/2 state. However, one finds j = 5/2 for the nucleus. Assuming that
it is in the d5/2 state (see Fig. 9.3), shell model predicts µ ≈ $4.79\,(e\hbar/2m_p)$ whereas
the experimental value is $2.81\,(e\hbar/2m_p)$. The two examples illustrate the usefulness
and limitations of the shell model.
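The predicted moments quoted in this example can be reproduced from the usual single-particle (Schmidt) expressions. The sketch below assumes that Eq. (9.38), which is not reproduced in this section, has this standard form, and it uses the free-nucleon g-factors as inputs; both are assumptions of the sketch.

```python
# Single-particle (Schmidt) magnetic moments in units of e*hbar/(2 m_p).
# Assumed inputs: free-nucleon g-factors (standard values).
G_SPIN = {"proton": 5.586, "neutron": -3.826}
G_ORBITAL = {"proton": 1.0, "neutron": 0.0}


def schmidt_moment(nucleon, l, j):
    """Moment of a single odd nucleon with orbital l and total j = l +/- 1/2."""
    g_s = G_SPIN[nucleon]
    g_l = G_ORBITAL[nucleon]
    if abs(j - (l + 0.5)) < 1e-9:
        return (j - 0.5) * g_l + 0.5 * g_s
    if abs(j - (l - 0.5)) < 1e-9:
        return j / (j + 1.0) * ((j + 1.5) * g_l - 0.5 * g_s)
    raise ValueError("j must equal l + 1/2 or l - 1/2")


mu_o17 = schmidt_moment("neutron", l=2, j=2.5)    # odd neutron in d5/2
mu_i127 = schmidt_moment("proton", l=2, j=2.5)    # odd proton in d5/2
print(f"17O  (neutron d5/2): {mu_o17:.2f}")
print(f"127I (proton  d5/2): {mu_i127:.2f}")
```

The same function can be applied to the nuclei listed in Problems 6 and 7 once the state of the odd nucleon has been identified.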

Example 4
A mass spectrometer is used for determining the masses of nuclei. It is based on
the principle that a moving particle subjected to mutually perpendicular electric
and magnetic fields, which are also perpendicular to the velocity of the particle,
is undeviated if
qE + qv × B = 0 (9.127)
or v = | E |/ | B |. If such a particle is now subjected to a magnetic field alone, it moves
in a circle of radius
\[
r = \frac{p}{qB}
\tag{9.128}
\]
where p is its momentum. Thus, from the knowledge of v and p, the mass of the
particle can be determined (provided q is known).
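The two steps of the measurement can be written out as a short calculation. The field strengths and the radius below are hypothetical numbers chosen for illustration, and the momentum is taken to be non-relativistic (p = mv), which is an assumption of the sketch.

```python
# Mass determination with a velocity selector followed by magnetic deflection.
ELEMENTARY_CHARGE = 1.602176634e-19   # C
ATOMIC_MASS_UNIT = 1.66053907e-27     # kg


def selected_speed(e_field, b_field):
    """Speed of undeviated particles in crossed fields, v = |E| / |B|."""
    return e_field / b_field


def ion_mass(charge, b_deflect, radius, speed):
    """Mass from r = p / (qB) with p = m v (non-relativistic)."""
    return charge * b_deflect * radius / speed


# Hypothetical settings: E = 1.0e5 V/m and B = 0.5 T in the selector,
# B = 0.5 T in the deflection region, measured radius 8.29 cm.
speed = selected_speed(1.0e5, 0.5)
mass_kg = ion_mass(ELEMENTARY_CHARGE, 0.5, 0.0829, speed)
mass_u = mass_kg / ATOMIC_MASS_UNIT
print(f"speed = {speed:.3g} m/s, mass = {mass_u:.2f} u")
```

With these illustrative settings, a singly charged ion of mass close to 20 u is selected.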

Example 5
It is often the case that only a small amount of the target is exposed to a beam of
particles. The reaction produces an unstable isotope which decays. It is of interest
to know the number of unstable nuclei remaining after an exposure to the beam
for time t.
If the target contains N nuclei, the number of reactions per second is
n = NσF (9.129)
where F is the flux of the beam, i.e. particles/m²/s. The net increase dP in the
isotope population over a period dt is
dP = Nσ F dt – λ P dt (9.130)
where λ is the probability for decay. The solution to this equation is
\[
P(t) = \frac{N \sigma F}{\lambda}\left(1 - e^{-\lambda t}\right)
\tag{9.131}
\]
For example, consider 1 mg of 23Na exposed to a neutron beam of
flux $10^{14}$/cm² s. The cross section for the reaction 23Na(n, γ) 24Na is about 0.56
barns. Since 1/λ ≈ 21.7 h, and N ≈ 2.6 × $10^{19}$, the number of isotope nuclei is
\[
P(t) \approx 1.15 \times 10^{14}\left(1 - e^{-t/21.7}\right)
\tag{9.132}
\]
where t is in hours.
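The numbers in this example can be reproduced directly from the build-up formula. The following sketch uses the values quoted in the text, together with Avogadro's number and the molar mass of sodium (23 g), which are assumed inputs; the function names are illustrative.

```python
import math

AVOGADRO = 6.022e23    # per mole (assumed standard value)
BARN_CM2 = 1.0e-24     # one barn in cm^2


def target_nuclei(mass_g, molar_mass_g):
    """Number of target nuclei N in a sample of the given mass."""
    return AVOGADRO * mass_g / molar_mass_g


def isotope_population(t_hours, n_target, sigma_cm2, flux_cm2_s, mean_life_h):
    """P(t) = N sigma F (1 - exp(-lambda t)) / lambda, with 1/lambda in hours."""
    production_per_hour = n_target * sigma_cm2 * flux_cm2_s * 3600.0
    return production_per_hour * mean_life_h * (1.0 - math.exp(-t_hours / mean_life_h))


n_target = target_nuclei(1.0e-3, 23.0)   # 1 mg of 23Na
saturation = isotope_population(1.0e6, n_target, 0.56 * BARN_CM2, 1.0e14, 21.7)
after_one_day = isotope_population(24.0, n_target, 0.56 * BARN_CM2, 1.0e14, 21.7)
print(f"N = {n_target:.2e}, saturation population = {saturation:.2e}")
print(f"population after 24 h = {after_one_day:.2e}")
```

The saturation value, reached for exposure times long compared with 1/λ, is the coefficient appearing in Eq. (9.132).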

Example 6
The determination of the age of a sample by 14C dating is illustrated here.
Let M grams of a sample of organic carbon decay at the rate of r(t) per hour.
Then the number n(t) of 14C atoms is given by
\[
r(t) = \frac{1}{\tau}\, n(t)
\tag{9.133}
\]
where the lifetime τ is about 7.242 × $10^{7}$ h (half-life is 0.6931 times τ). The
fraction of 14C in atmospheric carbon is 1.3 × $10^{-12}$, which implies that at the
beginning there are
\[
n(0) = \left(\frac{6.03 \times 10^{23}}{12}\right) M \times 1.3 \times 10^{-12}
\tag{9.134}
\]
Hence,
\[
\frac{n(t)}{n(0)} = 1.109 \times 10^{-3} \left(\frac{r(t)}{M}\right)
\tag{9.135}
\]
which, with the help of the decay law in Eq. (9.60), leads to
\[
t = \tau \ln\left(902\, \frac{M}{r(t)}\right)
\tag{9.136}
\]
For example, if a sample of 1 g yields 300 decays/h, its age is t ≈ 9100
years.
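The age estimate can be reproduced step by step from the quantities given above. The sketch below uses the values quoted in the text; the conversion of 8766 hours per year and the function names are assumptions of the sketch.

```python
import math

TAU_HOURS = 7.242e7      # mean lifetime of 14C in hours, as quoted in the text
C14_FRACTION = 1.3e-12   # fraction of 14C in atmospheric carbon, as quoted
AVOGADRO = 6.03e23       # value used in the text
HOURS_PER_YEAR = 8766.0  # assumed conversion (365.25 days)


def initial_c14_atoms(mass_g):
    """Number of 14C atoms in M grams of fresh organic carbon."""
    return AVOGADRO / 12.0 * mass_g * C14_FRACTION


def age_in_years(mass_g, decays_per_hour):
    """Age from n(t) = tau * r(t) and n(t) = n(0) exp(-t / tau)."""
    n_now = TAU_HOURS * decays_per_hour
    age_hours = TAU_HOURS * math.log(initial_c14_atoms(mass_g) / n_now)
    return age_hours / HOURS_PER_YEAR


age = age_in_years(mass_g=1.0, decays_per_hour=300.0)
print(f"Estimated age: about {age:.0f} years")
```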

Example 7
The delayed neutrons, though small in number, play an important role in reactor
control.
Let there be N(t) neutrons at time t, and let τ0 ≈ $10^{-2}$ s be the period of the
cycle between two fissions in the chain reaction. In the fission, 99.3% of N(t)
are multiplied by a factor of about 2.5 and the remaining 0.7%, though multiplied
by the same factor, are emitted later, say after time τ1 ≈ 9 s. Under the equilibrium
condition 1.5 N(t) would be lost so that once again N(t) is got back after time τ0.
Suppose now that the equilibrium condition is disturbed and that (1.5 – δ)
N(t) are lost (δ > 0). Then the number of neutrons at time t + τ0 is
\[
N(t + \tau_{0}) = 2.5\,(0.993)\, N(t) + 2.5\,(0.007)\, N(t - \tau_{1}) - (1.5 - \delta)\, N(t)
\tag{9.137}
\]
Using N(t + τ) ≈ N(t) + τN′(t),
\[
(\tau_{0} + 0.0175\, \tau_{1})\, N'(t) = \delta\, N(t)
\tag{9.138}
\]
which on integration yields
\[
N(t) = N(t_{0}) \exp\left[\frac{\delta\,(t - t_{0})}{\tau_{0} + 0.0175\, \tau_{1}}\right]
\tag{9.139}
\]
Since δ is usually of the order of 1% or less and τ1 >> τ0, the delayed neutrons
(characterized by τ1) dominate the time variation and the time-scales of change
are of the order of 0.0175 τ1/δ ≈ 10 s – 1 min.
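The effect of the delayed neutrons on the time-scale can be seen by evaluating the e-folding time of Eq. (9.139) with and without the delay. The following is a minimal sketch of the linearized result only; the function name and the choice δ = 0.01 are illustrative.

```python
def e_folding_time(delta, tau0=1.0e-2, tau1=9.0,
                   delayed_fraction=0.007, multiplication=2.5):
    """Time (s) for N to change by a factor e, from the linearized Eq. (9.139)."""
    return (tau0 + multiplication * delayed_fraction * tau1) / delta


with_delay = e_folding_time(delta=0.01)
# If the same neutrons were emitted promptly, tau1 would effectively be zero.
without_delay = e_folding_time(delta=0.01, tau1=0.0)
print(f"with delayed neutrons:    {with_delay:.1f} s")
print(f"without delayed neutrons: {without_delay:.1f} s")
```

For this value of δ the delayed neutrons lengthen the response time from about a second to well over ten seconds, which is what makes mechanical control of the reactor practicable.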

Example 8
An interesting mechanism of producing fusion is by screening the Coulomb
repulsion between the nuclei while they are being brought together. This can be
done by using a muonic hydrogen atom (bound state of a proton and a muon).
Since the muon is about 200 times heavier than the electron, its Bohr radius is
correspondingly smaller, about 0.25 × $10^{-12}$ m. Thus, the muon effectively screens
the proton charge, allowing it to approach another nucleus with greater ease.
For example, it may fuse with a deuteron to give 3He,
\[
(\mu^{-} p) + {}^{2}\mathrm{H} \rightarrow \mu^{-} + {}^{3}\mathrm{He}
\tag{9.140}
\]
with a release of about 5.5 MeV. Since µ has a short lifetime (τ ≈ 2.2 × $10^{-6}$ s)
and is not easily produced, this does not appear to be an economical way of
producing fusion. However, it is being considered as a triggering mechanism to
start controlled fusion.
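The size of the muonic hydrogen atom follows from scaling the ordinary Bohr radius by the mass ratio. The sketch below assumes standard values for the Bohr radius and the mass ratios, which are not quoted in the text, and shows the effect of using the reduced mass of the muon-proton system instead of the muon mass alone.

```python
# Bohr radius of muonic hydrogen, scaled from the ordinary Bohr radius.
# Assumed inputs (standard values, not taken from the text):
BOHR_RADIUS_M = 5.29177e-11   # ordinary Bohr radius in metres
MUON_MASS = 206.768           # muon mass in units of the electron mass
PROTON_MASS = 1836.153        # proton mass in units of the electron mass


def muonic_bohr_radius(use_reduced_mass=True):
    """Radius scales inversely with the mass of the orbiting particle."""
    if use_reduced_mass:
        mass = MUON_MASS * PROTON_MASS / (MUON_MASS + PROTON_MASS)
    else:
        mass = MUON_MASS
    return BOHR_RADIUS_M / mass


radius_simple = muonic_bohr_radius(use_reduced_mass=False)
radius_reduced = muonic_bohr_radius(use_reduced_mass=True)
print(f"muon mass only: {radius_simple:.2e} m")
print(f"reduced mass:   {radius_reduced:.2e} m")
```

Either way the radius is a few times $10^{-13}$ m, some two hundred times smaller than that of ordinary hydrogen.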

PROBLEMS
1. For a charge q distributed uniformly in a sphere of radius r, show that the
electrostatic energy is $\frac{3}{5}\left(\frac{q^{2}}{4\pi\varepsilon_{0} r}\right)$. For a proton this has a value of about
0.86 MeV if r ≈ 1 fm. It is believed that the small proton-neutron mass
difference is of electromagnetic origin, which depends crucially on the
magnetic properties to give a heavier neutron.
2. Nuclei ${}^{2Z+1}_{Z}X$ and ${}^{2Z+1}_{Z+1}Y$ are examples of mirror nuclei (which are obtained
by n ↔ p). Charge independence of the nuclear forces implies that the
mass difference between these nuclei is electromagnetic in origin. Using
the result of problem 1, obtain an expression for the mass difference of
mirror nuclei. Using r = 1.2 $A^{1/3}$ fm, determine the mass difference between
11B and 11C, 13C and 13N, 35Cl and 35Ar. Compare with the experimental
values of 2.8, 3.0 and 6.7 MeV respectively.
3. Show that for an ellipsoid with semi-major axis a along the axis of rotation
and semi-minor axis b perpendicular to it, the quadrupole moment is
$\frac{2}{5} Z (a^{2} - b^{2})$. Estimate (a – b)/R for 176Lu from the information that its
quadrupole moment is about 8 × $10^{-28}$ m².
4. Show that the density of nuclear matter is about 2.3 × $10^{17}$ kg/m³. This is
the type of density expected in a neutron star which may be regarded as
a giant nucleus of mass comparable to that of the sun.
5. Assuming that the deuteron is a bound state in the Yukawa potential
given in Eq. (9.25), and the energy of about 30 MeV is approximately
the value of the potential at r ≈ 1 fm, show that | gε0/e2 | ≈ 60. This gives
an idea of the strength of nuclear forces compared to the electromagnetic
forces.
6. The nucleus 121Sb has spin 5/2. What is its expected magnetic moment?
Compare the result with the observed value of $3.36\,(e\hbar/2m_p)$.
7. Deduce the spin and magnetic moment of 3He, 15N, 39K and 209Bi from
the simple shell model and compare with the experimental values of
j = 1/2, 1/2, 3/2, 9/2 and µ = – 2.13, – 0.28, 0.39, 4.1 in units of nuclear
magnetons, respectively.
8. Determine the moment of inertia of 234Th given that its lowest rotational
energy levels are at 0, 0.048, 0.16 MeV. Compare it with the moment of
inertia of the whole nucleus regarded as a rigid sphere. What can you
deduce? What is the next expected rotational level?
9. The rotational ground state of 237Np has I = 5/2. The observed excited
levels have energies 0.033, 0.060, 0.076, 0.103 and 0.159 MeV. Which
of these may be expected to belong to the rotational band?
10. Obtain the masses of 106Ru, 106Rh, 106Pd, 106Ag, and 106Cd, from the
semiempirical formula and discuss the stability of these nuclei against
β± decays and electron capture.
11. Obtain the masses of 65Ni, 65Cu and 65Zn from the semi-empirical formula
and discuss the stability against β±-decay and electron capture.
12. Estimate the Coulomb barrier for α emission by 238U. What is the energy
of the α particle and of 234Th in the process 238U → 234Th + 4He?
(mU – mTh – mHe ≈ 4.3 MeV).
13. The Q value for the α-decay of 213Po into 209Pb is 8.52 MeV. What is the
energy of the α particle in the transition between these states? If some
α particles come out with 7.60 MeV, what is the energy of the
corresponding excited state of Pb?
14. The element 32P decays into 32S ($m_S$ ≈ 31.972072 $m_u$, including the mass
of electrons) by β⁻-emission. If the maximum kinetic energy of the
electron is 1.7 MeV, what is the mass of 32P?
15. The element 7Be decays by electron capture. If the masses of 7Be and
7Li are 7.016930 and 7.016004 $m_u$ respectively, what is the energy and
momentum of the recoil nucleus?
16. In the uranium series, the initial nucleus is 238U and the final nucleus is
206Pb. How many α particles are emitted by each uranium nucleus? How
many electrons are emitted by each nucleus? If the lifetime is 6.5 × $10^{9}$
years, how much helium is released from 1 g of 238U in 1 year? A mineral
sample contains 206Pb and 238U in the ratio of 1 to 4. Assuming that the
Pb is from the decay of U, estimate the age of the sample.
17. For the reaction 6Li + n → 3H + 4He with thermal neutrons, determine
the kinetic energy of 4He. It is given that the Q value for the reaction is
4.78 MeV.
18. A beam of neutrons is incident on a piece of gold. Show that the intensity
of the beam as a function of the depth t of penetration is given by
I(t) = I(0) exp (– σnt) where σ is the capture cross-section and n is the
number of target nuclei per volume. If the emerging intensity is 74% of
the original intensity for t = 0.05 cm, what is the capture cross-section of
gold?
19. If an average energy of 200 MeV is released in the fission of each 235U
nucleus, how much 235U is used in one day in a reactor operating at a
power of 50 MW?
20. There is evidence to believe the sun has been in the present stable
condition for the last 5 × $10^{9}$ years. Assuming that the stable condition
implies that the change in the composition is less than 10%, argue that
nuclear fusion is the only feasible source of energy and that the sun is
likely to remain in the present condition for another 5 × $10^{9}$ years. The
sun radiates an energy of about 4 × $10^{26}$ J/s, and its mass is about
2 × $10^{30}$ kg.
10
Elementary Particles

Structures of the Chapter


10.1 Elementary particles
10.2 Strong interaction
10.3 Electromagnetic interaction
10.4 Weak interaction
10.5 Unified approach
10.6 Production and detection of particles
10.7 Examples
Problems

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_10
While considering the microscopic structure of matter, it is pertinent to ask
whether the different forms of matter have a common basis. Is it possible to
understand the various properties in terms of a few elementary particles with
prescribed rules for their interactions? If these constituent particles and their
interactions are analysed, then in principle, one can construct all the forms of
matter and explain their properties.
In this chapter, the present status of our understanding of the elementary
particles, their various interactions and the unification of these interactions that
is emerging are briefly discussed.

10.1 ELEMENTARY PARTICLES


What are the elementary particles? This is not always an easy question to answer.
The answer will in general depend upon the existing knowledge and the
calculational tools provided by the underlying theory, which allow us to explain
the properties of the composite objects. Therefore, it might happen that entities
that are regarded as elementary particles at some time can later be described as
composite particles as our knowledge and calculational techniques improve.
For example, atoms which were regarded as the building blocks in the nineteenth
century are now regarded as composites of electrons and a nucleus, and the
nucleus itself is regarded as a composite of protons and neutrons (the nucleons).
The generally accepted ideas of elementary particles are presented here and
their implications discussed. The interesting feature of these ideas is that many
particles that are thought to be elementary, have not yet been observed, and
indeed may not be observable in principle.
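To start with, there are leptons which appear in pairs:
\[
\begin{bmatrix} \nu_{e} \\ e \end{bmatrix}, \quad
\begin{bmatrix} \nu_{\mu} \\ \mu \end{bmatrix}, \quad
\begin{bmatrix} \nu_{\tau} \\ \tau \end{bmatrix}
\tag{10.1}
\]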
In each pair, the first particle is a neutrino, which carries zero electric charge,
and which is associated with the corresponding negatively-charged lepton. So
far, there is no evidence that neutrinos have nonzero mass (experimentally
$m_{\nu_e}$ < 2 eV, $m_{\nu_\mu}$ < 0.170 MeV, $m_{\nu_\tau}$ < 15.5 MeV and $m_{\tau}$ ≈ 1777 MeV where mass
is expressed in terms of rest energy). The leptons are fermions with spin 1/2,
and the charged leptons have masses
\[
m_{e} = 0.511\ \text{MeV}, \quad m_{\mu} = 106\ \text{MeV}, \quad m_{\tau} \approx 1777\ \text{MeV}
\tag{10.2}
\]
and carry a negative charge – e. These doublets in (10.1) are called the leptons
of the first, second and third generation, in the order of increasing mass. Every
particle in nature is accompanied by an antiparticle which has the same mass
but with all its other properties (such as the charge) being opposite to those of
the particle. Therefore, along with the leptons we have antileptons with the
same mass as the leptons but with other properties such as the charge, being
opposite to those of the leptons (antiparticles for fermions are the holes in the
negative-energy sea of Dirac discussed in Sec. 4.8).
Similar to the lepton doublets, there are doublets of quarks which have not
so far been observed, but which have proved extremely useful in describing the
properties of their composites, such as the proton, the neutron, etc. They are
\[
\begin{pmatrix} u \\ d \end{pmatrix}, \quad
\begin{pmatrix} c \\ s \end{pmatrix}, \quad
\begin{pmatrix} t \\ b \end{pmatrix}
\tag{10.3}
\]
They have half-integral spin and have the properties shown in Table 10.1.
Since they have not been observed directly, their masses are inferred from the
masses of their composites, and are known as first, second and third generation
quark doublets, in the order of increasing masses. For reasons mentioned later,
each quark is supposed to come in three varieties, known as red, white and blue
quarks, equal mixtures of which are said to be colourless. As in the case of
leptons, one has antiquarks which have the same mass as the quarks, but all the
other properties such as charge, baryon number, etc. are opposite to those of the
quarks.
Table 10.1 Properties of quarks (charge in units of e)

Quark         Charge   Baryon   Isospin I      Strangeness   Inferred mass
                       number   (Iz)
Up (u)         2/3     1/3      1/2 (1/2)       0            1.7–3.3 MeV
Down (d)      –1/3     1/3      1/2 (–1/2)      0            4.1–5.8 MeV
Charm (c)      2/3     1/3      0 (0)           0            1.27 (+0.07, –0.09) GeV
Strange (s)   –1/3     1/3      0 (0)          –1            101 (+29, –21) MeV
Top (t)        2/3     1/3      0 (0)           0            172.0 ± 0.9 ± 1.3 GeV
Bottom (b)    –1/3     1/3      0 (0)           0            4.61 (+0.18, –0.06) GeV

Leptons and quarks interact with each other via the exchange of some bosons.
These bosons are gluons (not yet observed directly) which give rise to strong
forces between quarks binding them into observed strongly interacting particles
such as the proton, the pi-meson, etc., the photons which give rise to
electromagnetic forces, the W-bosons which give rise to weak-interaction forces,
and the gravitons which give rise to gravitational interaction forces. These
bosons are necessary for the description of the interaction between different
particles.
The properties of the interactions and the classification of particles are
now considered.

10.2 STRONG INTERACTION


Quarks interact with each other strongly by exchanging gluons which are
supposed to be eight in number. The strength of this interaction called the strong
interaction is given by
αs ≈ 0.4 – 0.2 (10.4)
which may be compared with the corresponding strength of electromagnetic
interaction given by the fine structure constant α ≈ 1/137. As a result of this
interaction (one also postulates an additional confining potential), bound states
are obtained, which are the observed strongly interacting particles. They are
known as the hadrons, and two of the hadron sets are listed in Table 10.2,
namely the baryons with spin 1/2 (positive parity), and pseudoscalar mesons
with spin 0 (negative parity). Baryons are the bound states of three quarks
(baryon number and its conservation are introduced to explain the stability of
matter which is primarily made up of baryons) while mesons are the bound
states of a quark and an antiquark. The quark composition of these bound states
is shown in Table 10.2, where it should be understood that these states are
formed by taking appropriate combinations of the spin-states of the quarks.
Hadrons interact with each other through strong interaction of essentially unit
strength and with a short range of about $10^{-15}$ m.
Table 10.2 List of baryons (1/2)+ and mesons (0)–, along with their masses (in MeV),
charge, isospin, strangeness and their quark composition

          Hadron        Mass     Charge   I (Iz)         Strangeness   Constituent
                        (MeV)                                          quarks
Baryons   P             938.2     1       1/2 (1/2)        0           uud
          N             939.5     0       1/2 (–1/2)       0           udd
          Λ             1116      0       0 (0)           –1           uds
          Σ+            1189      1       1 (1)           –1           uus
          Σ0            1192      0       1 (0)           –1           uds
          Σ–            1197     –1       1 (–1)          –1           dds
          Ξ0            1315      0       1/2 (1/2)       –2           uss
          Ξ–            1321     –1       1/2 (–1/2)      –2           dss

Mesons    π+            139.6     1       1 (1)            0           $u\bar{d}$
          π0            135.1     0       1 (0)            0           $u\bar{u}$, $d\bar{d}$
          π–            139.6    –1       1 (–1)           0           $d\bar{u}$
          K+            494       1       1/2 (1/2)        1           $u\bar{s}$
          K0            498       0       1/2 (–1/2)       1           $d\bar{s}$
          $\bar{K}^{0}$   498       0       1/2 (1/2)       –1           $s\bar{d}$
          K–            494      –1       1/2 (–1/2)      –1           $s\bar{u}$
          η             549       0       0                0           $u\bar{u}$, $d\bar{d}$, $s\bar{s}$
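The charge, strangeness and baryon number of each hadron in Table 10.2 should follow by adding the corresponding quantities of its constituent quarks from Table 10.1, with opposite signs for antiquarks. The sketch below checks this for a few entries; the notation "q~" for an antiquark and the function name are conventions of this sketch only.

```python
from fractions import Fraction

# (charge in units of e, strangeness) for the light quarks, from Table 10.1.
QUARKS = {
    "u": (Fraction(2, 3), 0),
    "d": (Fraction(-1, 3), 0),
    "s": (Fraction(-1, 3), -1),
}
QUARK_BARYON_NUMBER = Fraction(1, 3)


def hadron_numbers(constituents):
    """Sum charge, strangeness and baryon number; 'q~' denotes an antiquark."""
    charge = Fraction(0)
    strangeness = 0
    baryon = Fraction(0)
    for name in constituents:
        sign = -1 if name.endswith("~") else 1
        quark_charge, quark_strangeness = QUARKS[name.rstrip("~")]
        charge += sign * quark_charge
        strangeness += sign * quark_strangeness
        baryon += sign * QUARK_BARYON_NUMBER
    return charge, strangeness, baryon


HADRONS = {
    "P": ["u", "u", "d"],
    "Lambda": ["u", "d", "s"],
    "Xi-": ["d", "s", "s"],
    "pi+": ["u", "d~"],
    "K+": ["u", "s~"],
    "K-": ["s", "u~"],
}
results = {name: hadron_numbers(quarks) for name, quarks in HADRONS.items()}
for name, (charge, strangeness, baryon) in results.items():
    print(f"{name:7s} charge {charge}, strangeness {strangeness}, baryon number {baryon}")
```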

Isospin Symmetry
The most striking feature to be noted in the properties of the hadrons is that
they come in multiplets with components of very nearly the same mass, e.g.
938.2 MeV for the proton and 939.5 MeV for the neutron. Such a property has
been noted for bound states in central potentials, where the states with different m
values but the same l value, have the same energy. It is therefore suggested that
we postulate an abstract space in which there is an abstract spin called isospin
I, and the different components of a multiplet are states with the same I but
different Iz. For example, P and N have I = 1/2 and Iz = 1/2, – 1/2 respectively.
The equality of the masses of the different components, would then follow
from the invariance of the interaction under rotations in the abstract isospin
space.
It may be observed in Table 10.2, that the different components of an isospin
multiplet differ only in their u, d components. Therefore, an equality of the
masses of the u and d quarks would imply an equality of the masses of the
components of each multiplet. Thus, if the interactions do not distinguish between
u and d quarks, it is suggested that the interactions are invariant under the
transformations
| u′〉 = x1 | u 〉 + x2 | d 〉
| d′ 〉 = y1 | u 〉 + y2 | d 〉 (10.5)
with the condition that | u′ 〉 and | d′ 〉 are orthonormal (the notation | u 〉, etc. is
used to designate the states), which implies
| x1 |2 + | x2 |2 = | y1 |2 + | y2 |2 = 1
x1y1* + x2y2* = 0 (10.6)
One also imposes a phase condition for the determinant of the matrix formed
by the coefficients xi and yi
x1y2 – x2y1 = 1 (10.7)
The linear transformations in Eq. (10.5), with the conditions in Eqs. (10.6)
and (10.7) define the group SU(2) (group of special unitary transformations in
2-dimensions) which is closely related to the usual 3-dimensional rotations.
Invariance under these transformations gives rise to SU(2) or isospin symmetry.
It allows the characterization of states by isospin I and its z-component Iz (similar
to l and m in the case of ordinary rotations). Thus, (u,d) have I = 1/2 and
Iz = ± 1/2, (s) has I = 0 and Iz = 0, (P, N) have I = 1/2 and Iz = ± 1/2, (Σ+, Σ0, Σ–)
have I = 1 and Iz = 1, 0, –1, etc. Furthermore, these quantum numbers are
conserved in processes which involve only strong interaction.
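The conditions in Eqs. (10.6) and (10.7) can be checked numerically for an explicit set of coefficients. The sketch below uses one convenient parametrization, in which y1 = –x2* and y2 = x1*; this parametrization, the angles chosen and the function names are assumptions of the sketch and are not given in the text.

```python
import cmath
import math


def su2_coefficients(theta, phi, chi):
    """Coefficients (x1, x2, y1, y2) built from three real angles."""
    x1 = cmath.exp(1j * phi) * math.cos(theta)
    x2 = cmath.exp(1j * chi) * math.sin(theta)
    y1 = -x2.conjugate()
    y2 = x1.conjugate()
    return x1, x2, y1, y2


def check_conditions(x1, x2, y1, y2):
    """Return the two norms, the overlap and the determinant."""
    norm_u = abs(x1) ** 2 + abs(x2) ** 2
    norm_d = abs(y1) ** 2 + abs(y2) ** 2
    overlap = x1 * y1.conjugate() + x2 * y2.conjugate()
    determinant = x1 * y2 - x2 * y1
    return norm_u, norm_d, overlap, determinant


norm_u, norm_d, overlap, determinant = check_conditions(
    *su2_coefficients(theta=0.7, phi=0.3, chi=1.1)
)
print(norm_u, norm_d, abs(overlap), determinant)
```

For any choice of the three angles the norms and the determinant equal one and the overlap vanishes, up to rounding.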
For specific applications, the (P, N) system is considered which can have
I = 1 or 0. Designating the isospin states by | I, Iz〉,
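\[
|1, 1\rangle = |PP\rangle
\tag{10.8}
\]
\[
|1, 0\rangle = \frac{1}{2^{1/2}}\left(|PN\rangle + |NP\rangle\right)
\tag{10.9}
\]
\[
|1, -1\rangle = |NN\rangle
\tag{10.10}
\]
\[
|0, 0\rangle = \frac{1}{2^{1/2}}\left(|PN\rangle - |NP\rangle\right)
\tag{10.11}
\]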
These relations follow from the usual quantum-mechanical rules for
combining two angular momenta (also see Problem 1). Isospin symmetry then
implies that the matrix elements of T, which are essentially the probability
amplitudes for the processes, satisfy the relations
\[
\langle PP | T | PP \rangle = \langle NN | T | NN \rangle
= \frac{1}{2} \langle PN + NP | T | PN + NP \rangle
\tag{10.12}
\]
Another useful application is obtained by noting that the deuteron D appears
in only one charge state and hence is assigned I = 0. Since the π-meson multiplet
has I = 1, the Dπ state is an I = 1 state. Conservation of isospin then gives the
result
\[
\langle D\pi^{0} | T | PN \rangle = \frac{1}{2^{1/2}} \langle D\pi^{+} | T | PP \rangle
\tag{10.13}
\]
Experimentally, this relation was verified at an energy of 340 MeV, to within
a few per cent by Hildebrand (1953), which supports the general ideas of
isospin invariance in strong interaction.
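The factor in Eq. (10.13) is simply the I = 1 content of the | PN 〉 state. The sketch below represents the two-nucleon states as vectors in the basis PP, PN, NP, NN and computes the relevant overlaps; the representation and the names are choices of this sketch. It assumes, as in the text, that only the I = 1 part contributes and that the I = 1 amplitude is the same for all members of the multiplet.

```python
import math

INV_SQRT2 = 1.0 / math.sqrt(2.0)

# Isospin states |I, Iz> in the basis (PP, PN, NP, NN), Eqs. (10.8)-(10.11).
ISOSPIN_STATES = {
    (1, 1): (1.0, 0.0, 0.0, 0.0),
    (1, 0): (0.0, INV_SQRT2, INV_SQRT2, 0.0),
    (1, -1): (0.0, 0.0, 0.0, 1.0),
    (0, 0): (0.0, INV_SQRT2, -INV_SQRT2, 0.0),
}
CHARGE_STATES = {
    "PP": (1.0, 0.0, 0.0, 0.0),
    "PN": (0.0, 1.0, 0.0, 0.0),
    "NP": (0.0, 0.0, 1.0, 0.0),
    "NN": (0.0, 0.0, 0.0, 1.0),
}


def isospin_component(total_i, i_z, charge_state):
    """Overlap <I, Iz | charge state>."""
    return sum(a * b for a, b in
               zip(ISOSPIN_STATES[(total_i, i_z)], CHARGE_STATES[charge_state]))


amp_pn = isospin_component(1, 0, "PN")   # I = 1 content of |PN>
amp_pp = isospin_component(1, 1, "PP")   # I = 1 content of |PP>
amplitude_ratio = amp_pn / amp_pp
cross_section_ratio = amplitude_ratio ** 2
print(f"amplitude ratio = {amplitude_ratio:.4f}, "
      f"cross-section ratio = {cross_section_ratio:.2f}")
```

Since cross-sections go as the square of the amplitude, the relation predicts that the cross-section for PN → Dπ0 is half that for PP → Dπ+.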

SU(3) and Higher Symmetries


If the masses of the baryons in Table 10.2 are examined, it is observed that even
baryons with different I have approximately equal masses (the differences are
small compared to the baryonic masses). If these differences in masses are
ignored, an extra degeneracy exists between the states with different I, which is
equivalent to having the same mass for u, d and s quarks. This is reminiscent of
the accidental degeneracy of the hydrogenic levels where the states with different
l have the same energy. The interaction between the quarks is approximately
invariant under transformations which mix u, d and s. As in Eq. (10.5), these
transformations retain orthonormality and satisfy the phase condition for the
determinant. This invariance gives rise to what is called the SU(3) symmetry.
The multiplets of this symmetry which transform among themselves, have
dimensions 1, 8, 8, 10 for the three-quark system, 1 and 8 for the quark-antiquark
system etc. The observed multiplets are the baryon octet [shown in Fig. (10.1)],
the pseudoscalar meson octet, the baryon decuplet with spin 3/2 and positive
parity, etc. It should be mentioned that the symmetries not only imply the equality
of the masses of the components of each multiplet, but also relate many of their
other properties such as the magnetic dipole moment, the interaction strengths,
etc. With the addition of charmed, top and bottom quarks, the symmetry can be
extended to SU(6) symmetry, which implies symmetry under transformations
which involve all the six quarks.

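[Figure: the baryon octet plotted with Iz on the horizontal axis (–2 to 2) and Y = S + B on the vertical axis (–2 to 1). N and P lie at Y = 1; Σ–, Σ0, Λ and Σ+ at Y = 0; Ξ– and Ξ0 at Y = –1.]

Fig. 10.1 The baryon octet shown in terms of Iz and Y = S + B
(Y is called the hypercharge).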
It may be noted that processes which involve only strong interaction conserve
charge, Iz, I, strangeness, charm, etc. Indeed, it appears that even the constituent
quarks retain these properties. An interesting example of this was the discovery
of a group of spin 1 particles called Ψ, Ψ′, etc. which have masses greater than
3 × $10^{3}$ MeV but very long lifetimes (about $10^{4}$ times the lifetime of other
vector mesons such as the ρ, etc.). They have a net charm of zero, but are made
up of $c\bar{c}$ (c has charm 1 while $\bar{c}$ has charm –1). Since each charmed quark
retains its charm, these Ψ-mesons cannot decay into particles which do not contain
c or $\bar{c}$ as constituents, which explains their long lifetimes (other particles which
contain charm are too heavy to provide decay channels).

10.3 ELECTROMAGNETIC INTERACTION


All charged particles (leptons, quarks and hadrons) interact electromagnetically.
Even neutral particles have this interaction because of their charge distribution,
for example, the neutron has zero charge but a fairly large magnetic dipole
moment. The electromagnetic interaction between particles is propagated by
the exchange of photons. An emission by a particle, of a photon of energy
E introduces an uncertainty in the energy of the particle. The uncertainty principle
implies that the system can remain in this state for a period of time
\[
\Delta t \sim \hbar / E
\tag{10.14}
\]
During this time the photon can travel a distance
\[
x \sim \hbar c / E
\tag{10.15}
\]
which essentially defines the range of the interaction. Since photons have zero
mass, they can have an indefinitely small energy which means that the
electromagnetic forces are long-range forces. The strength of these forces is
characterized by the fine structure constant α
α ≈ 1/137 (10.16)
which is much smaller than the strength of strong interactions characterised by
αs in Eq. (10.4).
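The relation between the energy of the exchanged quantum and the range of the interaction can be illustrated numerically. The sketch below assumes the standard value of ħc; the two energies are illustrative choices, the first being the rest energy of the pion from Table 10.2 and the second a low-energy photon.

```python
HBAR_C_MEV_FM = 197.327   # hbar * c in MeV fm (assumed standard value)


def interaction_range_fm(energy_mev):
    """Range x ~ hbar c / E for an exchanged quantum of energy E."""
    return HBAR_C_MEV_FM / energy_mev


pion_range = interaction_range_fm(140.0)           # quantum of about 140 MeV
soft_photon_range = interaction_range_fm(1.0e-6)   # photon of 1 eV
print(f"E = 140 MeV: range of about {pion_range:.2f} fm")
print(f"E = 1 eV:    range of about {soft_photon_range:.2e} fm")
```

An exchanged particle with a rest energy of order 100 MeV gives a range of order $10^{-15}$ m, while a massless photon, whose energy can be made arbitrarily small, gives a range without limit.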
The interaction between hadrons is dominated by the strong interaction,
with electromagnetic interaction providing small corrections. For example, the
proton and the neutron (as also π+ and π0, and other members of a given isospin
multiplet) have slightly different masses which is generally attributed to the
difference in their electromagnetic interactions. Small differences due to
electromagnetic interaction are observed in the so-called mirror nuclei, which
have the same strong interactions but different charges. However, strong
interactions are of short range r0 ~ $10^{-15}$ m, so that the interaction between
hadrons at large distances is dominated by the electromagnetic forces, e.g.
Rutherford scattering.
The electromagnetic interaction comes into its own domain in the description
of the properties of charged leptons. The dominant interaction of the charged
leptons (which do not have strong interaction) is the electromagnetic interaction.
Fortunately, electromagnetic interactions of charged particles are well-defined
through a generalization of the minimal electromagnetic interaction introduced
by Eqs. (4.89) and (4.90). Furthermore, since the strength of the electromagnetic
interactions, given by α in Eq. (10.16), is quite small, results can be obtained in
powers of α using perturbation theory.
The predictions of the theory of electromagnetic interaction of leptons
(quantum electrodynamics), are quite impressive. For example, including the
effects of vacuum polarization, self-interaction and vertex correction, it is found
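that the magnetic dipole moment of the electron comes out to be
\[
\mu = \frac{e\hbar}{2m_{e}} \left(1 + \frac{\alpha}{2\pi} - 0.328\, \frac{\alpha^{2}}{\pi^{2}}\right)
\approx 1.0011596\, \frac{e\hbar}{2m_{e}}
\tag{10.17}
\]
which is in excellent agreement with the experimental observation of
\[
\mu = (1.001156 \pm 0.000012)\, \frac{e\hbar}{2m_{e}}
\tag{10.18}
\]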
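The numerical value in Eq. (10.17) can be verified by inserting the fine structure constant into the series. The sketch below assumes α = 1/137.036, a standard value that is slightly more precise than the one quoted in the text; the function name is illustrative.

```python
import math

ALPHA = 1.0 / 137.036   # fine structure constant (assumed standard value)


def electron_moment_factor(alpha=ALPHA):
    """1 + alpha/(2 pi) - 0.328 (alpha/pi)^2, in units of e*hbar/(2 m_e)."""
    x = alpha / math.pi
    return 1.0 + 0.5 * x - 0.328 * x ** 2


moment_factor = electron_moment_factor()
first_order_only = 1.0 + ALPHA / (2.0 * math.pi)
print(f"through order alpha^2: {moment_factor:.7f}")
print(f"through order alpha:   {first_order_only:.7f}")
```

The second-order term changes the result only in the sixth decimal place, which illustrates how rapidly the expansion in powers of α converges.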
Detailed calculations have also been made for the Lamb shift, that is, the
separation between the $2\,^{2}S_{1/2}$ and the $2\,^{2}P_{1/2}$ levels of the hydrogen atom. The
theoretical calculations yield
\[
\nu = \frac{\Delta E}{h} = (1.05720 \pm 0.0002) \times 10^{9}\ \text{s}^{-1}
\tag{10.19}
\]
which may be compared with the experimental observation of
\[
\nu = (1.05777 \pm 0.00010) \times 10^{9}\ \text{s}^{-1}
\tag{10.20}
\]
Other processes which are accurately described by quantum electrodynamics
are:
(i) electron-electron scattering, called Møller scattering,
(ii) electron-positron scattering, called Bhabha scattering,
(iii) electron-positron annihilation into a muon and an antimuon, etc.
It is appropriate to say that attempts to describe the electromagnetic interactions of hadrons
have been, at best, only partially successful.

10.4 WEAK INTERACTION


There are some processes observed in nature that cannot be described either
by strong interaction or by electromagnetic interaction of particles. A striking
example of such processes is the transmutation of a radioactive nucleus by the
emission of an electron. In this process, a nucleus of charge Z undergoes a
transition
$$n(Z) \to n'(Z + 1) + e^- + \bar{\nu}_e \tag{10.21}$$
with the emission of an electron. The lifetime for many of these decays is of the
order of minutes, compared with the lifetime of order $10^{-22}$ s for decays involving
only hadrons and $10^{-8}$ s for decays of excited atoms emitting electromagnetic
radiation. The interaction which causes these decays is called the weak
interaction. It was noticed in these decays that the electron does not carry away
all the energy ($\Delta E \approx m_n c^2 - m_{n'} c^2$) but has a continuous energy distribution with
a cut-off in the energy equal to $(m_n - m_{n'})c^2$. It was also found experimentally
that no photons were emitted in the process.
In order to save the law of conservation of energy, Pauli made a bold
suggestion (1930) that an electrically neutral particle with spin 1/2, accompanies
the emission of the electron. This particle is the neutrino. It has a very small
mass, $m_\nu < 60$ eV, possibly zero (theories prefer a zero mass for the neutrino),
and as in the case of other particles, there is an antineutrino as well. Indeed,
the emission of an electron in Eq. (10.21) is accompanied by the emission of an
antineutrino (a neutrino would accompany the emission of a positron). The
basic β-decay process of radioactive decay is
$$N \to P + e^- + \bar{\nu}_e \tag{10.22}$$
The neutrinos do not have direct strong or electromagnetic interaction. They
interact very weakly with matter (a neutrino of 1 MeV energy has a path length
of $10^{18}$ m in lead) and hence are very difficult to detect. However, nuclear reactors
provide intense beams of antineutrinos (about $10^{17}\ \mathrm{m^{-2}\,s^{-1}}$), and these were detected by
Reines and Cowan (1956) in the reaction
$$\bar{\nu}_e + P \to N + e^+ \tag{10.23}$$
which is essentially the inverse β-decay process ($e^+$ is the positron). There are
other examples of reactions due to weak interaction in which neutrinos
accompany other leptons, e.g.
$$\pi^+ \to \mu^+ + \nu_\mu \tag{10.24}$$
The beam of neutrinos (of energy about 500 MeV) from the decay of $\pi^+$
produced in accelerators, was allowed to interact with neutrons (Lederman and
Schwartz, 1962) and produced reactions
$$\nu_\mu + N \to P + \mu^- \tag{10.25}$$
but not $\nu_\mu + N \to P + e^-$. Thus, the neutrinos produced in reactions of the type
given in Eq. (10.24), accompanying muons, are different from those produced
in the β-decay. There are, therefore, two types of neutrinos, $\nu_e$ and $\nu_\mu$, which are
associated with the electrons and the muons respectively. With the recent
discovery of τ leptons, there should also be a $\nu_\tau$ associated with the τ leptons.
Along with the antineutrinos, there would be altogether
six different neutrinos and antineutrinos.

Strangeness
The weak interaction plays an important role in the behaviour of what are known
as strange particles.
The K-mesons and Λ, Σ, Ξ baryons were discovered in the cosmic rays
(which are rays of generally high energy particles originating from the outer
space) in the early fifties. After the construction of high energy accelerators,
they could be produced and studied in a controlled manner. They are produced
in reactions of the type
$$\pi^- + P \to K^0 + \Lambda \tag{10.26}$$
The rate of their production is typical of strongly interacting particles (e.g.
comparable to the production of $\pi^0 N$). However, the decay of Λ,
$$\Lambda \to \pi^- + P \quad \text{or} \quad \pi^0 + N \tag{10.27}$$
is very slow. The lifetime of strange particles is generally of the order of $10^{-8}$ to
$10^{-10}$ s (except for $\Sigma^0$ which decays into Λ + γ in less than $10^{-14}$ s) whereas the
typical lifetimes of decays of strongly interacting particles are of the order of
$10^{-22}$ s. The unusual behaviour of these particles, as strongly interacting particles
in production and as weakly interacting particles in decays, brought them the
name of strange particles.
It was observed that the strange particles are produced in pairs [$K^0$ and Λ in
Eq. (10.26)], called associated production, whereas the decay processes involve
individual strange particles. This is reminiscent of a neutral system (e.g. radiation)
producing a pair of oppositely charged particles (e.g. $e^-$ and $e^+$) but a charged
particle being forbidden to decay into a neutral system by charge conservation.
Using this analogy, Gell-Mann and Nishijima introduced a new quantum number
S called the strangeness which is conserved in strong interaction. Thus $K^0$ is
assigned strangeness 1 while Λ is assigned strangeness –1, and $\pi^-$ and P are
assigned strangeness zero. Thus, the total strangeness is conserved in the reaction
given in Eq. (10.26) (being zero both before and after the reaction). However, it
is not conserved in the decay process given in Eq. (10.27) and hence the decay
would be forbidden by strong interaction. Strangeness is conserved in
electromagnetic interactions as well, so that the decay in Eq. (10.27) proceeds
via the weak interaction which does not conserve strangeness. This would explain
the long lifetime of Λ. Indeed the strength of the interaction for the decay in
Eq. (10.27) is of the same order as the strength of the interaction which leads to
the β-decay of the neutron in Eq. (10.22), once the dependence of the decay on
the masses is separated out. Thus, it is the weak interaction which governs the
strangeness-changing processes, e.g. decay of Λ.
The strangeness of a particle is given by the relation
$$Q = \frac{1}{2}(S + B) + I_z \tag{10.28}$$
where Q is the charge, B is the baryon number, and $I_z$ is the z-component of the
isotopic spin. The combination S + B is called the hypercharge Y and is often
more convenient to use than strangeness. Since charge and baryon number are
conserved, this relation implies that the conservation of strangeness is
equivalent to the conservation of $I_z$. It
needs a slight modification once particles with nonzero charm are included:
$$Q = \frac{1}{2}(S + B + C) + I_z \tag{10.29}$$
where C is the charm of the particle. The relation will require further modification
if top and bottom quarks are included.
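The bookkeeping implied by Eq. (10.28) can be sketched in a few lines. The example below solves the relation for S and checks the reactions in Eqs. (10.26) and (10.27). The table of charge, baryon number and isospin projection uses the standard assignments, which are supplied here as assumptions and are not listed in this section; the function names are likewise illustrative.

```python
from fractions import Fraction

# (charge Q, baryon number B, isospin projection Iz); standard assignments,
# entered here as assumptions for the illustration.
PARTICLES = {
    "pi-": (-1, 0, Fraction(-1)),
    "pi0": (0, 0, Fraction(0)),
    "P": (1, 1, Fraction(1, 2)),
    "N": (0, 1, Fraction(-1, 2)),
    "K0": (0, 0, Fraction(-1, 2)),
    "Lambda": (0, 1, Fraction(0)),
}


def strangeness(name):
    """Solve Q = (S + B)/2 + Iz, Eq. (10.28), for S."""
    q, b, iz = PARTICLES[name]
    return 2 * (q - iz) - b


def total_strangeness(names):
    """Sum of the strangeness of all particles on one side of a reaction."""
    return sum(strangeness(n) for n in names)


# Associated production, Eq. (10.26): strangeness is the same on both sides.
production_ok = total_strangeness(["pi-", "P"]) == total_strangeness(["K0", "Lambda"])

# Decay of the Lambda, Eq. (10.27): strangeness changes by one unit.
decay_change = total_strangeness(["pi-", "P"]) - total_strangeness(["Lambda"])

print("S(K0) =", strangeness("K0"), " S(Lambda) =", strangeness("Lambda"))
print("production conserves S:", production_ok)
print("change of S in Lambda decay:", decay_change)
```

Exact fractions are used for $I_z$ so that the half-integer values do not introduce rounding errors into the comparison.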

Parity Violation
Weak interaction violates another important conservation law, namely the
conservation of parity.
It had been observed that the laws of nature generally do not appear to
distinguish between a coordinate frame and an inverted coordinate frame, i.e.
the equations of motion are the same whether coordinates (x, y, z) or
(–x, – y, –z) are used. This is termed as invariance under space inversion or
parity transformation. Let us define an operator P called the parity operator,
which takes the wave function ψ(x, y, z) in a coordinate frame, into a wave
function ψ′(x, y, z) observed for the same state but in a coordinate frame with
inverted axes:
$$\psi'(x, y, z) = P\psi(x, y, z) \tag{10.29a}$$
However, ψ′(x, y, z) is essentially the same as ψ(–x, –y, –z) except for a
possible phase factor $p_i$, so that
$$\psi'(x, y, z) = p_i\,\psi(-x, -y, -z) \tag{10.30}$$
Now a second operation by P leads us back to the original wave function so
that
$$\psi(x, y, z) = P\psi'(x, y, z) = p_i^2\,\psi(x, y, z) \tag{10.31}$$
which implies that
$$p_i = \pm 1 \tag{10.32}$$
The states with $p_i = 1$ are said to be even intrinsic parity states and the states
with $p_i = -1$ are said to be odd intrinsic parity states. Furthermore, let us assume
that
$$\psi(-x, -y, -z) = p_e\,\psi(x, y, z) \tag{10.33}$$
where $p_e$ is called the spatial parity of the state. This relation, together with
Eqs. (10.29a) and (10.30), implies
$$P\psi(x, y, z) = p_i p_e\,\psi(x, y, z) \tag{10.34}$$
so that there exist eigenstates of parity with eigenvalues equal to $p_i p_e$ which are
the product of intrinsic parity and spatial parity eigenvalues.

Now, if nature does not distinguish between the coordinate frame and the
inverted coordinate frame, i.e. space inversion symmetry exists, then ψ′(x, y, z)
also must satisfy the Schrödinger equation

$$i\hbar\,\frac{\partial}{\partial t}\,P\psi(x, y, z) = HP\psi(x, y, z) \tag{10.35}$$
Since $P^2 = 1$,
$$PHP = H \tag{10.36}$$
or
$$HP = PH \tag{10.37}$$
Thus, P commutes with the Hamiltonian. Therefore, parity is conserved
and states which are simultaneous eigenstates of H and P can be obtained
(see Eq. (3.42) and the discussion which follows it). This is valid provided
nature exhibits space-inversion symmetry (which has been shown to be
equivalent to having HP = PH).
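The step from Eq. (10.35) to Eq. (10.37) can be spelled out as follows. This is a sketch which assumes that P does not depend on time and that Eq. (10.35) holds for every state ψ:

```latex
\begin{align*}
i\hbar\,\frac{\partial \psi}{\partial t} &= H\psi
  && \text{(Schr\"odinger equation for } \psi\text{)} \\
i\hbar\,\frac{\partial}{\partial t}\,P\psi &= PH\psi
  && \text{(apply } P \text{ from the left)} \\
i\hbar\,\frac{\partial}{\partial t}\,P\psi &= HP\psi
  && \text{(Eq. (10.35))} \\
PH\psi &= HP\psi \quad \text{for all } \psi
  && \Rightarrow\ PH = HP \\
PHP &= HP^2 = H
  && \text{(multiply by } P \text{ from the right, } P^2 = 1\text{)}
\end{align*}
```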
In the fifties, two mesons called the τ-meson and the θ-meson were
discovered in cosmic rays. They have the same mass, around 498 MeV, the
same lifetime, around $1.2 \times 10^{-8}$ s, and the same production rates in nuclear
reactions of the type $\pi^+ + N \to \Lambda + (\tau^+ \text{ or } \theta^+)$. However, they had different decay
modes: the τ-meson decayed into only three π-mesons ($\tau^+ \to 2\pi^+ + \pi^-$) while the
θ-meson decayed into only two π-mesons ($\theta^+ \to \pi^+ + \pi^0$). Now, the intrinsic
parity for a π-meson is –1 (as deduced from the strong interaction of the
π-mesons) so that $p_i = 1$ for the two π-meson states and $p_i = -1$ for the
three π-meson states. It was also shown by Dalitz
(1953) from the energy distribution of the pions, that $p_e = 1$ for both the two and the three
π-meson final states. Thus, the decay products of the τ-meson decay are in a
negative parity state $p_i p_e = -1$, while the decay products of the θ-meson decay
are in a positive parity state $p_i p_e = 1$. If parity is conserved, an unusual situation
occurs, viz. that there are two particles with almost the same mass but opposite
parity. The other possibility is that τ and θ are one and the same particle but the
weak interaction which is responsible for the decay (lifetimes indicate that the
interaction is weak), violates parity invariance, i.e. parity is not conserved in
weak interaction. Lee and Yang (1956) suggested this possibility after a critical
examination of processes involving weak interaction, and proposed an
experiment to test parity nonconservation in weak interaction.
Before describing the experiment, it is noted that space inversion is
equivalent to a reflection and a rotation, e.g. reflection in the xz plane
(changes y → –y), and a rotation about the y-axis by 180° (changes x → –x and
z → –z). Since rotational invariance is a universal symmetry, it gives the result
that in addition to ψ′(x, y, z) = Pψ(x, y, z) satisfying the Schrödinger equation,
the space-reflected wave function also satisfies the Schrödinger equation, e.g.
Rψ(x, y, z) where R denotes the operator which changes y → –y. This is known
as right-left symmetry. Consider the nuclei of $\mathrm{Co}^{60}$ whose spins are aligned
along the z-direction with the aid of a magnetic field. It was found that the
β-decay electrons are preferentially emitted in the direction opposite to that of
the nuclear spins. The mirror reflection of this (Fig. 10.2) would show that the
electrons come out parallel to the nuclear spin. Since this reflected process is
different from the physically observed process, the experiment implies the
violation of parity invariance. The observation of parity violation in weak decays
resolves the puzzle of τ-θ decays, as being due to parity-violating weak decays
of a single particle, the K-meson.

Fig. 10.2 Illustration of parity transformation in $\mathrm{Co}^{60}$ decay. If the electrons
come out antiparallel to spin as shown in (a), they come out parallel
to spin in the mirror reflection shown in (b).

V-A Theory of Weak Interaction


According to the quantum theory of electromagnetic radiation, the decay of an
excited atom accompanied by the emission of a photon is a spontaneous process,
i.e. the photon is produced at the moment of emission. In analogy, Fermi
postulated that in the β-decay, an electron-antineutrino pair is spontaneously
produced at the moment of the emission. Furthermore, it was assumed that the
electron-antineutrino pair is produced in a state of unit angular momentum and
negative parity (same angular momentum and parity as that of the photon state
in atomic decays). Since negative parity objects with unit angular momentum
are called vectors (number of components is three in both the cases), Fermi’s
theory is a vector-theory.
The vector theory of Fermi conserves parity and had to be modified to
incorporate the observed parity violation. This was done in an elegant formulation
by Sudarshan and Marshak (1957) and also by Feynman and Gell-Mann (1958).
This theory is called the V-A theory and it contains only those neutrinos which
have spin opposite to the direction of their motion (called left-handed neutrinos)
and only those antineutrinos which have spin parallel to the direction of their
motion (called right-handed antineutrinos). It is clear that parity invariance is
violated in this theory since under mirror reflection, a left-handed neutrino goes
into a right-handed neutrino (this can be seen from a diagram similar to Fig. 10.2,
with the neutrinos moving downwards) which is excluded from the theory.

The V-A theory brings in the possibility that the weak interaction is invariant
under the combined operation of spatial reflection and charge conjugation which
changes a particle into an antiparticle. As observed before, mirror reflection
takes a left-handed neutrino into a right-handed neutrino which under charge
conjugation goes into a right-handed antineutrino which is included in the theory.
However, even this invariance under CP, i.e. the combined operations of parity
and charge conjugation, is violated to a small extent. This was observed by
Christenson, Cronin, Fitch and Turlay (1964) in the decay of the long-lived
component of the neutral K-meson which, while decaying mainly into three π-meson
states with CP = –1, also decays to a small extent into two π-meson states with
CP = 1, indicating a small violation of CP invariance.

10.5 UNIFIED APPROACH


There have been efforts to unify the various interactions and deduce them as
different manifestations of the same underlying theory. Such a unification was
achieved for electric and magnetic forces by Maxwell (1864). Physicists have
now succeeded in unifying electromagnetic and weak forces and further efforts
are being made to bring in strong and gravitational forces as well. Here the
ideas that have led to the unification of electromagnetic and weak forces are
briefly described.
As was noted in the last section, the electron-antineutrino pair in the
β-decay is produced in the same angular momentum state as the photon. This
similarity between weak and electromagnetic processes can be further extended
by postulating that the neutron changes into a proton by emitting a massive
$W^-$ (which carries spin ħ) and the massive $W^-$ then decays into an
electron-antineutrino pair. The basic weak process here is a vector boson
$W^-$ interacting with a fermion pair, similar to a photon interacting with a fermion
pair. The important differences are (i) the $W^-$ is charged and is accompanied by
its antiparticle the $W^+$ whereas the photon is neutral and is its own antiparticle,
(ii) the $W^-$ is massive since weak interaction is a short-range force. From the
analysis of the weak processes, it is estimated that
$$m_W \gtrsim 7.5 \times 10^4\ \text{MeV} \tag{10.38}$$
(iii) the interaction of the $W^\pm$ violates parity invariance whereas the interaction
of the photon conserves parity.


In spite of the difference mentioned above, the interaction of the W bosons
and the photon with matter can be combined. This is done by allowing a triplet
and a singlet of vector bosons to interact with the lepton and quark doublets in
Eqs. (10.1) and (10.3) (the left-handed parts of charged leptons are a part of the
doublet and their right-handed parts form singlets). Demanding that the photon
does not interact directly with the neutrinos and that its interaction conserves
parity identifies the photon and relates the strengths of the weak and the
electromagnetic interactions. This model developed by Weinberg (1967) and
Salam (1968) has the following interaction features:
1. There are three massive bosons $W^+$, $W^-$, Z (which is neutral), with masses
$$m_W = 80.4\ \text{GeV}, \qquad m_Z = 91.2\ \text{GeV} \tag{10.39}$$
which have parity-violating interactions. It is the large mass of these bosons which
suppresses the usual weak interaction. The photon has the usual electromagnetic
interaction. In high energy processes (energies comparable with $m_W c^2$) the two
interactions would become comparable. It may be noted that these bosons with
$m_W \approx 8.04 \times 10^4$ MeV and $m_Z \approx 9.12 \times 10^4$ MeV have been observed in a recent
experiment performed at CERN.
2. The neutrinos and leptons have another interaction with matter, in addition
to the electromagnetic and β-decay type of interactions. This interaction has
been observed (1973) in processes of the type
$$\nu_\mu + N \to \nu_\mu + \text{hadrons}$$
$$\bar{\nu}_\mu + N \to \bar{\nu}_\mu + \text{hadrons} \tag{10.40}$$
where N is a nucleus. It has also been observed (1978) in the scattering of
polarized electrons by a deuteron target. It was found that there was a difference
in the scattering of left-handed and right-handed electrons, indicating violation
of the right-left symmetry or the violation of parity invariance in the interaction
of electrons. The amount of parity violation is in agreement with the prediction
of the Weinberg-Salam model.
With the successful unification of weak and electromagnetic interactions,
efforts have been directed towards unifying strong, weak and electromagnetic
interactions. In most of these theories, leptons and quarks are put in the same
multiplet. Therefore, there is the possibility of quarks transforming into leptons
which means that the protons would be unstable against decay into leptons.
Search for such decays (the lifetime of protons expected in these theories is
greater than $10^{30}$ years) is being pursued vigorously by different groups of
experimentalists.
Finally, there is the gravitational interaction which is generally insignificant
for interactions between elementary particles but becomes important in
astrophysics and cosmology. It is discussed in Chapter 11.

10.6 PRODUCTION AND DETECTION OF PARTICLES


The progress in our understanding of the properties of elementary particles and
their interactions, has been made possible by important advances in the
techniques of production and detection of particles. A few of them are briefly
discussed here.

Cosmic Rays
Before the development of powerful accelerators, the cosmic rays were the
only source of particles with sufficient energy to produce mesons and strange
baryons. Many particles such as the positrons, the µ-mesons, the π-mesons,
and several strange particles were first observed in the cosmic rays.
Primary cosmic rays are a flux of energetic charged particles, mainly protons
(about 89% protons, 9% helium nuclei, 1% remaining heavier elements and
about 1% electrons) that are incident on the earth. The energy of these particles
varies from about $10^3$ MeV to $10^{14}$ MeV (average energy is about $10^4$ MeV).
When these energetic particles encounter the earth’s atmosphere, they undergo
inelastic collisions, producing what are called secondary cosmic rays which
consist of mesons, protons, neutrons, strange particles etc. These secondary
cosmic rays will themselves undergo additional inelastic collisions, producing
nucleonic cascades. Ultimately they reach the ground with a composition of
about 80% µ-mesons, the remaining being protons, neutrons and some strange
particles.
The origin of the rays is thought to be supernova explosions, with additional
contributions from the sun (low energy cosmic rays), the centre of the galaxy,
etc. Some high energy particles may be from outside our galaxy.
While the cosmic rays have proved to be important for the discovery of
many particles, they have the disadvantage that neither their energy, nor their
intensity, can be monitored to our convenience.

Van de Graaff Generator


This was one of the earliest generators (Fig. 10.3). In this generator, charges are
sprayed onto a cloth belt which then transports the charges to a large metal
sphere. The charges leave the belt by way of fine metal points and move to the
outside surface of the sphere. The sphere then forms one electrode of an
accelerating tube in which charged particles (such as the protons) can be
accelerated. This accelerator can be used for accelerating protons to energies of
about 15 MeV and is very useful in low-energy nuclear physics. The limitations of
linear accelerators are in general due to the length, instability and loss of voltage.

Fig. 10.3 Schematic diagram of a van de Graaff generator (labelled parts: charge
collector, high voltage terminal, insulating column, rotating belt).

The Cyclotron
The cyclotron is based on the principle that charged particles (nonrelativistic)
in a constant magnetic field B perform circular motion whose frequency is
independent of the magnitude of the velocity. The frequency is obtained by the
force-acceleration relation
$$evB = \frac{mv^2}{r} \tag{10.41}$$
or
$$\omega = \frac{eB}{m} \tag{10.42}$$
In a cyclotron, protons move in spiral orbits (Fig. 10.4) between the poles of two
magnets and are accelerated by pulses across the hollow D-shaped electrodes
which enclose the particle chamber. The radio frequency pulse across the
electrodes has the frequency given by Eq. (10.42) and gives an extra energy of
eV for every traversal across the gap (V being the voltage across the electrodes).
The cyclotron can be used for obtaining protons of energy about 20 MeV (this
would require about 1000 pulses of V = 20 000 V). The limitations of the
cyclotron are due to the fact that the relativistic effects reduce the frequency in
Eq. (10.42) as the particle speeds up so that it is no longer independent of the
velocity of the particle.

Fig. 10.4 Schematic diagram of a cyclotron (labelled parts: leads to the alternating
voltage, dee electrodes, spiral orbits of protons).
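The numbers quoted for the cyclotron can be reproduced with a short calculation. The sketch below evaluates Eq. (10.42) for protons, counts the gap crossings needed to reach 20 MeV with 20 000 V pulses, and estimates how much the relativistic increase of the effective mass lowers the frequency at the final energy. The field of 1.5 T is a hypothetical value chosen only for illustration, and the physical constants are approximate values supplied here.

```python
import math

E_CHARGE = 1.602176634e-19  # proton charge in coulomb
M_PROTON = 1.67262192e-27   # proton mass in kg
C_LIGHT = 2.99792458e8      # speed of light in m/s


def cyclotron_frequency(b_tesla, charge=E_CHARGE, mass=M_PROTON):
    """Nonrelativistic angular frequency omega = eB/m of Eq. (10.42)."""
    return charge * b_tesla / mass


def gap_crossings(final_energy_ev, gap_voltage):
    """Number of pulses of energy eV needed to reach the final energy."""
    return final_energy_ev / gap_voltage


def relativistic_frequency(b_tesla, kinetic_energy_ev):
    """Frequency with m replaced by gamma*m for the given kinetic energy."""
    rest_energy_joule = M_PROTON * C_LIGHT**2
    gamma = 1 + kinetic_energy_ev * E_CHARGE / rest_energy_joule
    return cyclotron_frequency(b_tesla) / gamma


B_FIELD = 1.5  # tesla; hypothetical field strength for this illustration

omega = cyclotron_frequency(B_FIELD)
pulses = gap_crossings(20e6, 20_000)
ratio = relativistic_frequency(B_FIELD, 20e6) / omega

print(f"frequency = {omega / (2 * math.pi) / 1e6:.2f} MHz")
print(f"pulses needed = {pulses:.0f}")
print(f"frequency at 20 MeV relative to start = {ratio:.4f}")
```

With these assumptions the frequency drops by about two per cent at 20 MeV, which is enough for the particles to fall out of step with a fixed-frequency pulse over many revolutions.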

Synchrotron
To overcome the voltage pulses getting out of phase with the rotational frequency
in Eq. (10.42), the pulse frequency can be gradually changed so as to keep it in
step with the circulating particles. Machines based on this idea are called synchro-
cyclotrons. A further modification was to change the magnetic field as well as
the pulse frequency so as to keep the protons in circular orbits of approximately
the same radius (magnetic field must increase as the velocity of the protons
increases) and to keep the pulse frequency in step with the particles. Such an
accelerator is called the synchrotron, and is capable of providing proton beams
of an energy of a few GeV (1 GeV = $10^9$ eV).
One of the problems of synchrotrons is the focussing of the particles. If
there is no focussing, particles with velocities slightly different from the average
velocity will spread out and only a few particles with the final energy will be
obtained. In velocity focussing the particles are kept together by adjusting the
timing of the pulses. In spatial focussing, the particles are kept together (though
they perform small oscillations) by controlling spatial variation of the magnetic
field. An important advance in the accelerators was introduced by what is known
as strong focussing. This was achieved by using magnet sections with alternating
magnetic field gradients, that is, the magnetic field increases radially in one
section and decreases in the next section. This allows one to obtain focussing
in both radial and axial directions and to reduce the radial and axial oscillations.
Synchrotrons using strong focussing are called alternating-gradient synchrotrons
(AGS) (see Fig. 10.5) and have been used to obtain protons of energies of about
30 GeV.
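The statement that the magnetic field must rise with the particle energy in order to hold the orbit radius fixed can be made quantitative. For a relativistic particle the force relation of Eq. (10.41) becomes p = eBr, so that B = p/(er). The sketch below evaluates this for protons; the orbit radius of 100 m is a hypothetical value, the quoted energies are treated as total energies, and the function names are illustrative.

```python
import math

# Conversion factor: momentum in GeV/c carried per (tesla * metre) of B*r
# by a singly charged particle, from p = eBr.
GEV_PER_TESLA_METRE = 0.299792458
PROTON_REST_ENERGY = 0.938272  # GeV


def momentum_gev(total_energy_gev, rest_energy=PROTON_REST_ENERGY):
    """Momentum (in GeV/c) from E^2 = (pc)^2 + (mc^2)^2."""
    return math.sqrt(total_energy_gev**2 - rest_energy**2)


def bending_field(total_energy_gev, radius_m):
    """Magnetic field (tesla) that keeps a proton on a circle of given radius."""
    return momentum_gev(total_energy_gev) / (GEV_PER_TESLA_METRE * radius_m)


RADIUS = 100.0  # metres; hypothetical orbit radius for this illustration

for energy in (2.0, 10.0, 30.0):
    field = bending_field(energy, RADIUS)
    print(f"E = {energy:5.1f} GeV  ->  B = {field:.3f} T")
```

Because the radius is held fixed, the required field grows almost linearly with the energy once the protons are relativistic.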
High-energy beams of electrons have been obtained by linear accelerators
(where the radiation loss in the energy due to radial acceleration, synchrotron
radiation, is avoided), and beams of photons have been obtained from the
synchrotron radiation of electrons, $e^- e^+$ annihilation, etc.

Colliding Beams
For the production of heavy particles and the observation of interactions at
high energies, it is advantageous to work with colliding beams of energetic
particles. To see this, consider a collision between two particles of mass m
each. The effective energy for a process may be standardized in terms of the
total energy in the centre of mass (cm) system (Ptot = 0 in the cm system). If one
of the particles is at rest and the other is moving with an energy E (E includes
kinetic energy and rest energy) and momentum p, the total cm energy Et is
given by
$$\frac{1}{c^2}E_t^2 = \frac{1}{c^2}(E + mc^2)^2 - \mathbf{p}\cdot\mathbf{p} = 2m^2c^2 + 2Em \tag{10.43}$$

Fig. 10.5 Schematic diagram of an alternating gradient synchrotron (labelled parts:
injection of protons from a linear accelerator, proton beam, magnets with alternating
gradients, radio-frequency acceleration regions, beam to the target).


In obtaining this relation, we have equated the invariant scalar product p.p
[see Eq. (1.56)] for the total energy momentum four-vector, evaluated in the
cm system and the frame in which the target is at rest. If, on the other hand, the
two particles are moving in opposite directions with energy E each and momenta
p, –p, the total cm energy is given by
$$\frac{1}{c^2}E_t^2 = \frac{1}{c^2}(2E)^2 = \frac{4}{c^2}E^2 \tag{10.44}$$
In the CERN super proton synchrotron, the beam energy available for protons
and antiprotons is about 270 GeV. Since $mc^2$ for the proton is about 1 GeV,
Eqs. (10.43) and (10.44) give
$$E_t \approx 23\ \text{GeV for target at rest}, \qquad E_t \approx 540\ \text{GeV for colliding beams} \tag{10.45}$$
Thus, the colliding-beam facilities allow us to produce particles of mass up
to 540 GeV/$c^2$ whereas with the target at rest, only particles of mass up to 23 GeV/$c^2$
can be produced.
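The comparison in Eq. (10.45) follows directly from Eqs. (10.43) and (10.44). A minimal sketch, working in units where energies and $mc^2$ are both given in GeV (the function names are illustrative):

```python
import math


def cm_energy_fixed_target(beam_energy, rest_energy):
    """Total cm energy from Eq. (10.43): E_t^2 = 2(mc^2)^2 + 2 E mc^2."""
    return math.sqrt(2 * rest_energy**2 + 2 * beam_energy * rest_energy)


def cm_energy_colliding(beam_energy):
    """Total cm energy from Eq. (10.44): E_t = 2E for equal, opposite beams."""
    return 2 * beam_energy


BEAM_ENERGY = 270.0  # GeV, the beam energy quoted in the text
PROTON_MC2 = 1.0     # GeV, the rounded proton rest energy used in the text

fixed = cm_energy_fixed_target(BEAM_ENERGY, PROTON_MC2)
colliding = cm_energy_colliding(BEAM_ENERGY)

print(f"target at rest:  E_t = {fixed:.1f} GeV")
print(f"colliding beams: E_t = {colliding:.0f} GeV")
```

For a target at rest the cm energy grows only as the square root of the beam energy, whereas for colliding beams it grows linearly, which is why the advantage of colliding beams increases with energy.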
The development of 270 GeV proton and antiproton beams at CERN has
generated considerable interest. Here, antiprotons produced in the collision of
protons of energy 26 GeV, with a target, are gathered in a ring shaped
accumulator. It must be appreciated that producing antiproton beams of sufficient
intensity (technically described in terms of luminosity) is quite a difficult task
since every million collisions of the protons at this energy, produce only about
two antiprotons. In fact, the accumulation proceeds for a period of about 40 hours
before realizing a sufficient number of antiprotons. The antiprotons in the
accumulator are subjected to suitable electric fields so that most of them (about
60%) have an energy close to 3.5 GeV. After a beam of adequate luminosity has
been formed, the antiprotons are first speeded up in a proton synchrotron to an
energy of 26 GeV and then to an energy of about 270 GeV in a super proton
synchrotron. The same synchrotrons are used for speeding up protons as well,
and providing a beam of 270 GeV protons. Since the protons and the antiprotons
are oppositely charged, they move in opposite directions in the synchrotrons.
It was in the collision of 270 GeV proton and antiproton
beams at CERN that the W and Z bosons were produced recently, and identified
by their characteristic decays. These bosons, which are essential for the
propagation of unified weak-electromagnetic interaction, were found to have a
mass of $m_W \approx 81$ GeV and $m_Z \approx 95$ GeV. This has been an important step in the
confirmation of the theory of unified weak-electromagnetic interaction.
Finally, it may be noted that colliding-beam experiments have also been
performed with electron and positron beams, which have provided important
information about the properties of weak-electromagnetic interaction and of
elementary particles.

Scintillation Counters and Semiconductor Counters


Elementary particles are observed by the traces left by their electromagnetic
interaction with matter. In scintillation counters, charged particles passing
through the substance (known as a scintillator) of the counter, excite the atoms.
These atoms emit visible light on returning to their normal state. This light is
supplied to photomultipliers which then give information about the charged
particles.
In semiconductor counters, a charged particle passing through a junction
layer with an applied potential, gives rise to electrons and holes. The electric
pulse generated by them gives information about the number of charge carriers
and hence about the particle which created them.

Wilson Cloud Chamber and Diffusion Chamber


In the Wilson cloud chamber (1912), the track of a charged particle becomes
visible due to the condensation of the supersaturated vapour of a liquid, on the
ions formed in the track. The supersaturated condition is obtained by the sudden
expansion of a mixture consisting of a noncondensing gas (such as helium,
argon, nitrogen) and water vapour, ethyl alcohol, etc. The track produced by the
condensation of the vapour can be photographed from different angles to
reproduce a three dimensional picture. It may be noted that the sensitivity of
the state of supersaturation lasts only for a short period so that the instrument is
operated in cycles.
The diffusion chamber works along the same lines as the cloud chamber
except that the state of supersaturation is produced as a result of diffusion of
alcohol vapour, from the top (kept at about 10 °C) to the bottom (kept at
about –70 °C by using solid carbon dioxide). This gives a layer of supersaturated
vapour (a few centimeters thick) near the bottom. The diffusion chamber has
the advantage that it can work continuously.

Bubble Chamber and Spark Chamber


In the bubble chamber (Glaser, 1952), the supersaturated vapour is replaced by
a superheated liquid and the particle track is observed by the boiling of the
liquid along the path of the particle. The superheated condition is obtained by
first heating a liquid (such as hydrogen, xenon, propane, etc.) under pressure so
that it is near the boiling point at that pressure. A sudden lowering of pressure
produces a superheated state which survives for a short time. If high-energy,
ionizing particles pass through this liquid, bubbles are formed along the track,
on account of the electrons knocked out of the atoms by the charged particles,
and the track can be photographed. It is a very important feature of bubble
chambers that the working liquid itself can serve as a target for the charged
particles. A bubble chamber, like the cloud chamber, works in cycles, since the
superheated state lasts only for a short time.
In a spark chamber (Fig. 10.6), there is a series of plane, parallel, metal
electrodes, alternately grounded or connected to a source of periodic, short,
high-voltage pulses (10–15 kV, lasting for about $10^{-7}$ s). A high-energy, ionizing
particle passing through the chamber will produce a chain of sparks between
the electrodes which can be analysed (an associated counter triggers the voltage
pulse when the particle passes through).

Fig. 10.6 Schematic diagram of a spark chamber (labelled parts: counters, charged
particle, pulse generator).



Emulsion Chamber
Charged particles interact with photographic emulsions in the same way as
photons, and hence the emulsions can be used for recording the tracks of charged
particles. A photographic emulsion such as silver bromide (which has a density
of about 4 g/cc) is an efficient stopper of high-energy ionizing particles. Several
hundreds of layers of these emulsion sheets, each about 1/2 mm thick, may be
exposed to cosmic rays or to high energy particles from accelerators. After the
development of the sheets, the path of the particles can be traced from layer to
layer.

Geiger Counter
This is a very compact and sturdy instrument used for detecting energetic
particles. It consists of a metal tube with a wire along its axis. It is filled with a
suitable gas mixture at low pressure. The tube is insulated from the wire and a
high potential difference is maintained between the tube and the wire. When an
energetic particle enters the tube, it produces some ions in the gas. This causes
a small discharge current to flow between the wire and the tube. Recording of
these signals allows us to count the number of incident particles. Since the ions
and the electrons take a time of about $10^{-3}$ s to recombine, the counter can
record only a few hundred counts per second. Its main advantage is that it is
very inexpensive and easy to operate.

10.7 EXAMPLES
A few examples are discussed in this section.

Example 1
It is interesting to note that several particles in the SU (3) multiplet scheme,
such as the η-meson, were discovered only after the scheme predicted the
existence of these particles. Particularly noteworthy is the Ω– which is a member
of the baryon decuplet. Gell-Mann predicted it to have spin of 3/2, strangeness
–3, and a mass of about 1676 MeV. Since there is no baryonic system with
strangeness –3, and which is lighter than 1676 MeV, the Ω– is stable against
decay through strong interaction (which conserves strangeness). The Ω– can
decay via weak interaction (which does not conserve strangeness) but would
have a lifetime of about $10^{-10}$ s characteristic of weak decays. A particle with
precisely these characteristics, with a mass of 1675 MeV, was observed in 1964,
through its weak decay
$$\Omega^- \to \Xi^- + \pi^0 \qquad (10.46)$$
which does not conserve strangeness.

The Ω– is made of three strange quarks (sss). The three quarks are in the
ground state with spatial angular momentum equal to zero and in the totally
symmetric spin 3/2 state. This would violate Fermi statistics. In order to get
out of this difficulty, an additional property called colour was assigned to the
quarks. The three quarks in the baryons are supposed to be of different colours
so that the exchange statistics is not applicable.

Example 2
The β-decay in Eq. (10.21) is to be regarded as a spontaneous emission of $e^-$ and
$\bar{\nu}_e$ and not an escape of an electron bound in the nucleus. This is indicated by
the fact that a bound electron would have a momentum $p \sim \hbar/r_0$ (which follows from
the uncertainty principle) and a kinetic energy
$$KE \sim \left(m^2c^4 + \frac{c^2\hbar^2}{r_0^2}\right)^{1/2} - mc^2 \qquad (10.47)$$
which for $r_0 \sim 10^{-15}$ m comes out to be approximately 200 MeV. Such a highly
energetic electron would quickly escape in about $10^{-23}$ s, and the slow rate of
the β-decay cannot be explained.
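
The estimate in Eq. (10.47) can be checked numerically. The following is a minimal sketch, assuming the standard values $\hbar c \approx 197.3$ MeV fm and $m_e c^2 \approx 0.511$ MeV (neither value is quoted in the text); the function name is purely illustrative.

```python
import math

HBAR_C_MEV_FM = 197.327    # hbar*c in MeV*fm (assumed standard value)
ELECTRON_MC2_MEV = 0.511   # electron rest energy in MeV (assumed standard value)


def confinement_kinetic_energy_mev(r0_fm, mc2_mev=ELECTRON_MC2_MEV):
    """Kinetic energy of Eq. (10.47) for a particle confined within r0 (in fm)."""
    pc = HBAR_C_MEV_FM / r0_fm  # p*c ~ hbar*c / r0 from the uncertainty principle
    return math.sqrt(mc2_mev**2 + pc**2) - mc2_mev


if __name__ == "__main__":
    # r0 ~ 1e-15 m = 1 fm, the nuclear size used in the text
    print(confinement_kinetic_energy_mev(1.0))
```

For $r_0 = 1$ fm the result is just under 200 MeV, in line with the figure quoted above.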

Example 3
The parity of $\pi^-$ is determined from the reaction
$$\pi^- + d \to N + N \qquad (10.48)$$
The low-energy $\pi^-$ is captured in a Bohr orbit (with a radius of about $2 \times 10^{-13}$ m)
in the deuterium. The reaction in Eq. (10.48) proceeds with l = 0
(nuclear forces become effective only at a short range of about $10^{-15}$ m) so that
the initial total parity for the reaction is that of the intrinsic parity of the pion.
Now if the orbital angular momentum of the NN state had l = 0, the spin of
the NN system would be 1 (equal to the spin of the deuteron, by the conservation
of the total angular momentum). This implies that the spins of the two neutrons
would have to be parallel, and hence, this state is forbidden by Fermi statistics.
Therefore, the orbital angular momentum of the NN state has l = 1. This means
that the parity of the final state is –1 and therefore the intrinsic parity of
the π-meson is –1 (parity being conserved in strong interaction).

Example 4
The behaviour of neutral K-mesons provides a very interesting application of
the superposition principle. Neutral K-mesons have the decay modes
$$K^0 \to \pi^+ + \pi^-,\ \pi^0 + \pi^0$$
$$\bar{K}^0 \to \pi^+ + \pi^-,\ \pi^0 + \pi^0 \qquad (10.49)$$
with a lifetime of about $10^{-10}$ s, and
$$K^0 \to \pi^+ + \pi^- + \pi^0,\ 3\pi^0$$
$$\bar{K}^0 \to \pi^+ + \pi^- + \pi^0,\ 3\pi^0 \qquad (10.50)$$
with a lifetime of about $10^{-7}$ s. Now, weak interaction does not conserve
strangeness so that the states with well-defined energy (and also lifetime) are
not expected to be states with well-defined strangeness.
Considering the possibility of CP being conserved in weak interaction, one
may define
$$CP\,|K^0\rangle = |\bar{K}^0\rangle$$
$$CP\,|\bar{K}^0\rangle = |K^0\rangle \qquad (10.51)$$
where P is the parity operator and C is the charge-conjugation operator (which
changes a particle into an antiparticle with opposite charge, strangeness, etc.).
If the Hamiltonian commutes with CP, the eigenstates of energy can also be
eigenstates of CP. Such eigenstates are the superpositions
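$$|K_1^0\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle + |\bar{K}^0\rangle\right), \quad CP = 1 \qquad (10.52)$$
$$|K_2^0\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle - |\bar{K}^0\rangle\right), \quad CP = -1 \qquad (10.53)$$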
and it is these states which are expected to have well-defined masses and
lifetimes.
In the 2π-meson decays given in Eq. (10.49), the two π-mesons have l = 0 and
hence are in a CP = 1 state. Hence while the $|K_1^0\rangle$ can decay into two π-mesons,
the decay of $|K_2^0\rangle$ into two π-mesons is forbidden. Thus $|K_1^0\rangle$ has a short
lifetime of about $0.86 \times 10^{-10}$ s whereas $|K_2^0\rangle$ has a much longer lifetime of
about $5.4 \times 10^{-8}$ s (the masses of $|K_1^0\rangle$ and $|K_2^0\rangle$ are also slightly different).
This leads to some interesting observations for the $|K^0\rangle$ created in a process
such as Eq. (10.26). The $|K^0\rangle$ may be regarded as
$$|K^0\rangle = \frac{1}{\sqrt{2}}\left(|K_1^0\rangle + |K_2^0\rangle\right) \qquad (10.54)$$
of which $|K_1^0\rangle$ decays quickly into two π-mesons [giving the decays in
Eq. (10.49)] so that after a few lifetimes of $|K_1^0\rangle$, only the $|K_2^0\rangle$ is left. This
long-lived component decays slowly into other modes such as the ones in
Eq. (10.50). It may be observed that though only $|K^0\rangle$ existed at first, the
component $|K_2^0\rangle$ which remains at the end contains equal mixtures of $|K^0\rangle$
and $|\bar{K}^0\rangle$.
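
How quickly the short-lived component disappears can be illustrated with a small numerical sketch. It assumes simple exponential decay of each component with the lifetimes quoted above, an initial beam of pure $K^0$ (so each component starts with probability 1/2, from Eq. (10.54)), and no regeneration effects; the function name is hypothetical.

```python
import math

TAU_K1 = 0.86e-10  # lifetime of the short-lived component in s (from the text)
TAU_K2 = 5.4e-8    # lifetime of the long-lived component in s (from the text)


def component_probabilities(t):
    """Surviving probabilities of the K1 and K2 components at proper time t (s).

    The beam is assumed to start as pure K0, i.e. equal weights 1/2 for the
    K1 and K2 components, each decaying exponentially with its own lifetime.
    """
    p_k1 = 0.5 * math.exp(-t / TAU_K1)
    p_k2 = 0.5 * math.exp(-t / TAU_K2)
    return p_k1, p_k2


if __name__ == "__main__":
    for n in (0, 1, 5, 10):
        p1, p2 = component_probabilities(n * TAU_K1)
        print(f"t = {n:2d} tau_K1: K1 = {p1:.3e}, K2 = {p2:.3e}")
```

After about ten short lifetimes the K1 part is negligible while the K2 part is almost undiminished, which is the situation described in the paragraph above.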

Finally, it is noted that the long-lived component $|K_2^0\rangle$ was found
(Christenson, Cronin, Fitch and Turlay, 1964) to decay to a small extent into
two π-mesons, which means that CP is not conserved and that $|K_1^0\rangle$ and $|K_2^0\rangle$
are only approximate eigenstates of the Hamiltonian.

PROBLEMS

1. Using the operators in Eq. (4.127), except the factor of $\hbar$, for the isospin
operators for I = 1/2, and the representation in Eq. (4.126) for the P and
N states respectively, obtain the isospin values of the states for the proton-
neutron system in Eqs. (10.8) to (10.11).
2. If a positron of energy E annihilates an electron at rest, giving out two
photons, $e^+ + e^- \to \gamma + \gamma$, obtain the angular distribution of energy.
3. Determine the maximum energy of $\pi^-$ when the $K^+$ at rest decays into
$\pi^+ \pi^+ \pi^-$.
4. Why is the lifetime of $\Sigma^0$ so much smaller than that of $\Sigma^+$ or $\Sigma^-$?
5. What is the range of weak-interaction forces originating from the exchange
of W-bosons of mass about 80 GeV?
6. Obtain from dimensional arguments the lifetime of β-decay, given that it
is proportional to $\left(m_W^2/\alpha\right)^2$ and that the remaining factor is a function of
$\hbar$, c and $m_e$.
7. For a particle moving with high velocity, the force-acceleration relation
in Eq. (10.41) changes to $evB = mv^2/[r\,(1 - v^2/c^2)^{1/2}]$. What is the field
needed to keep a proton of 5 GeV energy in a circular orbit of radius
20 m?
11
General Relativity and Cosmology

Structure of the Chapter


11.1 Frames of reference
11.2 Curved space-time
11.3 Schwarzschild metric
11.4 Kinematics of the universe
11.5 Dynamics of the universe
11.6 The early universe
11.7 Examples
Problems

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. H. Patil, Elements of Modern Physics,
https://doi.org/10.1007/978-3-030-70143-7_11

Our discussion so far has been about modern ideas in the domain of high velocity
(special theory of relativity), and small distances (quantum theory and its
applications). There have been important developments in our understanding
of large-distance, large-body phenomena as well. These refer primarily to the
general theory of relativity as applied to large stars, galaxies and cosmology,
i.e. the science of the universe as a whole.
In cosmology, the events taking place in distant objects such as galaxies,
which not only may be moving with high velocity with respect to us but may
also be accelerating, have to be interpreted. It is to be expected that observation
and interpretation are simpler in the galaxy where the events are taking place.
Therefore, a relativistic theory, which can relate observations in frames which
are in arbitrary motion with respect to each other, is needed. Einstein’s theory of
general relativity provides the framework for relating observations of space-
time events in arbitrary frames. Indeed, the theory accomplishes more than
that. It incorporates gravitational effects as well. This is based on the observation
that the motion of a particle in a gravitational field, with a given initial position
and velocity, is independent of its mass. This implies, as will be discussed later,
that the effect of gravity can be locally simulated by an accelerating frame but
without gravity, so that gravitational effects can be described by the theory of
general relativity. Furthermore, in the discussion of cosmological dynamics, it
is mainly the long-range gravitational forces which are important. Therefore
general relativity is also an appropriate theory for the analysis of the development
of the universe.
Here, an elementary and brief consideration of the main ideas of the theory
of general relativity is presented and the ideas are applied to discuss some
predictions of the dynamical properties of the universe. It is quite appropriate
to end with the description of general relativity, a theory which is grand in its
concepts and structure and awe-inspiring in its predictions. An exposition of
modern ideas in physics cannot be said to be complete without a discussion of
general relativity, and anything to follow would only be an anticlimax.

11.1 FRAMES OF REFERENCE


An event is a space-time occurrence. To specify an event, we need a frame of
reference, which consists of three spatial coordinate axes and a time coordinate.
It is known that the equations of motion in special relativity or Newton’s theory
take the simplest form in inertial frames. In an inertial frame, a particle on
which no external forces act moves with constant velocity. A description of
motion in frames which accelerate with respect to inertial frames is complicated
by the need to introduce additional forces known as inertial (or pseudo) forces.
For example, an observer in an accelerating or decelerating train experiences
forces in addition to the gravitational or electromagnetic forces. Thus, with
respect to accelerations, there appears to be an absolute or a preferred frame of
reference in which the inertial forces are zero.

Mach’s Principle
The inertial frames may be determined by considering the inertial forces on the
surface of the earth. For example, the rate of rotation of the earth with respect to
the inertial frames may be estimated by the measurement of the centrifugal and
Coriolis forces on the earth. The rate of rotation thus obtained is found to be
approximately the same as the rate of the earth’s rotation with respect to distant
matter, e.g. the distant galaxies. This leads to an important result that the average
motion of distant galaxies with respect to the inertial frame is zero.
According to Mach (1872), the above result is not an accident. It suggests
that the inertial frame is not an absolute frame but is related to the distribution
of matter in the universe. In fact, Mach asserted that the concept of inertia (and
the inertial frame) can be given meaning only in terms of background stars and
galaxies. This is known as Mach’s principle. It implies that if there were no
matter in the universe except for a given body, there would be no inertia or
inertial forces and it is meaningless to ask whether it is accelerating with respect
to an inertial frame.
A theory of the universe which incorporates Mach’s principle cannot include
an inertial frame without reference to the distribution of matter in the universe.

Principle of Equivalence
An important input of Einstein’s theory of general relativity is the observation
that the effect of gravity can be simulated locally by a noninertial frame.
Consider the motion of an object of inertial mass $m_I$, in a region where the
gravitational acceleration g is approximately constant. If there is a
nongravitational force F acting on it, its motion is given by
$$m_I \mathbf{a} = m_g \mathbf{g} + \mathbf{F} \qquad (11.1)$$
where the gravitational mass $m_g$, in principle, may be different from the inertial
mass $m_I$. Alternatively, consider a frame of reference without a gravitational
force, but moving with acceleration – g. If the acceleration of mass $m_I$ in this
frame is a, the corresponding acceleration in the inertial frame is a – g so that
the equation of motion is
$$m_I (\mathbf{a} - \mathbf{g}) = \mathbf{F} \qquad (11.2)$$
or
$$m_I \mathbf{a} = m_I \mathbf{g} + \mathbf{F} \qquad (11.3)$$
Experimentally it is observed to a high accuracy (about 1 part in $10^{11}$) that
$m_I = m_g$, so that Eqs. (11.1) and (11.3) describe the same motion. This result
was generalized by Einstein into what is known as the principle of equivalence:

The physical laws are locally the same in an inertial frame with gravitational
acceleration g and a noninertial frame with acceleration – g but no gravity.
It is important to note that the equivalence is local since the gravitational
field tends to zero at large distances whereas the inertial forces in general do
not vanish at infinity.
The principle of equivalence gives special importance to freely-falling
frames. In these frames, the local effect of gravity is cancelled by the inertial
forces so that they form local inertial frames and the considerations of special
relativity suffice to describe the physical observations in them. They allow us
to deduce two interesting results without going into additional details of the
general theory.
1. Bending of light in a gravitational field: Consider a beam of light in a
gravitational field. Observed from a freely-falling frame which is locally
inertial, the beam travels in a straight line. However, the frame with the
gravitational field moves with acceleration – g with respect to the freely-
falling frame so that in this frame the beam appears to bend in the direction
of g. This is a remarkable result since just the finiteness of the velocity
of light implies that light interacts with gravitational fields. The bending
of light in a gravitational field is observed in the deflection of light from
the stars, moving past the sun, seen during total solar eclipses. However,
since g is not constant along the path of the beam, quantitative calculations
are rather complicated. It may be noted that the predictions of general
theory agree with the observations within experimental accuracy.
2. Gravitational shift of spectral lines: Consider a photon emitted at t = 0,
and moving in a direction opposite to the gravitational acceleration g.
Observed from a freely-falling frame which is at rest with respect to the
source at t = 0, the frequency of the photon is $\nu_0$. If the photon meets the
frame at t, it will have travelled a distance h = ct during this time and the
frame at that instant will be moving with a velocity of gt. Hence the
frequency $\nu$ of the photon observed at a height of h, from a frame at rest
with respect to the source (and moving with velocity – gt with respect to
the freely-falling frame), is given by
$$\frac{\nu_0 - \nu}{\nu_0} \approx \frac{gt}{c} = \frac{gh}{c^2} \qquad (11.4)$$
This result, which can also be deduced from energy conservation (see
Example 2), has been verified by Pound and Rebka (1960) using the Mössbauer
effect. They found that a photon falling through a height of 22.6 m shows a
shift of $\Delta\lambda/\lambda_0 \approx -gh/c^2 \approx -(2.57 \pm 0.26) \times 10^{-15}$ compared with the predicted
value of $-2.46 \times 10^{-15}$.
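
The predicted Pound-Rebka shift follows directly from Eq. (11.4). A minimal sketch, assuming $g \approx 9.81$ m/s² and $c \approx 2.998 \times 10^8$ m/s (standard values not quoted in the text):

```python
G_ACCEL = 9.81         # gravitational acceleration at the earth's surface, m/s^2 (assumed)
C_LIGHT = 2.998e8      # speed of light, m/s (assumed)


def fractional_frequency_shift(height_m, g=G_ACCEL, c=C_LIGHT):
    """Magnitude of the fractional shift g*h/c^2 from Eq. (11.4)."""
    return g * height_m / c**2


if __name__ == "__main__":
    # Height used by Pound and Rebka, as quoted in the text
    print(fractional_frequency_shift(22.6))
```

The magnitude comes out close to $2.46 \times 10^{-15}$, matching the predicted value quoted above.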
The theory of general relativity is based on the principle of equivalence and
includes the gravitational effects in terms of the geometry of the space. The
theory makes no distinction between gravitational and inertial effects, both being
related to the energy and momentum distribution of matter and hence
incorporates Mach’s principle to some extent (there are some difficulties in the
interpretation of boundary conditions in the case of an infinite universe).

11.2 CURVED SPACE-TIME


In the theory of general relativity, the inertial and gravitational effects are
described by the geometry of the space. These effects modify the properties of the space,
e.g. the surface may become curved, which affects the dynamics of objects.
Some of the ideas involved can be illustrated by the following simple example.

Metric Tensor of the Space


Events in inertial frames F and F′ are related by the Lorentz transformations
$$x' = \frac{x - vt}{(1 - v^2/c^2)^{1/2}}, \quad y' = y, \quad z' = z, \quad t' = \frac{t - vx/c^2}{(1 - v^2/c^2)^{1/2}} \qquad (11.5)$$
Trajectories may be characterized by an invariant variable τ, the proper
time, which gives the invariant interval between two events as
$$(\Delta\tau)^2 = (\Delta t)^2 - \frac{1}{c^2} (\Delta \mathbf{r})^2 \qquad (11.6)$$
Now, a gravitational field with acceleration a in the x-direction can be
introduced by going over to a frame with acceleration – a. The required
transformations are given approximately by
$$\mathbf{r}' = \mathbf{r} + \frac{1}{2} \mathbf{a} t^2, \quad t' = t \qquad (11.7)$$
in terms of which the infinitesimal proper time interval is
$$(\Delta\tau)^2 = \left(1 - \frac{a^2 t'^2}{c^2}\right) (\Delta t')^2 - \frac{1}{c^2} (\Delta \mathbf{r}')^2 + \frac{2t'}{c^2}\, \mathbf{a} \cdot \Delta \mathbf{r}'\, \Delta t' \qquad (11.8)$$
Thus, a gravitational (or inertial) effect is described by a more complicated
measure or metric of ∆τ in terms of ∆t′, ∆x′, ∆y′, ∆z′.
The expression of the measure, for a frame in arbitrary motion, can be
stated in a compact form by writing
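$$(\Delta\tau)^2 = g_{\mu\nu}\, \Delta x^\mu\, \Delta x^\nu \qquad (11.9)$$
where $g_{\mu\nu}$ is called the metric tensor of the space. From here onwards, the
convention that repeated indices are summed over is used, in this case, µ, ν = 0,
1, 2, 3 with index 0 standing for the time coordinate and 1, 2, 3 for the three
space coordinates. For the inertial frames one has
$$g^0_{00} = 1, \quad g^0_{11} = g^0_{22} = g^0_{33} = -\frac{1}{c^2}, \quad g^0_{\mu\nu} = 0 \ \text{for} \ \mu \neq \nu \qquad (11.10)$$
corresponding to flat space. If the coordinates in the second frame are given by
the functional relation $x^\mu = x^\mu(x')$,
$$\Delta x^\mu = \frac{\partial x^\mu}{\partial x'^\nu}\, \Delta x'^\nu \qquad (11.11)$$
and
$$(\Delta\tau)^2 = g_{\mu\nu}\, \Delta x'^\mu\, \Delta x'^\nu \qquad (11.12)$$
where
$$g_{\mu\nu} = g^0_{\alpha\beta}\, \frac{\partial x^\alpha}{\partial x'^\mu} \frac{\partial x^\beta}{\partial x'^\nu} \qquad (11.13)$$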
It should, however, be noted that the metric tensor here is given in terms of
only four independent functions $x^\mu(x')$, though an arbitrary but symmetric $g_{\mu\nu}(x')$
consists of 10 independent functions. The restricted $g_{\mu\nu}$ given in Eq. (11.13) can
describe the local gravitational field. Einstein then postulated that the general
gravitational field is described by an arbitrary metric $g_{\mu\nu}(x)$ with 10 independent
functions. Having defined the means of describing the gravitational and inertial
effects, one must now provide (i) the framework for the determination of the
metric $g_{\mu\nu}(x)$ and (ii) the dynamical equations for a given metric.

Einstein’s Field Equations


The metric tensor $g_{\mu\nu}(x)$ is related to the distribution of matter, through the field
equations
$$R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R = -\frac{8\pi G}{c^4}\, T_{\mu\nu} \qquad (11.14)$$
where $T_{\mu\nu}$ is the energy-momentum tensor which acts as the source and
$G = 6.67 \times 10^{-11}$ N·m²/kg² is the usual gravitational constant. The quantities $R_{\mu\nu}$ and
R are related to the Riemann-Christoffel curvature tensor $R_{\mu\alpha\nu\beta}$ which in turn is
related to the metric $g_{\mu\nu}$. The discussion of these relations is beyond the scope
of this book (the interested reader may refer to Ref. 11). We only note that these
field equations determine the metric for a given energy-momentum distribution.
In the limiting case of a weak gravitational field φ, one has $T_{00} = \rho c^2$,
$g_{00} = 1 + 2\phi/c^2$, $R_{00} = -\frac{1}{2} R = -\frac{1}{c^2} \nabla^2 \phi$, with which Eq. (11.14), to first order in φ,
reduces to the Poisson equation for Newton’s gravitational potential, $\nabla^2 \phi = 4\pi G \rho$.

Geodesics
The path followed by a particle in the presence of gravitational forces is
determined by the geometry of the equivalent metric space. To obtain the relation
between the path and the metric, it is noted that in 3-dimensional Euclidean
space, a particle moves in a straight line, i.e. it chooses a path which corresponds
to the shortest distance between any two points. However, the ordinary length
is not invariant even in special relativity, and is not a suitable quantity for
determining the trajectory in the general case.
The proper time τ, defined in Eqs. (11.6) and (11.12), is invariant under
general transformations, and the allowed paths may be regarded as corresponding
to extrema of τ. It turns out that τ is actually a maximum for the allowed trajectory
in the case of special relativity [because of the negative sign of spatial terms in
Eq. (11.10)]. For example, the proper time corresponding to points (0, 0) and
(t, 0), with the metric $g^0_{\mu\nu}$ of Eq. (11.10), is
$$\tau_1 = t \qquad (11.15)$$
whereas that corresponding to two segments (0, 0) → (t1, x1) and (t1, x1) → (t, 0)
is
$$\tau_2 = \left(t_1^2 - \frac{1}{c^2} x_1^2\right)^{1/2} + \left((t - t_1)^2 - \frac{1}{c^2} x_1^2\right)^{1/2} \qquad (11.16)$$
which for $x_1 \neq 0$ is less than t (note that the intermediate point has to be taken so
that each proper-time interval is real). Thus, the straight line between (0, 0) and
(t, 0) corresponds to the maximum proper time. The general situation may be
covered by the requirement that the total proper time
$$\tau_{AB} = \int_A^B d\tau = \int_A^B \left(g_{\mu\nu} \frac{dx^\mu}{ds} \frac{dx^\nu}{ds}\right)^{1/2} ds \qquad (11.17)$$
is an extremum, where Eq. (11.12) has been used, and s is an arbitrary parameter.
The extremum paths are called geodesics.
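
The statement that the straight path maximizes the proper time can be checked numerically from Eqs. (11.15) and (11.16). This is a small sketch in units with c = 1 by default; the function name and the sample numbers are illustrative only.

```python
import math


def proper_time_two_segments(t, t1, x1, c=1.0):
    """Proper time of Eq. (11.16) for the path (0,0) -> (t1,x1) -> (t,0).

    Raises ValueError if either segment is not time-like, i.e. if the
    corresponding proper-time interval would not be real.
    """
    first = t1**2 - (x1 / c) ** 2
    second = (t - t1) ** 2 - (x1 / c) ** 2
    if first < 0 or second < 0:
        raise ValueError("intermediate point must give real proper-time intervals")
    return math.sqrt(first) + math.sqrt(second)


if __name__ == "__main__":
    t = 10.0
    for x1 in (0.0, 1.0, 3.0, 4.0):
        print(x1, proper_time_two_segments(t, 5.0, x1))
```

With $x_1 = 0$ the function returns t itself, as in Eq. (11.15); any detour with $x_1 \neq 0$ gives a smaller value.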
The integral condition can be converted into a set of differential equations
in the following way. To be specific, let s be the proper time of the geodesic,
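with values $\tau_A$ and $\tau_B$ at the end points. Now consider a set of curves $x^\mu(\tau, \varepsilon)$
which connect points A and B, such that
$$x^\mu(\tau, \varepsilon) = x^\mu(\tau, 0) + \varepsilon h^\mu(\tau) \qquad (11.18)$$
where $x^\mu(\tau, 0)$ is the geodesic needed, $h^\mu(\tau_A) = h^\mu(\tau_B) = 0$, and ε is a small
parameter. Then the proper time is
$$\tau(\varepsilon) = \int_{\tau_A}^{\tau_B} \left(g_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau}\right)^{1/2} d\tau \equiv \int_{\tau_A}^{\tau_B} f\!\left(\frac{dx^\mu}{d\tau}, x^\mu\right) d\tau \qquad (11.19)$$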

To first order in ε, this expression reduces to
$$\tau(\varepsilon) = \tau(0) + \varepsilon \int_{\tau_A}^{\tau_B} \left[\frac{\partial f}{\partial (dx^\mu/d\tau)} \frac{dh^\mu}{d\tau} + \frac{\partial f}{\partial x^\mu} h^\mu\right] d\tau \qquad (11.20)$$
which on integration by parts (and using $h^\mu(\tau_A) = h^\mu(\tau_B) = 0$) gives
$$\tau(\varepsilon) = \tau(0) - \varepsilon \int_{\tau_A}^{\tau_B} \left[\frac{d}{d\tau} \frac{\partial f}{\partial (dx^\mu/d\tau)} - \frac{\partial f}{\partial x^\mu}\right] h^\mu\, d\tau \qquad (11.21)$$
Since τ(ε) is an extremum at ε = 0, the second term should vanish for all
$h^\mu(\tau)$ which implies that
$$\frac{d}{d\tau} \frac{\partial f}{\partial (dx^\mu/d\tau)} - \frac{\partial f}{\partial x^\mu} = 0 \qquad (11.22)$$
Equation (11.22) gives four equations, for µ = 0, 1, 2, 3. One of these can
be replaced by using the relation
$$g_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = 1 \qquad (11.23)$$
which follows from Eq. (11.12). For the special case of $\partial f/\partial x^\mu = 0$, Eq. (11.22)
simplifies to
$$\frac{\partial f}{\partial (dx^\mu/d\tau)} = \text{constant}. \qquad (11.24)$$
These differential equations, i.e. Eq. (11.22) or Eq. (11.24), with Eq. (11.23),
determine the geodesics and hence the dynamics of a particle.
As an illustration, the metric tensor in Eq. (11.8) is considered for the case
of acceleration a in the x-direction,
$$g_{00} = 1 - \frac{a^2 t^2}{c^2}, \quad g_{11} = -\frac{1}{c^2}, \quad g_{01} = g_{10} = \frac{at}{c^2} \qquad (11.25)$$
which is equivalent to a space with gravitational field characterized by
acceleration a. The corresponding function f in 2-dimensions is
$$f = \left[\left(1 - \frac{a^2 t^2}{c^2}\right) \left(\frac{dt}{d\tau}\right)^2 - \frac{1}{c^2} \left(\frac{dx}{d\tau}\right)^2 + \frac{2at}{c^2} \frac{dt}{d\tau} \frac{dx}{d\tau}\right]^{1/2} \qquad (11.26)$$
Using Eq. (11.23) and Eq. (11.24) for µ = 1,
$$-\frac{dx}{d\tau} + at \frac{dt}{d\tau} = A, \qquad \left(\frac{dt}{d\tau}\right)^2 = 1 + \frac{A^2}{c^2} \qquad (11.27)$$
which lead to $dx/dt = at + \text{constant}$, and therefore reproduce the usual equations
for motion in a constant gravitational field.
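
The intermediate steps of this illustration may be sketched as follows. The sketch assumes that f = 1 along the path, which is Eq. (11.23), and that the constant of Eq. (11.24) has been multiplied by $c^2$ to define A; this rescaling is an assumption about the normalization, not something stated in the text.

```latex
\begin{align*}
\frac{\partial f}{\partial (dx/d\tau)}
  &= \frac{1}{f}\left[-\frac{1}{c^2}\frac{dx}{d\tau} + \frac{at}{c^2}\frac{dt}{d\tau}\right]
   = \text{constant}
  \;\Rightarrow\; -\frac{dx}{d\tau} + at\,\frac{dt}{d\tau} = A, \\
1 &= \left(1 - \frac{a^2t^2}{c^2}\right)\left(\frac{dt}{d\tau}\right)^2
     - \frac{1}{c^2}\left(\frac{dx}{d\tau}\right)^2
     + \frac{2at}{c^2}\frac{dt}{d\tau}\frac{dx}{d\tau} \\
  &= \left(\frac{dt}{d\tau}\right)^2
     - \frac{1}{c^2}\left(\frac{dx}{d\tau} - at\,\frac{dt}{d\tau}\right)^2
   = \left(\frac{dt}{d\tau}\right)^2 - \frac{A^2}{c^2}, \\
\frac{dx}{dt} &= \frac{dx/d\tau}{dt/d\tau}
   = at - \frac{A}{(1 + A^2/c^2)^{1/2}} .
\end{align*}
```

The last line is of the form at + constant, with the constant fixed by the initial velocity.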

Curvature of Space
It is clear from the above example that the geodesics in an arbitrary metric
space are, in general, curved lines. The space is then said to be curved (in contrast
to the flat spaces of inertial frames). A measure of the curvature of the space is
given by what is known as the curvature tensor. Only the simple case of
2-dimensions is considered here. The curvature of a surface in two dimensions,
as given by Rindler, is the following: Draw the geodesics starting from a point
P, and consider the circle formed by the locus of points which are at a distance
a from P, along the geodesics. If the circumference of the circle is of length l,
the curvature is given by
$$K = \frac{3}{\pi} \lim_{a \to 0} \left(\frac{2\pi a - l}{a^3}\right) \qquad (11.28)$$
In the simple case of a spherical surface, the distance ∆l between two
neighbouring points (see Fig. 11.1) is given by
$$(\Delta l)^2 = \frac{(\Delta r)^2}{1 - r^2/R^2} + r^2 (\Delta\phi)^2 \qquad (11.29)$$
where R is the radius of the sphere, and r is the distance from the axis.

Fig. 11.1 Cross-section of a sphere (labels: point P, geodesic distance a, distance r from the axis, radius R, increment ∆r, arc element $\Delta r/(1 - r^2/R^2)^{1/2}$).

The distance along the geodesic from P is given by
$$a = \int_0^{r_0} \frac{dr}{(1 - r^2/R^2)^{1/2}} = R \sin^{-1}(r_0/R) \qquad (11.30)$$
while the length of the circumference of the circle is
$$l = 2\pi r_0 = 2\pi R \sin(a/R) \qquad (11.31)$$
Hence, the curvature is
$$K = \frac{3}{\pi} \lim_{a \to 0} \frac{2\pi a - 2\pi R \sin(a/R)}{a^3} = \frac{1}{R^2} \qquad (11.32)$$
For a flat surface (R → ∞), the curvature is zero. In some cases, such as at
a saddle point, it can be negative.
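
The limit in Eq. (11.32) can be illustrated numerically by evaluating the expression of Eq. (11.28) for progressively smaller geodesic radii a on a sphere. This is a minimal sketch; the function name and the sample radius are illustrative.

```python
import math


def sphere_curvature_estimate(a, radius):
    """Finite-a value of (3/pi) * (2*pi*a - l) / a^3 on a sphere of given radius.

    The circumference l = 2*pi*R*sin(a/R) is taken from Eq. (11.31).
    """
    circumference = 2.0 * math.pi * radius * math.sin(a / radius)
    return (3.0 / math.pi) * (2.0 * math.pi * a - circumference) / a**3


if __name__ == "__main__":
    R = 2.0
    for a in (1.0, 0.1, 0.01):
        print(a, sphere_curvature_estimate(a, R), 1.0 / R**2)
```

As a decreases, the estimate approaches $1/R^2$, the curvature of the sphere.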
An interesting point which emerges from our example is that a is multi-
valued (corresponding to going around the sphere an arbitrary number of times),
and $r_0 = R \sin(a/R)$ is bounded by the value R. This is an elementary example of
a finite universe.

11.3 SCHWARZSCHILD METRIC


As the first application of curved spaces, we analyse the space-time near a
point mass M. This would simulate the situation in the neighbourhood of a
massive object. In the limit of mass M → 0, the flat space is described by the
element
$$(\Delta\tau)^2 = (\Delta t)^2 - \frac{1}{c^2} \left[(\Delta r)^2 + r^2 (\Delta\theta)^2 + r^2 \sin^2\theta\, (\Delta\phi)^2\right] \qquad (11.33)$$
expressed in terms of spherical coordinates. With mass M ≠ 0, the distance
from the origin is no longer given by r, though the surface area of the sphere is
still $4\pi r^2$ and isotropy is maintained. The solutions of the field equations,
Eq. (11.14), were obtained by Schwarzschild (1916) and give the metric
$$(\Delta\tau)^2 = e(r) (\Delta t)^2 - \frac{1}{c^2} \left[f(r) (\Delta r)^2 + r^2 (\Delta\theta)^2 + r^2 \sin^2\theta\, (\Delta\phi)^2\right] \qquad (11.34)$$
with
$$e(r) = \frac{1}{f(r)} = 1 - \frac{2GM}{c^2 r} \qquad (11.35)$$
This is known as the Schwarzschild metric. It may be observed that for
r → ∞, the Schwarzschild metric reduces to the metric of the flat space as it
should.
The interpretation of the different variables in Eq. (11.34) should be carefully
noted. The variable t is the coordinate which, in the absence of any gravitational
potential, would represent the time variable. The proper time interval ∆τ, on the
other hand, corresponds to the rate at which local clocks are running. Similarly,
the interpretation of r is that the distance measurements for ∆t = 0 give a value
$[f(r) (\Delta r)^2 + r^2 (\Delta\theta)^2 + r^2 \sin^2\theta\, (\Delta\phi)^2]^{1/2}$.
The Schwarzschild metric can be used to explain several important
observations. A few of the applications are discussed here.

Rate of Clocks
Consider an atomic clock in the presence of a gravitational field due to mass M.
The time interval it shows is
$$\Delta\tau_0 = \left(1 - \frac{2GM}{c^2 r}\right)^{1/2} \Delta t \qquad (11.36)$$
where ∆t is the displacement in the t-coordinate. The corresponding interval
shown by a clock at infinity is
$$\Delta\tau = \Delta t \qquad (11.37)$$
Therefore,
$$\Delta\tau_0 = \left(1 - \frac{2GM}{c^2 r}\right)^{1/2} \Delta\tau \qquad (11.38)$$
which means that a clock in a gravitational field runs at a slower rate. This is
usually stated as implying that atoms (and human beings) in a gravitational
field live longer. Since frequency is inversely proportional to the time interval,
this also gives the result that the frequency of radiation coming out of the field
is
$$\nu = \left(1 - \frac{2GM}{c^2 r}\right)^{1/2} \nu_0 \qquad (11.39)$$
and in the weak field limit, i.e. for small $GM/c^2 r$,
$$\frac{\nu_0 - \nu}{\nu_0} \approx \frac{GM}{c^2 r} \qquad (11.40)$$
This is essentially the gravitational red shift deduced earlier, in Eq. (11.4),
from the equivalence principle.
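
As a rough numerical illustration of Eqs. (11.38) and (11.40), the sketch below evaluates the clock-rate factor and the weak-field red shift at the surface of the sun. The solar mass and radius are the rounded values used later in this section; the value of c is an assumed standard value, and the function names are hypothetical.

```python
import math

G_NEWTON = 6.67e-11   # gravitational constant, N m^2/kg^2 (from the text)
C_LIGHT = 2.998e8     # speed of light, m/s (assumed standard value)


def clock_rate_factor(mass_kg, r_m):
    """Factor (1 - 2GM/c^2 r)^(1/2) relating a local clock to one at infinity."""
    return math.sqrt(1.0 - 2.0 * G_NEWTON * mass_kg / (C_LIGHT**2 * r_m))


def weak_field_redshift(mass_kg, r_m):
    """Fractional red shift GM/(c^2 r) of Eq. (11.40)."""
    return G_NEWTON * mass_kg / (C_LIGHT**2 * r_m)


if __name__ == "__main__":
    m_sun, r_sun = 2e30, 7e8   # rounded solar mass (kg) and radius (m)
    print(1.0 - clock_rate_factor(m_sun, r_sun))
    print(weak_field_redshift(m_sun, r_sun))
```

Both numbers are about two parts in a million, which shows how well the weak-field approximation holds for the sun.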

Shift of the Perihelion


The equations of motion in a gravitational field of mass M are obtained from
Eq. (11.24) for $x^\mu = t, \phi$, and Eq. (11.23) with f given by (∆θ = 0, θ = π/2)
$$f = \left[e(r) \left(\frac{dt}{d\tau}\right)^2 - \frac{f(r)}{c^2} \left(\frac{dr}{d\tau}\right)^2 - \frac{r^2}{c^2} \left(\frac{d\phi}{d\tau}\right)^2\right]^{1/2} \qquad (11.41)$$
They lead to
$$r^2 \frac{d\phi}{d\tau} = A$$
$$e(r) \frac{dt}{d\tau} = B \qquad (11.42)$$
$$e(r) \left(\frac{dt}{d\tau}\right)^2 - \frac{f(r)}{c^2} \left(\frac{dr}{d\tau}\right)^2 - \frac{r^2}{c^2} \left(\frac{d\phi}{d\tau}\right)^2 = 1$$
Using the first two equations and Eq. (11.35), the last equation simplifies
to:
$$\left[\frac{1}{r^2} \left(\frac{dr}{d\phi}\right)^2 + \left(1 - \frac{2GM}{c^2 r}\right)\right] \frac{A^2}{2r^2} - \frac{GM}{r} = \frac{1}{2} c^2 (B^2 - 1) \qquad (11.43)$$
where the constant on the right-hand side may be identified with the energy.
Compared to the corresponding Newton’s equation, this equation has the extra
term $-GMA^2/(c^2 r^3)$. With this additional effective interaction, the planetary orbits
are no longer closed ellipses but may be simulated by slowly rotating ellipses.
This gives rise to a shift of the perihelion of planets (perihelion is the point on
the orbit nearest to the sun). Compared with the leading potential $-GM/r$, the
additional term is small,
$$\frac{GMA^2/c^2 r^3}{GM/r} \approx \frac{r^2}{c^2} \left(\frac{d\phi}{d\tau}\right)^2 \qquad (11.44)$$
which for Mercury is about $10^{-7}$ to $10^{-8}$. The quantitative effect of this term can be
calculated from perturbation theory. It gives rise to a rotation of the perihelion
of Mercury by about 43″ per century, in good agreement with the observed
rotation.
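
The figure of about 43″ per century can be reproduced with a short calculation. The sketch below uses the standard general-relativistic result for the perihelion advance per orbit, $6\pi GM/[c^2 a(1 - e^2)]$, which is not derived in the text, together with approximate orbital data for Mercury (semi-major axis, eccentricity and period) that are assumptions taken from standard tables.

```python
import math

G_NEWTON = 6.67e-11      # gravitational constant, N m^2/kg^2 (from the text)
C_LIGHT = 2.998e8        # speed of light, m/s (assumed standard value)
M_SUN = 2e30             # rounded solar mass, kg

# Approximate orbital data for Mercury (assumed, from standard tables)
SEMI_MAJOR_AXIS_M = 5.79e10
ECCENTRICITY = 0.2056
PERIOD_DAYS = 87.97

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def perihelion_shift_per_orbit_rad(mass_kg, a_m, ecc):
    """Standard GR perihelion advance per orbit, 6*pi*G*M / (c^2 * a * (1 - e^2))."""
    return 6.0 * math.pi * G_NEWTON * mass_kg / (C_LIGHT**2 * a_m * (1.0 - ecc**2))


def perihelion_shift_per_century_arcsec():
    """Accumulated advance of Mercury's perihelion over 100 Julian years."""
    orbits_per_century = 36525.0 / PERIOD_DAYS
    per_orbit = perihelion_shift_per_orbit_rad(M_SUN, SEMI_MAJOR_AXIS_M, ECCENTRICITY)
    return per_orbit * orbits_per_century * ARCSEC_PER_RADIAN


if __name__ == "__main__":
    print(perihelion_shift_per_century_arcsec())
```

With these inputs the result is close to 43″ per century, consistent with the value quoted above.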

Bending of Light
The equations for the trajectory of light in the Schwarzschild metric can be
deduced from Eqs. (11.42). It should however be noted that since ∆τ = 0 for the
propagation of light, the constants A and B are infinite, though A/B is finite.
Dividing the last two equations by the first equation in Eqs. (11.42),
$$\frac{e(r)}{r^2} \frac{dt}{d\phi} = \frac{B}{A}$$
$$e(r) \left(\frac{dt}{d\phi}\right)^2 - \frac{f(r)}{c^2} \left(\frac{dr}{d\phi}\right)^2 - \frac{r^2}{c^2} = 0 \qquad (11.45)$$
which in terms of x = 1/r lead to
$$\left(\frac{dx}{d\phi}\right)^2 + x^2 - \frac{2GM}{c^2} x^3 = D \qquad (11.46)$$
where $D = (cB/A)^2$. For $GM/c^2 = 0$, $x = D^{1/2} \cos\phi$. Treating the gravitational term
as a small perturbation, we find to first order in $GM/c^2$
$$x = D^{1/2} \cos\phi + \frac{GMD}{c^2} (2 - \cos^2\phi) \qquad (11.47)$$
The bending of light is then deduced by obtaining the angles φ for r → ∞ or
x → 0, which to first order in $GM/c^2$ satisfy the relation
$$\cos\phi \approx -\frac{2GMD^{1/2}}{c^2} \qquad (11.48)$$
or
$$\phi_\pm = \pm\left(\frac{\pi}{2} + \frac{2GMD^{1/2}}{c^2}\right) \qquad (11.49)$$
Noting that $D^{1/2} \approx 1/r_{\min}$, the deflection of light comes out to be
$$\Delta\phi \equiv (\phi_+ - \phi_- - \pi) \approx \frac{4GM}{c^2 r_{\min}} \qquad (11.50)$$
For light just grazing the sun ($M \approx 2 \times 10^{30}$ kg, $r_{\min} \approx 7 \times 10^8$ m), this has a
value of ∆φ ≈ 1.75″. The bending of starlight grazing the sun during an eclipse
(so as to minimize glare), is found to be about 1.89″ which agrees well with
Einstein’s prediction. The corresponding prediction of Newton’s theory (particles
with velocity c accelerated by gravity) is half of Einstein’s prediction, i.e. about
0.875″.
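
The numerical value of the deflection follows from Eq. (11.50) with the solar data quoted above. A minimal sketch, assuming $c \approx 2.998 \times 10^8$ m/s (the function name is illustrative):

```python
import math

G_NEWTON = 6.67e-11   # gravitational constant, N m^2/kg^2 (from the text)
C_LIGHT = 2.998e8     # speed of light, m/s (assumed standard value)

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def light_deflection_arcsec(mass_kg, r_min_m):
    """Deflection 4GM/(c^2 r_min) of Eq. (11.50), converted to seconds of arc."""
    return 4.0 * G_NEWTON * mass_kg / (C_LIGHT**2 * r_min_m) * ARCSEC_PER_RADIAN


if __name__ == "__main__":
    einstein = light_deflection_arcsec(2e30, 7e8)   # light grazing the sun
    print(einstein, einstein / 2.0)                 # Einstein and Newtonian values
```

The first number is close to 1.75″ and half of it is the Newtonian estimate mentioned in the text.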

Black Holes
Going back to the Schwarzschild metric in Eqs. (11.34), (11.35), it is seen that
the metric is singular at $r = r_s$,
$$r_s = \frac{2GM}{c^2} \qquad (11.51)$$
where $r_s$ is known as the Schwarzschild radius. In most cases, the Schwarzschild
radius is quite small; it is about 3 km for the sun. However, the metric is applicable
only outside the mass distribution. Inside the distribution, the metric is modified
to a nonsingular form. Therefore, in cases where $r_s$ is much smaller than the
radius of the mass distribution, the singularity is not relevant. On the other
hand, in the case of very massive stars ($m > 3m_{\text{sun}}$), it is expected that the stars
may ultimately collapse to a size smaller than their Schwarzschild radius. These
stars, whose radius is smaller than their Schwarzschild radius, are known as
black holes and the metric singularity gives rise to some unusual behaviour for
them.
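
The size of the Schwarzschild radius for ordinary bodies can be seen from a direct evaluation of Eq. (11.51). The sketch below assumes $c \approx 2.998 \times 10^8$ m/s and uses rounded masses for the sun and the earth; the earth's mass is an assumed value not quoted in the text.

```python
G_NEWTON = 6.67e-11   # gravitational constant, N m^2/kg^2 (from the text)
C_LIGHT = 2.998e8     # speed of light, m/s (assumed standard value)


def schwarzschild_radius_m(mass_kg):
    """Schwarzschild radius r_s = 2GM/c^2 of Eq. (11.51), in metres."""
    return 2.0 * G_NEWTON * mass_kg / C_LIGHT**2


if __name__ == "__main__":
    print(schwarzschild_radius_m(2e30))    # sun: about 3 km
    print(schwarzschild_radius_m(6e24))    # earth (assumed mass): under 1 cm
```

Both radii are far smaller than the actual sizes of the bodies, which is why the singularity is not relevant for them.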
For radial motion (dφ/dτ = 0), one obtains from Eqs. (11.42),
$$B^2 - \frac{1}{c^2}\left(\frac{dr}{d\tau}\right)^2 = 1 - \frac{2GM}{c^2 r} \tag{11.52}$$
If the particle starts at large r with zero velocity, $B^2 = 1$ and the equation of
motion reduces to
$$\frac{dr}{d\tau} = -\left(\frac{2GM}{r}\right)^{1/2} \tag{11.53}$$
The solution to this equation is
$$r(\tau) = (2GM)^{1/3}\left(d - \frac{3}{2}\tau\right)^{2/3} \tag{11.54}$$
where d is a constant. It shows that r, as a function of the proper time, does not
show any singular behaviour at $r = r_s$, and therefore the fall through $r = r_s$ is
smooth.
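A small Python sketch can confirm that Eq. (11.54) solves Eq. (11.53) and that the proper time to reach $r_s$ is finite. The starting radius of ten Schwarzschild radii, the solar value of GM and the function names are assumptions made only for this illustration.

```python
import math

GM_SUN = 1.327e20   # G times the solar mass, m^3 s^-2 (rounded, assumed)
C = 2.998e8         # speed of light, m/s


def r_of_tau(tau, d, gm):
    """Radial position of Eq. (11.54): r = (2GM)^(1/3) (d - 3 tau / 2)^(2/3)."""
    return (2.0 * gm) ** (1.0 / 3.0) * (d - 1.5 * tau) ** (2.0 / 3.0)


def proper_time_between(r_start, r_end, gm):
    """Proper time to fall from r_start to r_end, from integrating Eq. (11.53)."""
    return (2.0 / 3.0) * (r_start ** 1.5 - r_end ** 1.5) / math.sqrt(2.0 * gm)


r_s = 2.0 * GM_SUN / C ** 2
r0 = 10.0 * r_s                              # illustrative starting radius
d = r0 ** 1.5 / math.sqrt(2.0 * GM_SUN)      # fixes r(0) = r0
tau_fall = proper_time_between(r0, r_s, GM_SUN)

# Compare a central-difference derivative with the right-hand side of Eq. (11.53)
tau_mid, h = 0.5 * tau_fall, 1e-9
numeric = (r_of_tau(tau_mid + h, d, GM_SUN) - r_of_tau(tau_mid - h, d, GM_SUN)) / (2 * h)
analytic = -math.sqrt(2.0 * GM_SUN / r_of_tau(tau_mid, d, GM_SUN))
print(f"proper time to reach r_s: {tau_fall:.3e} s")
print(f"dr/dtau numeric {numeric:.6e}, analytic {analytic:.6e}")
```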
The behaviour as seen by an observer outside the field, on the other hand, is
different. Since B = 1 for the case under consideration, it follows from
Eqs. (11.42) and (11.53) that the time interval dt, as seen by the observer, is
$$dt = \frac{d\tau}{1 - r_s/r} = \frac{r^{1/2}\,dr}{-(2GM)^{1/2}\,(1 - r_s/r)} \tag{11.55}$$
Clearly the time interval ∆t → ∞ as r → rs. Physically, this means that with
respect to an outside observer, a black hole is frozen at r = rs.
It is easy to show that no light can escape from a black hole. For a light
signal one has ∆τ = 0. Therefore, from the last equation in Eqs. (11.42),
$$dt = \frac{dr}{c\,|1 - r_s/r|} \tag{11.56}$$
which implies that the time taken by the signal to escape from the black hole is
infinite:
$$t_{12} = \int_{r_1 < r_s}^{r_2} \frac{dr}{c\,|1 - r_s/r|}, \qquad \lim_{r_2 \to r_s} t_{12} \to \infty \tag{11.57}$$

This explains the name ‘black hole’ given to the object. It should be
mentioned that this result does not take quantum effects into account. It has
been shown by Hawking that quantum effects do allow some radiation to come
out from the black holes so that, strictly speaking, a black hole is not a black
hole.
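The divergence in Eq. (11.57) is logarithmic, as the following Python sketch illustrates. It uses the closed form of the integral for $r_1 < r_2 < r_s$, which is worked out here and is not given in the text; the value of $r_s$ and the function name are illustrative assumptions.

```python
import math

C = 2.998e8   # speed of light, m/s


def escape_time(r1, r2, rs):
    """Coordinate time of Eq. (11.57) for light moving from r1 to r2 (r1 < r2 < rs).

    For r < rs the integrand is r / (c (rs - r)), which integrates to
    [rs ln((rs - r1)/(rs - r2)) - (r2 - r1)] / c.
    """
    if not 0.0 < r1 < r2 < rs:
        raise ValueError("need 0 < r1 < r2 < rs")
    return (rs * math.log((rs - r1) / (rs - r2)) - (r2 - r1)) / C


rs = 3.0e3            # roughly the Schwarzschild radius of the sun, m
r1 = 0.5 * rs
times = [escape_time(r1, rs * (1.0 - eps), rs) for eps in (1e-3, 1e-6, 1e-9)]
for eps, t in zip((1e-3, 1e-6, 1e-9), times):
    print(f"r2 = rs (1 - {eps:g}):  t12 = {t:.3e} s")
```

Each factor of 1000 in the closeness to $r_s$ adds about $(r_s/c)\ln 1000$ to the time, so the total grows without bound.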

11.4 KINEMATICS OF THE UNIVERSE


In this section, the metric space of the universe is considered.
It may be expected that the large-scale properties of the universe are not
affected by local variations in the distribution of matter. This assumption, stated
in the form of a principle, greatly simplifies the analysis. The cosmological
principle states that the universe must appear the same to all observers who are
at rest with respect to matter in the neighbourhood. The universe must also be
isotropic. This means that clustering of galaxies is a local irregularity and the
cosmological scales are much larger than the sizes of the galaxies.
Observationally, the distribution of galaxies does appear to be homogeneous
and isotropic though there may be some small deviations. The concept of an
observer who is at rest with respect to local matter, known as a fundamental
observer, is very useful in the determination of the kinematics of the universe.
The history of each fundamental observer being the same, the time coordinate
can be linked with a local property, say the density, and may be taken to be the
proper time of the fundamental observer. It was shown by Robertson and Walker
that the metric of a universe satisfying the cosmological principle is of the
form
$$(\Delta\tau)^2 = (\Delta t)^2 - \frac{1}{c^2}\left[\frac{(\Delta r)^2}{1 - K r^2} + r^2(\Delta\theta)^2 + r^2\sin^2\theta\,(\Delta\phi)^2\right] \tag{11.58}$$
where K is the curvature of space. The spatial part of the metric is similar to the
metric of a spherical surface given in Eq. (11.29). The curvature of the space
may be written as
$$K = \frac{k}{R^2(t)} \tag{11.59}$$
where R(t) is called the cosmic scale factor, with k = 1 for positive curvature,
k = –1 for negative curvature and k = 0 for flat space. The metric may be written
in a more convenient form in terms of the dimensionless variable σ,
$$\sigma = r/R(t) \tag{11.60}$$
in terms of which the separation between fundamental observers does not change
with time. In terms of the co-moving coordinate σ, the metric is
$$(\Delta\tau)^2 = (\Delta t)^2 - \frac{R^2(t)}{c^2}\left[\frac{(\Delta\sigma)^2}{1 - k\sigma^2} + \sigma^2(\Delta\theta)^2 + \sigma^2\sin^2\theta\,(\Delta\phi)^2\right] \tag{11.61}$$
which is known as the Robertson-Walker metric.

Distances
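Distances in the Robertson-Walker metric, with ∆θ = ∆φ = 0, are given by
$$D(t) = R(t)\int_0^{\sigma}\frac{d\sigma'}{(1 - k\sigma'^2)^{1/2}} \tag{11.62}$$
$$= \begin{cases} R(t)\,\sin^{-1}\sigma & \text{for } k = 1 \\ R(t)\,\sigma & \text{for } k = 0 \\ R(t)\,\sinh^{-1}\sigma & \text{for } k = -1 \end{cases} \tag{11.63}$$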
As might have been expected, the distances between fundamental observers
are proportional to the scale factor. In the k = 1 case (positive curvature), the
distance D(t) is ambiguous to the extent of 2πnR(t), corresponding to going
around the closed universe n times. The surface area of the sphere is
given by $4\pi r^2$ or
$$S = 4\pi R^2(t)\,\sin^2[D(t)/R(t)] \tag{11.64}$$
which is bounded. This is analogous to the length of the circumference of a
circle in the two dimensional case [see Eqs. (11.31) and (11.30)].
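The closed forms in Eq. (11.63) can be compared with a direct numerical evaluation of the integral in Eq. (11.62). The Python sketch below does this with a midpoint rule; the function names, the step count and the sample values of σ and R are assumptions chosen for illustration.

```python
import math


def proper_distance(sigma, k, scale):
    """Closed-form distance of Eq. (11.63) for curvature index k = 1, 0, -1."""
    if k == 1:
        return scale * math.asin(sigma)
    if k == 0:
        return scale * sigma
    if k == -1:
        return scale * math.asinh(sigma)
    raise ValueError("k must be 1, 0 or -1")


def proper_distance_numeric(sigma, k, scale, steps=20000):
    """Midpoint-rule evaluation of the integral in Eq. (11.62)."""
    h = sigma / steps
    total = sum(1.0 / math.sqrt(1.0 - k * ((i + 0.5) * h) ** 2) for i in range(steps))
    return scale * total * h


def sphere_area(distance, scale):
    """Surface area of Eq. (11.64) for the closed (k = 1) case."""
    return 4.0 * math.pi * scale ** 2 * math.sin(distance / scale) ** 2


R, sigma = 1.0, 0.5   # illustrative values in arbitrary units
for k in (1, 0, -1):
    print(k, proper_distance(sigma, k, R), proper_distance_numeric(sigma, k, R))
```

For k = 1 the area of Eq. (11.64) reduces to $4\pi R^2\sigma^2 = 4\pi r^2$, as stated in the text.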

Velocities
The relative velocities of fundamental observers are obtained from Eq. (11.62),
$$v = \frac{dD(t)}{dt} = \frac{\dot R(t)}{R(t)}\,D(t) \tag{11.65}$$
This allows us to identify the constant of proportionality in Hubble’s law. It
is observed that distant galaxies appear to be moving away with speeds
proportional to their distances. This is described by Hubble’s law v = Hr, where
H is called Hubble’s constant. Comparison of this relation with Eq. (11.65)
leads to
$$H(t) = \frac{\dot R(t)}{R(t)} \tag{11.66}$$
Thus, Hubble’s constant, in general, is a function of time. At present it has
a value of about $1.8 \times 10^{-18}\ \mathrm{s^{-1}}$.
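For orientation, the quoted value of H can be converted into the units commonly used by astronomers. The Python sketch below assumes rounded conversion factors for the megaparsec and the year; the variable names are illustrative.

```python
H0 = 1.8e-18             # Hubble's constant quoted in the text, s^-1
KM_PER_MPC = 3.0857e19   # kilometres in one megaparsec (rounded, assumed)
SECONDS_PER_YEAR = 3.156e7

h0_km_s_mpc = H0 * KM_PER_MPC                  # recession speed per megaparsec
hubble_time_years = 1.0 / H0 / SECONDS_PER_YEAR


def recession_speed_km_s(distance_mpc):
    """Hubble's law v = H r for a distance given in megaparsecs."""
    return h0_km_s_mpc * distance_mpc


print(f"H0 = {h0_km_s_mpc:.1f} km/s per Mpc")
print(f"1/H0 = {hubble_time_years:.2e} years")
print(f"v(100 Mpc) = {recession_speed_km_s(100.0):.0f} km/s")
```

The inverse 1/H sets the time scale for the ages of the universe estimated in Sec. 11.5.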

Red Shifts
The Robertson-Walker metric provides the proper framework for the description
of cosmological red shifts. Since the proper time for the propagation of radiation
is zero, ∆τ = 0, the time interval for the propagation is given by
$$\Delta t = \frac{R(t)\,\Delta\sigma}{c\,(1 - k\sigma^2)^{1/2}} \tag{11.67}$$
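Consider now two crests emitted from a galaxy at times $t_e$ and $t_e + \Delta t_e$,
which are received by another galaxy at times $t_0$ and $t_0 + \Delta t_0$ respectively,
$$\int_{t_e}^{t_0}\frac{dt}{R(t)} = \frac{1}{c}\int_{\sigma_0}^{\sigma_e}\frac{d\sigma}{(1 - k\sigma^2)^{1/2}} \tag{11.68}$$
$$\int_{t_e + \Delta t_e}^{t_0 + \Delta t_0}\frac{dt}{R(t)} = \frac{1}{c}\int_{\sigma_0}^{\sigma_e}\frac{d\sigma}{(1 - k\sigma^2)^{1/2}} \tag{11.69}$$
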

From these two relations, it follows that
$$\int_{t_0}^{t_0 + \Delta t_0}\frac{dt}{R(t)} = \int_{t_e}^{t_e + \Delta t_e}\frac{dt}{R(t)} \tag{11.70}$$


For the short intervals under consideration,
$$\frac{\Delta t_0}{\Delta t_e} = \frac{R(t_0)}{R(t_e)} \tag{11.71}$$
and since wavelengths are proportional to the time intervals,
$$\frac{\lambda_0}{\lambda_e} = \frac{R(t_0)}{R(t_e)} \tag{11.72}$$
In the case of an expanding universe, R(t0) > R(te), which would explain the
observed red shifts. Expanding R(te) about t0, the red shift z is
$$z = \frac{\lambda_0}{\lambda_e} - 1 = \frac{R(t_0)}{R(t_0)\left[1 + H_0(t_e - t_0) - \frac{1}{2}q_0H_0^2(t_e - t_0)^2 + \dots\right]} - 1$$
$$= (t_0 - t_e)H_0 + \left(\frac{1}{2}q_0 + 1\right)H_0^2(t_0 - t_e)^2 + \dots \tag{11.73}$$
The parameter q0, known as the deceleration parameter, is important in
the determination of the nature of the universe. For example, a positive q0 implies
a slowing down of the expansion of the universe. It is possible to estimate the
value of (t0 – te) from the study of the apparent brightness (essentially the radiation
received) of galaxies, which together with a knowledge of the red shifts would
give an estimate of q0. Though the experimental uncertainties are too large at
present to yield a reliable value for q0, there are some indications that it is positive
(see Ref. 10).
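The quality of the expansion in Eq. (11.73) can be tested on a scale factor for which everything is known in closed form. The Python sketch below assumes, purely for illustration, $R(t) \propto t^{2/3}$, for which $H_0 = 2/(3t_0)$ and $q_0 = 1/2$; the function names and the emission time are hypothetical choices.

```python
def redshift_exact(t_e, t_0):
    """z = R(t_0)/R(t_e) - 1 from Eq. (11.72), assuming R(t) ~ t^(2/3)."""
    return (t_0 / t_e) ** (2.0 / 3.0) - 1.0


def redshift_series(t_e, t_0, hubble, q0):
    """Second-order expansion of Eq. (11.73)."""
    x = hubble * (t_0 - t_e)
    return x + (0.5 * q0 + 1.0) * x ** 2


t0 = 1.0                   # present time in arbitrary units
H0 = 2.0 / (3.0 * t0)      # Hubble's constant for R ~ t^(2/3)
Q0 = 0.5                   # deceleration parameter for R ~ t^(2/3)
te = 0.9 * t0              # illustrative emission time

z_exact = redshift_exact(te, t0)
z_series = redshift_series(te, t0, H0, Q0)
print(f"exact z = {z_exact:.5f}, series z = {z_series:.5f}")
```

For this look-back time the two values differ only in the third order of $H_0(t_0 - t_e)$, which is why nearby galaxies constrain $H_0$ well but $q_0$ poorly.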

11.5 DYNAMICS OF THE UNIVERSE


The evolution of the universe is determined by the time-dependence of R(t)
which is governed by the field equations. However, if the pressure, as a source
of gravity, can be neglected (this is certainly reasonable in the present era),
most of the general results can be obtained from Newtonian theory of gravity.
The discussion here will be based primarily on this simpler approach.
Consider a particle on the surface of a small sphere with its centre at the
origin. Then its equation of motion is
$$ma = -\frac{Gm\left(\frac{4\pi}{3}r^3\right)\rho(t)}{r^2} \tag{11.74}$$
which in terms of R(t) [see Eq. (11.60)] reads as
$$\ddot R(t) = -\frac{4\pi}{3}G\rho(t)R(t) \tag{11.75}$$
Originally, Einstein had considered an additional repulsive term $\frac{1}{3}\Lambda R(t)$
to counteract the attraction. Such a term is ignored in our simplified discussion.
Using $\rho(t) = \rho(t_0)R^3(t_0)/R^3(t)$, Eq. (11.75) is integrated after multiplying by
$2\dot R$ to get
$$\dot R^2 = \frac{8\pi}{3}G\rho(t_0)\frac{R^3(t_0)}{R(t)} - kc^2 \tag{11.76}$$
Here, the constant of integration $-kc^2$ is a measure of the total energy of
the particle, and is related to the curvature index k (k = ± 1, 0) by the solutions to
the field equations. The relation suggests that the universe is closed for k = 1,
i.e. $\dot R(t)$ becomes zero for sufficiently large R(t) and changes its sign, but open
for k = – 1 or 0, i.e. $\dot R(t) \neq 0$. This can be compared with what happens to a body
thrown up from the surface of the earth. If the initial velocity vin is less than the
escape velocity vesc (Etot < 0), the body will reach a maximum height and return
to the earth, corresponding to the closed universe. If vin > vesc (Etot > 0), the
body will go on forever, corresponding to the open universe. When vin = vesc
(Etot = 0), the body just manages to escape from the earth.
In all the three cases of Eq. (11.76), $\dot R$ is positive now, and was larger at
earlier times. Hence the universe described by Eq. (11.76) had a big-bang origin.
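Dividing Eq. (11.76) by $R^2$ at the present time gives $kc^2/R^2(t_0) = \frac{8\pi}{3}G\rho(t_0) - H_0^2$, so the sign of k is fixed by comparing the present density with $3H_0^2/8\pi G$. The Python sketch below evaluates this; the constants are rounded, the densities are the two values used later in this section, and the function names are illustrative.

```python
import math

G = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2 (rounded, assumed)
H0 = 1.8e-18     # Hubble's constant quoted in the text, s^-1


def critical_density(hubble):
    """Density for which k = 0 in Eq. (11.76): 3 H^2 / (8 pi G)."""
    return 3.0 * hubble ** 2 / (8.0 * math.pi * G)


def curvature_index(density, hubble):
    """Sign of (8 pi G rho / 3 - H^2), which equals the sign of k.

    Exact equality (k = 0) is not meaningful for floating-point input;
    it is returned only if the two terms cancel exactly.
    """
    excess = 8.0 * math.pi * G * density / 3.0 - hubble ** 2
    return (excess > 0) - (excess < 0)


rho_crit = critical_density(H0)
print(f"critical density = {rho_crit:.2e} kg/m^3")
for rho in (1.2e-26, 3e-28):
    print(f"rho = {rho:.1e} kg/m^3  ->  k = {curvature_index(rho, H0)}")
```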
The solution for R(t) can be obtained by integrating Eq. (11.76) in the usual
way:
$$t = \frac{R_m}{c}\times\begin{cases} \sin^{-1}(x^{1/2}) - x^{1/2}(1 - x)^{1/2} & \text{for } k = 1, \\ \frac{2}{3}\,x^{3/2} & \text{for } k = 0, \\ x^{1/2}(1 + x)^{1/2} - \sinh^{-1}(x^{1/2}) & \text{for } k = -1, \end{cases} \tag{11.77}$$
$$R_m = \frac{8\pi G\rho(t_0)R^3(t_0)}{3c^2} \tag{11.78}$$
where x = R(t)/Rm. The three cases are considered separately.
(i) k = 0, the Einstein-de Sitter model: In this case the three dimensional
space is flat. This gives a permanently expanding universe (see Fig. 11.2)
with R(t) given by
$$R(t) = R_m\left(\frac{3ct}{2R_m}\right)^{2/3} \tag{11.79}$$
Since $\dot R = RH$, where H is Hubble’s constant, one gets from Eq. (11.79)
$t = 2/(3H)$. The age of the universe is then estimated to be
$$t_0 \approx 1.2 \times 10^{10}\ \text{years} \tag{11.80}$$
(ii) k = 1, oscillating universe: In this case the universe expands to a
maximum value of Rm at some time t1/2 and collapses back to the original
state (R → 0) at 2t1/2 (see Fig. 11.2). Using Eqs. (11.66) and (11.76),
R(t0) can be obtained in terms of ρ(t0). The value of Rm then is estimated
to be [$\rho(t_0) \approx 1.2 \times 10^{-26}$ kg/m³]
$$R_m = \frac{8\pi G\rho(t_0)R^3(t_0)}{3c^2} \approx 1.3 \times 10^{10}\ \text{parsecs} \tag{11.81}$$
where 1 parsec ≈ $3 \times 10^{13}$ km ≈ 3.26 light years, 1 light year being the
distance travelled by light in 1 year. The collapse period is
$$2t_{1/2} = \frac{\pi R_m}{c} \approx 1.2 \times 10^{11}\ \text{years} \tag{11.82}$$
Fig. 11.2 The scale factor R(t) of the universe as a function of time t, shown for k = –1, 0 and 1; the k = 1 curve rises to Rm at t1/2 and returns to zero at 2t1/2.
The present age of the universe is estimated from Eq. (11.77) to be
t0 ≈ 1010 years (11.83)
Since the universe is closed in this case, there are no problems of boundary
conditions at infinity, and Mach’s principle is incorporated in the theory.
(iii) k = – 1, ever-expanding universe: In this case, R(t) increases as $t^{2/3}$ for
small t but increases as t for large t. In this model, following the same
steps as for k = 1, the present age of the universe comes out to be somewhat
larger, [ρ(t0) ≈ 3 × 10–28 kg/m3]
t0 ≈ 1.8 × 1010 years (11.84)
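The three branches of Eq. (11.77) can be written as one function, which makes it easy to check two statements made above: all models behave like the k = 0 case at early times, and the k = 1 model reaches its maximum at $t_{1/2} = \pi R_m/2c$. The Python sketch below also evaluates the Einstein-de Sitter age $2/(3H)$ of Eq. (11.80). The function name, the sample value of $R_m$ and the rounded constants are assumptions.

```python
import math

C = 2.998e8               # speed of light, m/s
SECONDS_PER_YEAR = 3.156e7
H0 = 1.8e-18              # Hubble's constant quoted in the text, s^-1


def cosmic_time(x, k, r_m):
    """Time t at which R(t)/R_m = x on the expanding branch of Eq. (11.77)."""
    s = math.sqrt(x)
    if k == 1:
        if x > 1.0:
            raise ValueError("x cannot exceed 1 for k = 1")
        f = math.asin(s) - s * math.sqrt(1.0 - x)
    elif k == 0:
        f = (2.0 / 3.0) * x ** 1.5
    elif k == -1:
        f = s * math.sqrt(1.0 + x) - math.asinh(s)
    else:
        raise ValueError("k must be 1, 0 or -1")
    return r_m / C * f


R_M = 3.0e26   # illustrative value of R_m in metres
t_half = cosmic_time(1.0, 1, R_M)
eds_age_years = 2.0 / (3.0 * H0) / SECONDS_PER_YEAR

print(f"t_1/2 (k = 1) = {t_half / SECONDS_PER_YEAR:.2e} years")
print(f"Einstein-de Sitter age = {eds_age_years:.2e} years")
```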
Though the evidence from various sources, e.g. from the estimates of
deceleration parameter, is not definitive, it does favour a closed universe. This
means that not only did the universe start with a big bang, it will also collapse to
the original dense state. What it will do beyond that point is not very clear—it
may end at that point or start again giving an oscillating universe.

11.6 THE EARLY UNIVERSE


The big bang theories discussed above imply that the universe must have started
as a very dense, very hot object. This has certain interesting implications for the
present-day universe. Two specific results are discussed here.

Background Black-Body Radiation


At the beginning, the universe consisted of radiation and matter in equilibrium
at very high temperatures, confined to a small region. At these high temperatures,
matter must have been in an ionized state. Since matter in an ionized state has
much greater interaction with radiation than does matter in the atomic state, the radiation
had an equilibrium black-body spectrum characterized by temperature T. In this
condition, the spectrum of photons, i.e. the number of photons in the frequency range ν to
ν + dν, is deduced from the Planck expression [Eq. (2.12)],
$$dN = \frac{8\pi\nu^2 V\,d\nu}{c^3\,[\exp(h\nu/kT) - 1]} \tag{11.85}$$
As the universe expands, the temperature falls. It can be shown that the
photon spectrum maintains its black-body characteristics, but with a lower
temperature. As a result of the expansion, V changes to V′,
$$V' = \frac{R'^3}{R^3}\,V \tag{11.86}$$
Furthermore, the frequency gets red-shifted (see Eq. (11.72); the red-shift
is present even for reflection by a body moving away) and is given by
$$\nu' = \frac{R}{R'}\,\nu \tag{11.87}$$
Hence the spectrum is now given by
$$dN = \frac{8\pi\,(R'\nu'/R)^2\,(R^3V'/R'^3)\,(R'\,d\nu'/R)}{c^3\,[\exp(h\nu' R'/kRT) - 1]} = \frac{8\pi\nu'^2 V'\,d\nu'}{c^3\,[\exp(h\nu'/kT') - 1]} \tag{11.88}$$
where $T' = \frac{R}{R'}\,T$. Furthermore, the energy density is given by
$$\varepsilon'_{rad} = \frac{4}{c}\,\sigma T'^4 = \frac{4\sigma R^4T^4}{cR'^4} \tag{11.89}$$
where σ is the Stefan-Boltzmann constant, $\sigma = 5.67 \times 10^{-8}\ \mathrm{W\,m^{-2}\,K^{-4}}$. Such
radiation was observed by Penzias and Wilson (1965), with a characteristic
temperature T′ = 2.7 K. This provides strong support to the big bang theory.
The equivalent mass density is
$$\rho'_{rad} = \frac{4}{c^3}\,\sigma T'^4 \approx 4.5 \times 10^{-31}\ \mathrm{kg/m^3} \tag{11.90}$$


which is much smaller than the estimated galactic matter density of about
$3 \times 10^{-28}$ kg/m³. The present era is therefore known as the matter-dominated
era. It may be noted that unlike the radiation density, which is proportional to
$1/R'^4$, the matter density varies as $1/R'^3$,
$$\rho'_{mat} = \left(\frac{R}{R'}\right)^3\rho_{mat} \tag{11.91}$$
which implies that
$$\frac{\rho'_{mat}}{\rho'_{rad}} = \left(\frac{R'}{R}\right)\frac{\rho_{mat}}{\rho_{rad}} \tag{11.92}$$
Clearly at early times ρrad dominated over ρmat, and that period is known as
the radiation-dominated era. At the transition between the radiation-dominated and
matter-dominated eras, ρmat ≈ ρrad so that
$$\frac{R'}{R} = \frac{\rho'_{mat}}{\rho'_{rad}} \approx 670 \tag{11.93}$$
which is the present ratio of matter and radiation densities. The transition must
have taken place at
$$T = \frac{R'}{R}\,T' \approx 1810\ \mathrm{K} \tag{11.94}$$
At around this temperature (actually at a somewhat higher temperature),
hydrogen is in an ionized state, so that at transition, there was enough matter
density to produce black-body spectrum for the radiation.
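The numbers in Eqs. (11.90), (11.93) and (11.94) follow from one another, as the Python sketch below illustrates. The constants are rounded values, the matter density and background temperature are those quoted in the text, and the variable names are illustrative.

```python
SIGMA = 5.67e-8     # Stefan-Boltzmann constant, W m^-2 K^-4
C = 2.998e8         # speed of light, m/s
T_NOW = 2.7         # present background temperature, K
RHO_MAT = 3e-28     # present galactic matter density, kg/m^3


def radiation_mass_density(temperature):
    """Equivalent mass density 4 sigma T^4 / c^3 of Eq. (11.90)."""
    return 4.0 * SIGMA * temperature ** 4 / C ** 3


rho_rad = radiation_mass_density(T_NOW)
scale_ratio = RHO_MAT / rho_rad          # R'/R at the transition, Eq. (11.93)
t_transition = scale_ratio * T_NOW       # Eq. (11.94)

print(f"rho_rad = {rho_rad:.2e} kg/m^3")
print(f"R'/R = {scale_ratio:.0f}")
print(f"transition temperature = {t_transition:.0f} K")
```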
It can be shown that, at early times, the temperature was proportional
to $t^{-1/2}$. In general $T_{rad} \sim 1/R(t)$; in the radiation-dominated era, the solutions in Eq. (11.77) get
modified to $R(t) \sim t^{1/2}$, so that $T_{rad} \sim t^{-1/2}$. General relativity gives the precise
relation as
$$T_{rad} = \frac{1.5 \times 10^{10}}{t^{1/2}}\ \mathrm{K} \tag{11.95}$$
with t in seconds.
The temperatures at t ~ 1 s were high enough to have created electron-
positron pairs. At lower temperatures, T around 108 K, fusion reactions must
have been present. This is a possible explanation for the large amount of He
present, about 1 He atom for every 12 hydrogen atoms (the fusion reactions
going on in stars can account for only about 10% of this). It is only at later
stages that the condensation into galaxies took place, giving us essentially the
present universe.
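Equation (11.95) can be turned into a small calculator for the thermal history described above. The Python sketch below compares kT at t = 1 s with the electron-positron pair threshold $2m_ec^2 \approx 1.022$ MeV and finds the time at which the temperature has dropped to $10^8$ K. The constants are rounded values and the function names are illustrative.

```python
import math

K_BOLTZMANN_EV = 8.617e-5     # Boltzmann constant, eV/K
PAIR_THRESHOLD_MEV = 1.022    # 2 m_e c^2 in MeV


def radiation_temperature(t_seconds):
    """Temperature of Eq. (11.95) in kelvin, for t in seconds."""
    return 1.5e10 / math.sqrt(t_seconds)


def time_at_temperature(temperature):
    """Inverse of Eq. (11.95): time in seconds at which T_rad equals temperature."""
    return (1.5e10 / temperature) ** 2


kt_mev_at_1s = K_BOLTZMANN_EV * radiation_temperature(1.0) / 1e6
t_fusion = time_at_temperature(1e8)

print(f"kT at t = 1 s: {kt_mev_at_1s:.2f} MeV (pair threshold {PAIR_THRESHOLD_MEV} MeV)")
print(f"T = 1e8 K is reached at t = {t_fusion:.0f} s")
```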
Radio Source Counting


Radio telescopes can detect sources at enormous distances and hence can be
used to obtain information about the number of radio sources with apparent
brightness (related to the radiation energy received) l > l0. Now, for a
homogeneous distribution of sources, since $l = a/r^2$, the number of sources with
l > l0 is related to l0 by
$$N = b\,r_0^3 = b'/l_0^{3/2} \tag{11.96}$$
where $l_0 = a/r_0^2$, and a, b, b′ are constants. This implies that
$$\ln N = -\frac{3}{2}\ln l_0 + d \tag{11.97}$$
Observationally, the number of faint sources appears to be larger,
$$\ln N \sim -2.0\,\ln l_0 + d' \tag{11.98}$$
This is an interesting result. In the big-bang theory, it only means that since
the radiation from far away sources was emitted earlier, there must have been
(i) more numerous radio sources and/or (ii) brighter radio sources, at early times.
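The slope of −3/2 in Eq. (11.97) can be illustrated with a simple Monte Carlo sketch in Python. Identical sources are scattered uniformly inside a unit sphere in flat, static space, which is exactly the assumption that the observations contradict; the sample size, the seed, the two brightness thresholds and the function name are arbitrary choices made for this illustration.

```python
import math
import random


def count_slope(n_sources=200_000, seed=0, a=1.0):
    """Estimate d(ln N)/d(ln l0) for sources uniform in a unit sphere.

    Each source has apparent brightness l = a / r^2, and N(l0) counts
    the sources with l > l0.
    """
    rng = random.Random(seed)
    # r = u^(1/3) with u in (0, 1] gives a uniform density inside the unit sphere
    brightness = [a / (1.0 - rng.random()) ** (2.0 / 3.0) for _ in range(n_sources)]
    l_high, l_low = a / 0.5 ** 2, a / 0.8 ** 2    # thresholds for r0 = 0.5 and 0.8
    n_high = sum(1 for l in brightness if l > l_high)
    n_low = sum(1 for l in brightness if l > l_low)
    return math.log(n_high / n_low) / math.log(l_high / l_low)


slope = count_slope()
print(f"simulated slope = {slope:.3f} (Eq. 11.97 gives -1.5)")
```

A steeper observed slope, as in Eq. (11.98), therefore signals a departure from the uniform, unevolving population assumed here.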

Particle Physics and Cosmology


Finally, we note the important role that our knowledge of particle properties is
playing in our understanding of the universe. Two recent, though tentative,
developments are mentioned here.
The preponderance of matter over antimatter has been one of the puzzles in
cosmology. The solution to this puzzle may be in the difference in the properties
of baryons and antibaryons. The recent attempts at grand unification of strong,
weak and electromagnetic interactions (see Sec. 10.5), allow for differences in
the properties of matter and antimatter. Estimates based on these theories
are also in rough agreement with the observed ratio of $10^{-9}$ for the number of
baryons (essentially protons and neutrons) to the number of photons.
There has been a question about the missing mass in the universe (apart
from the mass of the galaxies). Some recent experiments appear to suggest that
neutrinos may have nonzero mass. If this is confirmed, the neutrinos of the
universe may contribute a substantial amount of mass to the galaxies and the
universe.
A striking characteristic of recent developments in modern physics is that
observations in such diverse fields as elementary particles, nuclear physics,
atoms and molecules, solid state physics and astrophysics are interrelated and
allow for a unified approach. In a sense, the distinctions between the different
domains are becoming blurred and ultimately there may be only the core of
basic laws of physics in terms of which all observations can be explained.

11.7 EXAMPLES
In this section, some examples related to the ideas discussed earlier are given.

Example 1
Olbers argued that in an infinite, static universe, every part of the sky must have
a brightness comparable to that of the sun.
In an infinite, static universe, every part of the sky will be covered by a star.
Let the visible part of one such star subtend a solid angle ∆Ω at the earth. Now,
the intensity of radiation received at the earth, due to this star at distance d, is
proportional to the exposed area $d^2\Delta\Omega$ of the star, and decreases as $1/d^2$. Therefore
it is proportional to $d^2\Delta\Omega\,(1/d^2) \sim \Delta\Omega$, which is independent of the distance of
the star. Therefore, an equivalent area ($d_{sun}^2\,\Delta\Omega$) of the star might as well be at
a distance equal to that of the sun. If we make the reasonable assumption that
the stars in general have approximately the same inherent brightness as that of
the sun, we would then expect every part of the sky to have the brightness of the
sun, day or night. Any attempt at an explanation in terms of absorption does not
succeed since at equilibrium the absorbing material must emit as much energy
as it absorbs.

Example 2
The red shift of radiation in the presence of a gravitational field, to the leading
order in the field, can be shown to follow from energy conservation.
Consider an atom of mass m2 which goes to a state of lower mass m1, with
the emission of a photon of frequency ν0. Then energy conservation implies
$$m_2c^2 = m_1c^2 + h\nu_0 \tag{11.99}$$
If the atom is placed in a gravitational potential φ, its initial energy is
$m_2c^2 + m_2\phi$ while the final energy is $m_1c^2 + m_1\phi$. Then the frequency ν of the
photon, which comes out of the potential, is given by
$$h\nu = (m_2c^2 + m_2\phi) - (m_1c^2 + m_1\phi) = (m_2c^2 - m_1c^2)\,(1 + \phi/c^2) \tag{11.100}$$
Using Eq. (11.99),
$$\nu = \nu_0\,(1 + \phi/c^2) \tag{11.101}$$
which agrees with the general expression in Eq. (11.39) to the leading order.
For emission from the sun, $\phi/c^2 \approx -2 \times 10^{-6}$.
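The value of $\phi/c^2$ quoted for the sun follows from the Newtonian surface potential $\phi = -GM/R$. A brief Python sketch, with rounded solar constants and illustrative names, is:

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded, assumed)
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg
R_SUN = 6.96e8     # solar radius, m


def fractional_shift(mass, radius):
    """phi/c^2 at the surface, with phi = -GM/R, as used in Eq. (11.101)."""
    return -G * mass / (radius * C ** 2)


shift_sun = fractional_shift(M_SUN, R_SUN)
print(f"phi/c^2 at the solar surface = {shift_sun:.2e}")
print(f"nu/nu0 = {1.0 + shift_sun:.8f}")
```

The negative sign means that the photon arrives with a lower frequency, i.e. red-shifted.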
Example 3
When two clocks accelerate with respect to each other, they show different
proper times. This provides a solution to the twin paradox.
The metric near the surface of the earth, in the local inertial frame (at rest
with respect to distant galaxies) is the Schwarzschild metric in Eq. (11.34).
Consider two clocks, 1 and 2 which go around with angular velocities ω1 and
ω2. If they are together at the beginning and again at the end, the proper times
shown by them are
$$\tau_i = \int_0^t\left(1 - \frac{2GM}{c^2r} - \frac{r^2}{c^2}\,\omega_i^2\right)^{1/2}dt = \left(1 - \frac{2GM}{c^2r} - \frac{r^2}{c^2}\,\omega_i^2\right)^{1/2}t \tag{11.102}$$
where t is the time coordinate. Therefore
$$\frac{\tau_1 - \tau_2}{\tau_1} \approx \frac{r^2(\omega_2^2 - \omega_1^2)}{2c^2} \tag{11.103}$$
This relation was verified by keeping clock 1 at rest on Earth, ω1 = 2π rad/day,
and taking clock 2 around the earth with velocity v, $\omega_2 = \omega_1 \pm v/r$. For
v ≈ 800 km/h,
$$\frac{\tau_1 - \tau_2}{\tau_1} \approx 1.42 \times 10^{-12}\ \text{for}\ \omega_2 = \omega_1 + \frac{v}{r}, \qquad \approx -0.87 \times 10^{-12}\ \text{for}\ \omega_2 = \omega_1 - \frac{v}{r} \tag{11.104}$$
It is important to note that a clock with greater acceleration shows smaller
time, which explains the longer lifetime observed for particles going around in
accelerators.
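The two numbers in Eq. (11.104) can be reproduced from Eq. (11.103). The Python sketch below assumes an equatorial flight at ground level, a rounded value for the radius of the earth and a day of 86400 s; the function name is illustrative.

```python
import math

C = 2.998e8                       # speed of light, m/s
R_EARTH = 6.378e6                 # equatorial radius of the earth, m (assumed)
OMEGA_1 = 2.0 * math.pi / 86400   # clock 1 at rest on the earth, rad/s
V_PLANE = 800.0 / 3.6             # 800 km/h in m/s


def fractional_lag(omega_1, omega_2, radius):
    """(tau_1 - tau_2)/tau_1 from Eq. (11.103)."""
    return radius ** 2 * (omega_2 ** 2 - omega_1 ** 2) / (2.0 * C ** 2)


eastward = fractional_lag(OMEGA_1, OMEGA_1 + V_PLANE / R_EARTH, R_EARTH)
westward = fractional_lag(OMEGA_1, OMEGA_1 - V_PLANE / R_EARTH, R_EARTH)
print(f"eastward: {eastward:.2e}, westward: {westward:.2e}")
```

The westward clock gains time relative to the ground clock because its angular velocity in the local inertial frame is smaller.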

Example 4
An interesting idea in cosmology is what is called the object horizon. This is
the value σoh of the farthest object which is visible to us. The signal reaching us,
from this object, must have been emitted at the beginning of the universe, i.e. at
t = 0. Since ∆τ = 0 for the propagation of light, one has [Eq. (11.61)]
$$\int_0^{t_0}\frac{dt}{R(t)} = \frac{1}{c}\int_0^{\sigma_{oh}}\frac{d\sigma}{(1 - k\sigma^2)^{1/2}} \tag{11.105}$$
For the special case of k = 1,
$$\sigma_{oh} = \sin\left[c\int_0^{t_0}\frac{dt}{R(t)}\right] \tag{11.106}$$
The distance of the horizon is [Eq. (11.62)]
$$d_{oh} = R(t_0)\int_0^{\sigma_{oh}}\frac{d\sigma}{(1 - k\sigma^2)^{1/2}} = R(t_0)\int_0^{t_0}\frac{c\,dt}{R(t)} \tag{11.107}$$
Using the solutions in Eq. (11.77) it has been estimated that
$$d_{oh} \approx \frac{\pi}{2}\,R(t_0)\ \text{for}\ k = 1, \qquad \approx 10^{10}\ \text{parsecs} \tag{11.108}$$

PROBLEMS
1. A photon is moving horizontally on the surface of the earth. What is the
height through which it falls in travelling 100 m?
2. Starting from the flat-space metric, obtain the metric for the frame which
rotates with angular velocity ω along the z-direction. Write down the
equations for geodesics in this frame. Exhibit the coriolis and centrifugal
forces in the nonrelativistic approximation.
3. For a particle going around an accelerator, show that the lifetime is given
by $\tau = \tau_0\,(1 - \omega^2r^2/c^2)^{-1/2}$, where ω is the angular velocity, and r is the
radius of the orbit. What is the expression for τ if ω is changing but
r remains a constant?
4. Curvature of a surface may be defined in terms of area also. Show that
curvature of a spherical surface is given by
$$K = \frac{12}{\pi}\lim_{a \to 0}\left(\frac{\pi a^2 - A}{a^4}\right)$$
where A is the area of the surface and a is the distance of any point on the
circumference of the circle from the centre of the surface.
5. What is the Schwarzschild radius of the earth?
6. Show that for circular motion in the Schwarzschild metric
$$\left(\frac{d\phi}{d\tau}\right)^2\left(1 - \frac{3GM}{rc^2}\right) = \frac{GM}{r^3}$$
In the nonrelativistic limit, this relation tends to the usual relation
$\omega^2 = GM/r^3$. (It is simpler to start with the geodesic equation for r.)
7. A photon may be bound in a closed orbit by the potential of a black hole.
Show that for a circular orbit
$$r = 3GM/c^2 \quad \text{and} \quad \frac{d\phi}{dt} = c/(3^{1/2}\,r).$$
8. Consider the Robertson-Walker metric with k = 0. If a signal is emitted
at t and received at t0,
(a) show that
$$\frac{D(t)}{R(t)} = \left(\frac{12c}{R_m}\right)^{1/3}\left(t_0^{1/3} - t^{1/3}\right)$$
(b) show that
$$v(t) = \frac{2}{3t}\,D(t)$$
(c) show that
$$D(t) = \frac{3}{2}\,t_0\,\frac{v}{(1 + v/2c)^3}.$$
9. In the steady state theory of the universe, the decrease in the density of
matter due to expansion of the universe is compensated by continuous
creation of matter. Using the continuity equation, show that the rate of creation
is given by
$$\frac{d\rho_c}{dt} = 3\rho_0H$$
Given that $\rho_0 \approx 3 \times 10^{-28}$ kg/m³, estimate the rate of creation in terms of
protons/m³/s. Argue that the steady state theory does not imply a sky with
a uniform brightness equal to that of the sun.
References
General Books on Modern Physics
1. Leighton R.B., Principles of Modern Physics, McGraw-Hill, New York,
1959.
2. Richtmyer F.K., E.H. Kennard and J.N. Cooper, Introduction to Modern
Physics, McGraw-Hill, New York, 1969.
3. French A.P., Principles of Modern Physics, John Wiley, London, 1958.
4. Weidner R.T. and R.L. Sells, Elementary Modern Physics, Allyn and
Bacon, Boston, 1980.
5. Sproull R.L. and W.A. Phillips, Modern Physics, John Wiley, New York,
1980.
6. Savelyev I.V., Physics, a General Course, vol. III, Mir, Moscow, 1981.
7. Beiser A., Perspectives of Modern Physics, McGraw-Hill, New York,
1973.

Special and General Relativity


8. Bergmann P.G., Introduction to the Theory of Relativity, Prentice-Hall,
Englewood Cliffs, 1942.
9. Rindler W., Essential Relativity, Van Nostrand Reinhold, New York, 1969.
10. Berry M., Principles of Cosmology and Gravitation, Cambridge
University Press, Cambridge, 1976.
11. Landau L. and E. Lifshitz, The Classical Theory of Fields, Addison-
Wesley, Reading, 1951.
12. Sard R.D., Relativistic Mechanics, W.A. Benjamin, New York, 1970.
13. Weinberg S., Gravitation and Cosmology, John Wiley, New York, 1972.

Quantum Mechanics
14. Pauling L. and E.B. Wilson, Introduction to Quantum Mechanics,
McGraw-Hill, New York, 1935.
15. Fermi E., Notes on Quantum Mechanics, University of Chicago Press,
Chicago, 1961.
16. Landau L. and E. Lifshitz, Quantum Mechanics–Nonrelativistic Theory,
Pergamon, London, 1958.
17. Schiff L.I., Quantum Mechanics, McGraw-Hill, New York, 1955.
18. Merzbacher E., Quantum Mechanics, John Wiley, New York, 1961.

Atomic and Molecular Physics


19. Peaslee D.C., Elements of Atomic Physics, Prentice-Hall, Englewood
Cliffs, 1955.
20. Cagnac B. and J.C. Pebay-Peyroula, Modern Atomic Physics: Fundamental
Principles, Macmillan, London, 1975.
21. Herzberg G., Spectra of Diatomic Molecules, D. Van Nostrand, Princeton,
1950.
22. Cagnac B. and J.C. Pebay-Peyroula, Modern Atomic Physics: Quantum
Theory and its Applications, Macmillan, London, 1975.
23. Dunford H.B., Elements of Diatomic Molecular Spectra, Addison-Wesley,
Reading, 1968.
24. Karplus M. and R.N. Porter, Atoms and Molecules, Benjamin, Boston,
1970.
25. Fano U. and L. Fano, Physics of Atoms and Molecules, University of
Chicago Press, Chicago, 1970.

Quantum Statistics and Solid State Physics


26. Mandl F., Statistical Physics, John Wiley, London, 1971.
27. Gopal E.S.R., Statistical Mechanics and Properties of Matter, Ellis
Horwood, Westergate, 1974.
28. Landau L. and E. Lifshitz, Statistical Physics, Pergamon, London, 1959.
29. Hart-Davis A., Solids, McGraw-Hill, London, 1975.
30. Rosenberg M., The Solid State, Clarendon Press, Oxford, 1975.
31. Hall H.E., Solid State Physics, John Wiley, London, 1974.
32. Rudden M.N. and J. Wilson, Elements of Solid State Physics, John Wiley,
Chichester, 1980.
33. Ashcroft N.W. and N.D. Mermin, Solid State Physics, Holt, Rinehart
and Winston, New York, 1976.
34. Kittel C., Introduction to Solid State Physics, John Wiley, New York,
1976.
35. Ali Omar M., Elementary Solid State Physics, Addison-Wesley, Reading,
1975.
Nuclear Physics
36. Elton L.R.B., Introductory Nuclear Theory, Interscience, New York, 1959.
37. Segre E., Nuclei and Particles, Benjamin, New York, 1965.
38. Enge H.A., Introduction to Nuclear Physics, Addison-Wesley, Reading,
1966.
39. Bethe H.A. and P. Morrison, Elementary Nuclear Theory, John Wiley,
New York, 1956.
40. Preston M.A., Physics of the Nucleus, Addison-Wesley, Reading, 1962.
41. Murray R.L., Nuclear Energy, Pergamon, New York, 1975.

Elementary Particles
42. Longo M.J., Fundamentals of Elementary Particles, McGraw-Hill, New
York, 1973.
43. Yang C.N., Elementary Particles, Princeton University Press, Princeton,
1962.
44. Livingston M.S., Particle Physics, McGraw-Hill, New York, 1968.
Answers to Problems
Chapter 1
1. Time period is $(1 + v/c)/(1 - \beta^2)^{1/2}$
5. 14%; 10 km 7. 0.9974c
9. $(M^2 - m^2)\,c^2/2M$; 224.6 MeV, 3.4 eV
11. 0.875c, 0.999994c 12. 287 km/s; 6556.7 Å

Chapter 2
1. 5800 K; for significant number of hydrogen atoms to be in the excited
states, kT ~ 10 eV
2. 7.13 × 103 J/s; 0.019 J/s 4. 19; 4 × 1017 m–2 s–1
5. $[2mc^2(h\nu - \varepsilon) + h^2\nu^2 - 2h\nu\,(2mc^2(h\nu - \varepsilon))^{1/2}\cos\phi]/2Mc^2$
5. θ = 180°; $2h^2\nu_0^2/mc^2$ 7. 0.1484 Å, 0.1 Å; θ = 54°
8. $[(m^2c^4 + h^2\nu^2)^{1/2} - mc^2]/h\nu$; $h\nu/2mc^2$; 10 keV

9. 1.2 Å; yes 10. n = 1, d = 0.5 Å


12. 0.7 MeV; 40 MeV 13. $n\hbar\omega$
14. 0.5 × 10–13 m; 25.3 MeV 16. 1/1836
s /(2 − s )
s  mgs 
17. 2 18. nω; g  − 1 2 2 
 2   n 
e Bn
19. (1 ± 1) for parallel and anti-parallel cases
2m
20. $n\hbar\omega$ 22. 1216 Å, 1026 Å
23. 6000 K 24. 14 K

Chapter 3
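1. $\frac{1}{\pi}\left(\frac{h}{2a}\right)^{1/2}\frac{\sin(ka/\hbar)}{k}$
2. $T = 4r/(1 + r^2)$, $r = (E + V_0)^{1/2}/E^{1/2}$, R = 1 – T
3. $\nu = hn/4ml^2 = v/2l$
4. $A = (\pi a^3)^{-1/2}$, $a = 4\pi\varepsilon_0\hbar^2/me^2$, $E = -\hbar^2/2ma^2$; $P = 13e^{-4}$
5. $a = (km)^{1/2}/2\hbar$, $A = (8/3)^{1/2}(2a)^{5/4}/\pi^{1/4}$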

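6. tan qa = α/q, cot qa = – α/q, $\alpha = (-2mE)^{1/2}/\hbar$, $q = [2m(V_0 + E)]^{1/2}/\hbar$
7. $\sigma(x) = l\left(\frac{1}{12} - \frac{1}{2\pi^2n^2}\right)^{1/2}$, $\sigma(p) = \pi n\hbar/l$
8. cot qa = – α/q 9. $\langle L_z\rangle = \langle p_z\rangle = \langle p_x\rangle = 0$, $\langle p_z^2\rangle = \hbar^2/3a^2$
10. $3\lambda/4\alpha^4$ 11. $\Delta E_0 = 0$, $\Delta E_1 = \pm\lambda l^2\,(16/9\pi^2)^2$
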

Chapter 4
1. 2E1, – E1 2. 13e–4, 0
3. E = E1/4 4. a1, 4e–2/a1
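5. $10^{-10}\,Z^2E_1$
7. $E_\mu = bE_e$, $r_\mu = r_e/b$, $b = m_\mu(m_p + m_e)/m_e(m_p + m_\mu)$
8. $\nu_1 = 2.462 \times 10^{15}\left(1 + \frac{11}{48}\alpha^2\right)\ \mathrm{s^{-1}}$,
$\nu_2 = 2.462 \times 10^{15}\left(1 + \frac{5}{16}\alpha^2\right)\ \mathrm{s^{-1}}$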
9. 7 lines
10. S0, S1, P0, P1, P1, P2; P1 → S1, P1 → S0,
P0 → S1, P2 → S1, P1 → S1, P1 → S0

Chapter 5
3. 1, 9
6. 2S1/2 , 1S0, 2D3/2, 3F2, 4F3/2, 7S3, 6S5/2, 5D4, 4F9/2, 3F4, 2S1/2, 1S0
7. CLS 2 = 0.6 eV 8. 2.1 eV
9. 25.6 kV, 3.8 kV; 0.57 Å; 0.48 Å, 3.3 Å
10. 40; 11.1 keV 11. 4.13 × 10–15
12. 6.8 keV 13. 3.8 eV
14. 470 N/m, 1.27 Å 15. 0.059

Chapter 6
1. $\Delta E = \pm\frac{e\hbar B}{2m}$, 0; ± 0.00160 Å, 0; ± 0.208 Å, 0
2. $\Delta\nu = \frac{eB}{4\pi m}\left(\frac{4}{5}M_J - \frac{4}{3}M_{J'}\right)$; max ∆λ is 0.5 Å
3. $\Delta\nu = \frac{eB}{4\pi m}\left(\frac{4}{5}M_J - \frac{2}{3}M_{J'}\right)$
4. g = 3/2; 4.2 × 1010 s–1 6. g = 0.40
7. 0.025
8. 2 exp (– E2/kT) = exp (– E1/kT) + exp (– E3/kT)
9. 4.55 × 10–4 % 10. ∆λ = 3.98 Å, 6.64 Å

Chapter 7
1. Allowed occupation numbers are A = (1, 0, 0, 2),
B = (0, 1, 1, 1), C = (0, 0, 3, 0)
(a) 3, 12, 8 (b) 1, 2, 4 (c) 0, 1, 0 are the numbers of arrangements for A,
B,C
2. $\frac{1}{2}kT$; $1.35 \times 10^{-6}$ 3. $7.5 \times 10^{-7}$ %
4. 11.7 %, 1.6 % 5. 2.8 R
6. $E = 4N_0kT\left(\frac{T}{\theta}\right)^2\int_0^{\theta/T}\frac{x^2\,dx}{e^x - 1}$
7. 350 K
8. R mol⁻¹ K⁻¹ 9. $4\pi V\,(2v_t^{-3} + v_l^{-3})\left(\frac{kT}{h}\right)^3\int_0^{\theta/T}\frac{x^2\,dx}{e^x - 1}$
11. 5.5 eV, 3.3 eV; $1 - 1.8 \times 10^{-5}$, $1 - 5.1 \times 10^{-5}$
12. 0.018 R, 2.8 R 13. $4.25 \times 10^4$ K
14. $1.97 \times 10^{24}\ \mathrm{m^{-3}}$ 15. 2.6 J/ms K
16. $7 \times 10^{11}\ \mathrm{s^{-1}}$ 17. $1.96 \times 10^{-5}$ V

Chapter 8
1. 6.7 eV, 0.89 eV 3. C = 8.7 keV, a = 0.31 Å
4. $6.0 \times 10^{23}\ \mathrm{mol^{-1}}$
5. sin θ = 0.50, 0.71, 0.87 ; no; no
6. R (3 –1); R (2/3 –1); R (3 /2 –1)
1/2 1/2 1/2 1/2

8. 13.6 Ω–1 m–1; – 1.1 × 10–3 V m3/A W


9. 3 10. 0.73 eV
11. 0. 014 K
12. 6 × 10–10 Ω–1 m–1; 1.15 × 102 Ω–1 m–1
13. 0.22 A, 4.6 × 10–8 A 14. 9.15 × 10–7 m
16. 0.83 × 10–2 Ω m 18. 2.9 × 1025 m–3
19. 0.85 V
21. $E = \frac{Neg\hbar B}{4m}\,(e^{-y} - e^{y})/(e^{-y} + e^{y})$, $y = \frac{eg\hbar B}{4mkT}$
22. 3T

Chapter 9
2. ∆E ≈ 0.72 A2/3 MeV 3. 0.32
6. 4.79 e/2m p for l = 2

7. s1/2, p1/2, s1/2, h9/2; – 1.91, – 0.26, 2.79, 2.62 in units of e/2m p
8. 3.9 × 10–54 kg. m2, Itot = 8.7 × 10–54 kg. m2,
E = 0.336 Mev
9. 0.033 MeV and 0.076 MeV
10. Ru 
β
→ Rh 
β
→ Pd, Ag 
β+
→ Pd, Cd 
β+
→ Ag;
Ag and Cd decay also by election capture.
11. Ni 
β
→ Cu, Zn 
→ Cu by electron capture
12. 27.9 MeV, 4.23 MeV, 0.07 MeV
13. 8.36 Mev, 0.076 Mev
14. 31. 9739 mu
15. 5.5 × 10–5 MeV, 4.6 × 10–22 kg m/s
16. 8; 6; 1.54 × 10–10 gm; 1.45 × 109 years
17. 2.05 MeV 18. 1.93 × 10–26 m2
19. 53 g
20. Fusion would require 4.6% change and fission 30% change in 5 × 109
years

Chapter 10
2. $h\nu = mc^2\,\frac{(E + mc^2)^{1/2}}{(E + mc^2)^{1/2} - (E - mc^2)^{1/2}\cos\theta}$
3. $(m_K^2 - 3m_\pi^2)/2m_K$
4. Σ0→ Λ0 + γ decay is due to electromagnetic interaction, whereas the
decay of Σ± is due to weak interaction
5. About 2 × 10–18 m 6. 4 × 103 s
7. 0.82 Wb/m2
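Chapter 11
1. $5.5 \times 10^{-13}$ m
2. $g_{00} = 1 - \frac{\omega^2}{c^2}(x^2 + y^2)$, $g_{0x} = \frac{\omega}{c^2}\,y$, $g_{0y} = -\frac{\omega}{c^2}\,x$
3. $\tau_0 = \int_0^{\tau}\left(1 - \frac{r^2}{c^2}\,\omega^2\right)^{1/2}dt$ 5. $9 \times 10^{-3}$ m
9. $2 \times 10^{-18}\ \mathrm{m^{-3}\,s^{-1}}$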
Index

A
Alpha Decay, 340
Amorphous Semiconductors, 292
Angular Momentum, 90, 323
Antiferromagnetism, 301
Applications of Fermi-Dirac Distribution, 232
Applications of Lasers, 192
Atomic and Molecular Beam Experiments, 199
Atomic Spectra, 46, 142
Atoms and Molecules, 131
Auger Effect, 158

B
Band Theory of Solids, 267
Bending of Light, 403
Beta Decay, 338
Binding Energies, 319
Binding Forces in Solids, 256
Black-Body Radiation, 32
Bohr Model, 51
Bose-Einstein Condensation, 225, 230
Breeder Reactors, 354

C
Collective Model, 332
Colliding Beams, 383
Compound Nucleus, 347
Compton Effect, 40
Control Rods, 353
Controlled Fusion, 355
Coolant, 353
Cosmic Rays, 381
Cosmology, 391
Covalent Bonds, 160, 258
Cross-Section, 346
Crystal Structures, 260
Curvature of Space, 399
Curved Space-Time, 395

D
Degenerate Gas Model, 334
Diamagnetism, 293
Dielectric Properties, 302
Diffraction by a Lattice, 265
Dirac Equation, 122
Direct Processes, 348
Directions and Planes in Crystals, 264
Distinguishable Arrangements, 210
Dynamics of the Universe, 409

E
Effective Mass, 271
Electric Quadrupole Moment, 324
Electromagnetic Interaction, 18, 372
Electron Spin, 107
Electronic Polarizability, 303
Electronic Structure of Elements, 139
Elementary Particles, 365, 366
Elements of Quantum Theory, 65
Emission Spectrum, 152
Emulsion Chamber, 387
Energy Gap, 240
Examples of One-Electron Atoms, 118
Exchange Symmetry of Wave Functions, 132


F
Fabrication of Semiconductor Devices, 290
Ferrimagnetism, 301
Ferroelectric Crystals, 306
Ferromagnetism, 297
Fine Structure of One-Electron Atomic Spectra, 110
Fission Reactors, 350
Frames of Reference, 392
Free Particle, 74
Free-Electron Paramagnetism, 294
Free-Electron Theory of Metals, 232

G
Galilean Transformations, 2
Gamma Decay, 343
Geiger Counter, 387
General Relativity, 391
Geodesics, 397

H
Holography, 194
Hydrogen, 118
Hydrogen Bonds, 259
Hydrogen Spectrum, 46

I
Inertial Frames of Reference, 2
Interaction with External Fields, 173
Interaction with Radiation, 181
Ionic Bonds, 159, 256
Ionic Polarizability, 305
Ionization Potential, 138
Isospin Symmetry, 369

J
Josephson Junctions, 242

K
KCl Crystal Structure, 257
Kinematics of the Universe, 406

L
Laser Cooling, 196
Lasers and Masers, 188
Length Contraction, 11
Lifetimes and Linewidths, 186
Light Emitting Diodes, 289
Lorentz Four-Vectors, 14
Lorentz Transformations, 6

M
Mach’s Principle, 393
Magnetic Moment, 323
Magnetic Properties, 241, 292
Magnetic Resonance Experiments, 198
Medical MRI, 201
Metallic Bonds, 258
Metric Tensor of the Space, 395
Models of the Nucleus, 328
Moderators, 353
Molecular Bonding, 159
Molecular Spectra, 162
Moseley Diagram, 156
Moseley’s Law, 155
Muonic Helium, 119
Muonium, 119

N
Nearly Free Electron Approximation, 269
Neutron Economy, 352
Nonlinear Optics, 193
Nuclear Constituents, 318
Nuclear Fission, 342
Nuclear Forces, 325
Nuclear Model of the Atom, 49
Nuclear Radius, 322
Nuclear Reactions, 345
Nuclear Stability, 337
Nucleon-Nucleon Interaction, 327

O
One-Electron Atom, 101
Orientational Polarizability, 305

P
Paramagnetism, 295
Parity Violation, 376
Particle in a Box, 83
Paschen-Back Effect, 179
Periodic Table, 137
Perovskite Structure, 245
Photodiodes, 287
Photoelectric Effect, 37
Photon Gas, 224, 225
Piezo-electricity, 307
Positronium, 118
Postulates of Quantum Mechanics, 70
Postulates of Special Relativity, 5
Power Reactor, 354
Principle of Equivalence, 393
Production and Detection of Particles, 380
Properties of the Nucleus, 318

Q
Quantization of Flux, 241
Quantum Dot, 86
Quantum Ideas, 31
Quantum Statistics, 209
Quantum Well Laser, 87

R
Radio Source Counting, 414
Radioactive Series, 343
Raman Effect, 201
Reactor Fuel, 351
Russell-Saunders or LS Coupling, 143
Rydberg Atoms, 119

S
Schrödinger Equation for Spin 1/2 Particles, 120
Schwarzschild Metric, 401
Semiconductor Devices, 283
Semiconductor Diode Laser, 290
Semiconductor Diodes, 283
Semiconductors, 274
Shell Model, 329
Shells and Subshells in Atoms, 135
Simple Harmonic Oscillator, 87
Simultaneity and Time Dilation, 8
Small Perturbations, 89
Solid State Physics, 255
Solutions of the Schrödinger Equation, 102
Specific Heat of Solids, 221
Specific Heats of Gases, 218
Spontaneous Transitions, 184
Statistical Distributions, 213
Step Potential, 78
Strangeness, 375
Strength of Nuclear Interaction, 328
Strong Interaction, 368
Superconductivity, 238
Symmetric Molecules, 164
Synchrotron, 382

T
The Cyclotron, 382
The Early Universe, 411
The Hamiltonian, 174
The Nucleus, 317
The Wave Function, 67
Thermonuclear Fusion, 355
Thought Experiment, 66
Tight Binding Approximation, 268
Total Angular Momentum, 109
Transformation of Velocities, 12
Transistor, 286

U
Uncontrolled Chain Reactions, 354
Uncontrolled Fusion, 357
Unified Approach, 379
Unstable Nuclei, 321

V
V-A Theory of Weak Interaction, 378
Van de Graaff Generator, 381
Van der Waals Bonds, 259
Van der Waals Forces, 159
Velocity of Light, 3

W
Wave Nature of Particles, 43
Wave Packet, 76
Weak Interaction, 373
Weizsäcker’s Mass Formula, 336

X
X-ray Absorption Spectrum, 156
X-ray Spectra, 151

Y
Yukawa Forces, 325

Z
Zeeman Effect, 175
Zero-Mass Particles and Doppler Shift, 21
