Part EM - Classical Electrodynamics
Academic Commons
2024
Recommended Citation
Likharev, Konstantin, "Part EM: Classical Electrodynamics" (2024). Essential Graduate Physics. 3.
https://commons.library.stonybrook.edu/egp/3
This Book is brought to you for free and open access by the Department of Physics and Astronomy at Academic
Commons. It has been accepted for inclusion in Essential Graduate Physics by an authorized administrator of
Academic Commons. For more information, please contact mona.ramonetti@stonybrook.edu,
hu.wang.2@stonybrook.edu.
Konstantin K. Likharev
Essential Graduate Physics
Lecture Notes and Problems
Part EM:
Classical Electrodynamics
© K. Likharev
Essential Graduate Physics EM: Classical Electrodynamics
Table of Contents
***
Supplemental file Exercise Problems with Model Solutions (300 problems, 420 pp.)
is available online:
https://essentialgraduatephysics.org/Files/EM%20exercises.pdf .
B/W paperback copies of these materials are available on Amazon.com:
https://www.amazon.com/gp/product/B0D7SKPQF9 .
Additional file Test Problems with Model Solutions (52 problems, 46 pp.)
is available for course instructors from the author upon request – see Front Matter.
***
Introductory Remarks
The structure of this classical electrodynamics course is quite traditional. Namely, in order to
address the most important subjects of the field, which involve not only charged point particles but also
conducting, dielectric, and magnetic media, the electromagnetic interactions are discussed in parallel
with simple models of the electric and magnetic properties of most common materials.
Also following tradition, I use this part of my series (notably Chapter 2) as a convenient platform
for the discussion of various methods of the solution of partial differential equations, including the use
of the most important systems of curvilinear orthogonal coordinates and special functions.
One more traditional part of classical electrodynamics is an introduction to special relativity (in
Chapter 9) because although this topic includes a substantial classical mechanics component, it is the
electrodynamics that makes a relativistic analysis unavoidable.
Fig. 1.1. Coulomb force directions (for the case qkqk’ > 0).
1 For remedial reading, I can recommend, for example, D. Griffiths, Introduction to Electrodynamics, 4th ed.,
Pearson, 2015.
2 On top of the more general notions of the classical Newtonian space, point particles and forces, as used in
classical mechanics – see, e.g., CM Sec. 1.1.
3 Formulated in 1785 by Charles-Augustin de Coulomb, on the basis of his earlier experiments, in turn rooted in
prior studies of electrostatic phenomena, with notable contributions by William Gilbert, Otto von Guericke,
Charles François de Cisternay Du Fay, Benjamin Franklin, and Henry Cavendish.
I am confident that this law is very familiar to the reader, but a few comments may still be due:
(i) Flipping the indices k and k’, we see that Eq. (1) complies with the 3rd Newton law: the
reciprocal force is equal in magnitude but opposite in direction: Fk’k = –Fkk’.
(ii) Since the vector Rkk’ ≡ rk – rk’, by its definition, is directed from point rk’ toward point rk
(Fig. 1), Eq. (1) correctly describes the experimental fact that charges of the same sign (i.e. with qkqk’ >
0) repel, while those with opposite signs (qkqk’ < 0) attract each other.
(iii) In some textbooks, the Coulomb law (1) is given with the qualifier “in free space” or “in
vacuum”. However, actually, Eq. (1) remains valid even in the presence of any other charges – for
example, of internal charges in a quasi-continuous medium that may surround the two charges (number
k and k’) under consideration. The confusion stems from the fact, to be discussed in detail in Chapter 3
below, that in some cases it is convenient to formally represent the effect of the other charges as an
effective (rather than actual!) modification of the Coulomb law.
(iv) The constant κ in Eq. (1) depends on the system of units we use. In the Gaussian units, κ is
set to 1, for the price of introducing a special unit of charge (the statcoulomb) that would make
experimental data compatible with Eq. (1) if the force Fkk’ is measured in the Gaussian units (dynes). On
the other hand, in the International System (“SI”) of units, the charge’s unit is one coulomb
(abbreviated C), and κ is different from 1:
$$\kappa\big|_{\mathrm{SI}} = \frac{1}{4\pi\varepsilon_0}, \tag{1.2}$$
where ε0 ≈ 8.854×10⁻¹² (in SI units) is called the electric constant.4
Unfortunately, the continuing struggle between zealous proponents of these two systems of units
bears all the not-so-nice features of a religious war, with a similarly slim chance for any side to win it in
any foreseeable future. In my humble view, each of these systems has its advantages and handicaps (to
be noted on several occasions below), and every educated physicist should have no problem with using
any of them. Following the insistent recommendations of international scientific unions, I am using the SI
units throughout my series. However, for the readers’ convenience, in this course (where the difference
between the Gaussian and SI systems is especially significant) I will write the most important formulas
with the constant (2) clearly displayed – for example, the combination of Eqs. (1) and (2) as
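$$\mathbf{F}_{kk'} = \frac{1}{4\pi\varepsilon_0}\, q_k q_{k'}\, \frac{\mathbf{r}_k - \mathbf{r}_{k'}}{|\mathbf{r}_k - \mathbf{r}_{k'}|^3}, \tag{1.3}$$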
so the formal transfer to the Gaussian units may be performed just by dropping the front fraction. (In the
rare cases when the transfer is not obvious, I will duplicate formulas in the Gaussian units.)
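A minimal Python sketch of Eq. (3) may make its sign conventions concrete. The function name `coulomb_force`, the `gaussian` flag, and the sample charges are illustrative assumptions, not part of the original notes; the flag simply drops the front fraction, as described above, and then expects charges in statcoulombs and distances in centimeters.

```python
import math

EPS0 = 8.854e-12  # electric constant in SI units, as quoted in the text


def coulomb_force(q_k, r_k, q_k2, r_k2, gaussian=False):
    """Force on charge q_k at r_k exerted by charge q_k2 at r_k2, per Eq. (3).

    With gaussian=True the front fraction 1/(4*pi*eps0) is dropped.
    """
    front = 1.0 if gaussian else 1.0 / (4.0 * math.pi * EPS0)
    R = [a - b for a, b in zip(r_k, r_k2)]      # vector directed from r_k2 toward r_k
    dist = math.sqrt(sum(c * c for c in R))
    return [front * q_k * q_k2 * c / dist**3 for c in R]


# Two like charges of 1 nC each, 1 cm apart along the x-axis (assumed sample values).
F_12 = coulomb_force(1e-9, (0.01, 0.0, 0.0), 1e-9, (0.0, 0.0, 0.0))
F_21 = coulomb_force(1e-9, (0.0, 0.0, 0.0), 1e-9, (0.01, 0.0, 0.0))
print(F_12, F_21)
```

For like charges, the force on the first charge points away from the second one (repulsion), and swapping the roles of the two charges flips the sign of every component, in line with comments (i) and (ii) above.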
Besides Eq. (3), another key experimental law of electrostatics is the linear superposition
principle: the electrostatic forces exerted on some point charge (say, qk) by other charges add up as
vectors, forming the net force
$$\mathbf{F}_k = \sum_{k' \neq k} \mathbf{F}_{kk'}, \tag{1.4}$$
4 Since 2018, one coulomb is defined, in the “legal” metrology, as a certain exactly fixed number of the
fundamental electric charges e, and the “legal” SI value of ε0 is no longer exactly equal to 10⁷/4πc² (where c is the
speed of light) as it was before, but remains extremely close to that fraction, with the relative difference of the
order of 10⁻¹⁰ – see appendix UCA: Selected Units and Constants. In this series, this minute difference is ignored.
where the summation is extended over all charges but qk, and the partial force Fkk’ is described by Eq.
(3). The fact that the sum is restricted to k’ ≠ k means that a point charge, in statics, does not interact
with itself. This fact may look obvious from Eq. (3), whose right-hand side diverges at rk → rk’, but
becomes less evident (though still true) in quantum mechanics – where the charge of even an elementary
particle is effectively spread around some volume, together with the particle’s wavefunction.5
Now we may combine Eqs. (3) and (4) to get the following expression for the net force F acting
on a probe charge q located at point r:
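$$\mathbf{F}(\mathbf{r}) = q\, \frac{1}{4\pi\varepsilon_0} \sum_{\mathbf{r}_{k'} \neq \mathbf{r}} q_{k'}\, \frac{\mathbf{r} - \mathbf{r}_{k'}}{|\mathbf{r} - \mathbf{r}_{k'}|^3}. \tag{1.5}$$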
This equality implies that it makes sense to introduce the notion of the electric field (as an entity
independent of q), whose distribution in space is characterized by the following vector:
$$\mathbf{E}(\mathbf{r}) \equiv \frac{\mathbf{F}(\mathbf{r})}{q}, \tag{1.6}$$
formally called the electric field strength – but much more frequently, just the “electric field”. In these
terms, Eq. (5) becomes
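$$\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \sum_{\mathbf{r}_{k'} \neq \mathbf{r}} q_{k'}\, \frac{\mathbf{r} - \mathbf{r}_{k'}}{|\mathbf{r} - \mathbf{r}_{k'}|^3}. \tag{1.7}$$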
Being just convenient in electrostatics, the notion of the field becomes unavoidable for the description of
time-dependent phenomena (such as electromagnetic waves, see Chapter 7 and on), where the
electromagnetic field shows up as a specific form of matter, different from the usual “material” particles
– even though quantum electrodynamics (to be reviewed in QM Chapter 9) offers their joint description.
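The superposition sum of Eq. (7) translates almost literally into code. The following Python sketch is only an illustration: the function name `electric_field`, the data layout (a list of charge/position pairs), and the sample configuration are assumptions made here, not taken from the notes.

```python
import math

EPS0 = 8.854e-12  # electric constant in SI units, as quoted in the text


def electric_field(r, charges):
    """E(r) from Eq. (7); `charges` is a list of (q, (x, y, z)) pairs."""
    front = 1.0 / (4.0 * math.pi * EPS0)
    E = [0.0, 0.0, 0.0]
    for q, r_q in charges:
        d = [a - b for a, b in zip(r, r_q)]
        dist = math.sqrt(sum(c * c for c in d))
        if dist == 0.0:
            continue  # the sum in Eq. (7) excludes a charge located exactly at r
        for i in range(3):
            E[i] += front * q * d[i] / dist**3
    return E


# Two equal charges q at x = -a and x = +a (assumed sample values).
q, a, h = 1e-9, 0.1, 0.2
pair = [(q, (-a, 0.0, 0.0)), (q, (a, 0.0, 0.0))]
E_mid = electric_field((0.0, 0.0, 0.0), pair)   # the two contributions cancel here
E_axis = electric_field((0.0, 0.0, h), pair)    # only the z-component survives
print(E_mid, E_axis)
```

At the midpoint the two partial fields cancel, while on the symmetry axis their x-components cancel and the z-components add up, which is the vector addition that the superposition principle prescribes.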
Many real-world problems involve multiple point charges located so closely that it is possible to
approximate them with a continuous charge distribution. Indeed, let us consider a group of many (dN >>
1) close charges, located at points rk’, all within an elementary volume d³r’. For relatively distant field
observation points, with |r – rk’| >> dr’, the geometrical factor in the corresponding terms of Eq. (7) is
essentially the same. As a result, these charges may be treated as a single elementary charge dQ(r’).
Since at dN >> 1, this elementary charge is proportional to the elementary volume d³r’, we can define
the local 3D charge density ρ(r’) by the following relation:
$$\rho(\mathbf{r}')\, d^3r' \equiv dQ(\mathbf{r}') \equiv \sum_{\mathbf{r}_{k'} \in d^3r'} q_{k'}, \tag{1.8}$$
and rewrite Eq. (7) as an integral (over the volume containing all essential charges):
$$\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \int \rho(\mathbf{r}')\, \frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}\, d^3r'. \tag{1.9}$$
5 Note that some widely used approximations, e.g., the density functional theory (DFT) of multiparticle systems,
essentially violate this law, thus limiting their accuracy and applicability – see, e.g., QM Sec. 8.4.
Note that for a continuous, smooth charge density ρ(r’), the integral in Eq. (9) does not diverge at R ≡
|r – r’| → 0, because in this limit, the fraction under the integral increases as R⁻², i.e. slower than the
decrease of the elementary volume d³r’, proportional to R³.
Let me emphasize the dual use of Eq. (9). In the case when ρ(r) is a continuous function
representing the average charge defined by Eq. (8), Eq. (9) is not valid at distances |r – rk’| of the order
of the distance between the adjacent point charges, i.e. does not describe rapid variations of the electric
field at these distances. Such approximate, smoothly changing field E(r), is called macroscopic; we will
repeatedly return to this notion in the following chapters. On the other hand, Eq. (9) may be also used
for the description of the exact (frequently called microscopic) field of discrete point charges, by
employing the notion of Dirac’s delta function, which is the mathematical description of a very sharp
function equal to zero everywhere but one point, and still having a finite integral (equal to 1).6 Indeed, in
this formalism, a set of point charges qk’ located in points rk’ may be represented by the pseudo-
continuous density
$$\rho(\mathbf{r}') = \sum_{k'} q_{k'}\, \delta(\mathbf{r}' - \mathbf{r}_{k'}). \tag{1.10}$$
Plugging this expression into Eq. (9), we return to its exact, discrete version (7). In this sense, Eq. (9) is
exact, and we may use it as the general expression for the electric field.
Fig. 1.2. One of the simplest problems of electrostatics: the electric field produced by a spherically-symmetric charge distribution.
We may immediately use the problem’s symmetry to argue that the electric field should be also
spherically symmetric, with only one component in the spherical coordinates: E(r) = E(r)nr, where nr ≡
r/r is the unit vector in the direction of the field observation point r. Taking this direction for the polar
axis of a spherical coordinate system, we can use the evident axial symmetry of the system to reduce Eq.
(9) to
6 See, e.g., MA Sec. 14. The 2D (areal) charge density σ and the 1D (linear) density λ may be defined absolutely
similarly to the 3D (volumic) density ρ: dQ = σd²r, dQ = λdr. Note that the approximations in which either σ ≠ 0
or λ ≠ 0 imply that ρ is formally infinite at the charge location; for example, the model in which a plane z = 0 is
charged with areal density σ ≠ 0, means that ρ = σδ(z), where δ(z) is Dirac’s delta function.
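$$E = \frac{1}{4\pi\varepsilon_0}\, 2\pi \int_0^{\pi} \sin\theta'\, d\theta' \int_0^{\infty} r'^2\, dr'\, \frac{\rho(r')}{R^2}\, \cos\theta, \tag{1.11}$$
where θ, θ’, and R are the geometrical parameters marked in Fig. 2. Since θ and R may be readily
expressed via r’ and θ’, using the auxiliary parameters a and h,
$$\cos\theta = \frac{r - a}{R}, \qquad R^2 = h^2 + (r - r'\cos\theta')^2, \qquad \text{where } a \equiv r'\cos\theta', \quad h \equiv r'\sin\theta', \tag{1.12}$$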
Eq. (11) may be eventually reduced to an explicit integral over r’ and ’, and worked out analytically,
but that would require some effort.
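Before turning to analytics, Eqs. (11) and (12) can at least be checked numerically. The Python sketch below uses a plain midpoint rule in both variables; the function name `field_from_eq11`, the grid sizes, and the test density (uniform inside a sphere of unit radius) are assumptions made for illustration, and the reference value is the familiar point-charge-like field outside a spherically symmetric distribution.

```python
import math

EPS0 = 8.854e-12  # electric constant in SI units, as quoted in the text


def field_from_eq11(r, rho, r_max, n_r=200, n_theta=200):
    """Midpoint-rule estimate of the double integral in Eq. (11).

    rho(r') is the radial charge density; it is assumed to vanish beyond r_max.
    """
    dr = r_max / n_r
    dth = math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        rp = (i + 0.5) * dr
        for j in range(n_theta):
            th = (j + 0.5) * dth
            a = rp * math.cos(th)                 # auxiliary parameters of Eq. (12)
            h = rp * math.sin(th)
            R2 = h * h + (r - a) ** 2
            cos_theta = (r - a) / math.sqrt(R2)
            total += math.sin(th) * rp**2 * rho(rp) / R2 * cos_theta
    return 2.0 * math.pi * total * dr * dth / (4.0 * math.pi * EPS0)


rho0, R_s = 1e-6, 1.0                             # assumed sample density and radius
uniform = lambda rp: rho0 if rp <= R_s else 0.0
r_obs = 2.0                                       # observation point outside the sphere
E_numeric = field_from_eq11(r_obs, uniform, r_max=R_s)
E_reference = rho0 * R_s**3 / (3.0 * EPS0 * r_obs**2)   # field of the total charge at the center
print(E_numeric, E_reference)
```

Even for this simplest geometry, the brute-force evaluation takes tens of thousands of integrand calls, which gives a feeling for why a shortcut is welcome.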
For other problems, the integral (9) may be much more complicated, defying an analytical
solution. One could argue that with the present-day abundance of computers and numerical algorithm
libraries, one can always resort to numerical integration. This argument may be enhanced by the fact
that numerical integration is based on the replacement of the required integral by a discrete sum, and the
summation is much more robust to the (unavoidable) rounding errors than the finite-difference schemes
typical for the numerical solution of differential equations. These arguments, however, are only partly
justified, since in many cases the numerical approach runs into a problem sometimes called the curse of
dimensionality – the exponential dependence of the number of needed calculations on the number of
independent parameters of the problem.7 Thus, despite the proliferation of numerical methods in
physics, analytical results have an everlasting value, and we should try to get them whenever we can.
For our current problem of finding the electric field generated by a fixed set of electric charges, large
help may come from the so-called Gauss law.
To derive it, let us consider a single point charge q inside a smooth closed surface S (Fig. 3), and
calculate the product En d²r, where d²r is an elementary area of the surface (which may be well
approximated with a plane fragment of that area), and En ≡ E·n is the component of the electric field at
that point, normal to the plane.
Fig. 1.3. Deriving the Gauss law: a point charge q (a) inside the volume V, and (b) outside of that volume.
This component may be calculated as E cos θ, where θ is the angle between the vector E and the
unit vector n normal to the surface. Now let us notice that the product cos θ d²r is nothing more than the
7 For a more detailed discussion of this problem, see, e.g., CM Sec. 5.8.
area d²r’ of the projection of d²r onto the plane normal to the vector r connecting the charge q with the
considered point of the surface (Fig. 3), because the angle between the elementary areas d²r’ and d²r is
also equal to θ. Using the Coulomb law for E, we get
$$E_n\, d^2r = E\cos\theta\, d^2r = \frac{1}{4\pi\varepsilon_0}\, \frac{q}{r^2}\, d^2r'. \tag{1.13}$$
But the ratio d²r’/r² is nothing more than the elementary solid angle dΩ under which the areas d²r’ and
d²r are seen from the charge point, so En d²r may be represented just as a product of dΩ by a constant
(q/4πε0). Summing these products over the whole surface, we get
$$\oint_S E_n\, d^2r = \frac{q}{4\pi\varepsilon_0} \oint_S d\Omega \equiv \frac{q}{\varepsilon_0}, \tag{1.14}$$
since the full solid angle equals 4π. (The integral on the left-hand side of this relation is called the flux
of electric field through the surface S.)
Relation (14) expresses the Gauss law for one point charge. However, it is only valid if the
charge is located inside the volume V limited by the surface S. To find the flux created by a charge
located outside of this volume, we still can use Eq. (13), but have to be careful with the signs of the
elementary contributions En d²r. Let us use the common convention to direct the unit vector n out of the
closed volume we are considering (the so-called outer normal), so the elementary product En d²r =
(E·n)d²r, and hence the corresponding dΩ = cos θ d²r/r², is positive if the vector E is pointing out of the
volume (like in the
example shown in Fig. 3a and at the upper-right area in Fig. 3b), and negative in the opposite case (for
example, at the lower-left area in Fig. 3b). As the latter panel shows, if the charge is located outside of
the volume, for each positive contribution dΩ there is always an equal and opposite contribution to the
integral. As a result, at the integration over the solid angle, the positive and negative contributions
cancel exactly, so
$$\oint_S E_n\, d^2r = 0. \tag{1.15}$$
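Equations (14) and (15) are easy to probe numerically. In the Python sketch below, the closed surface is taken, purely as an assumption for illustration, to be a cube centered at the origin; the function name `flux_through_cube`, the grid size, and the two charge positions are likewise hypothetical choices. The flux is summed with a midpoint rule over all six faces, using the outer normal on each of them.

```python
import math

EPS0 = 8.854e-12  # electric constant in SI units, as quoted in the text


def flux_through_cube(q, r_q, half_side=1.0, n=60):
    """Midpoint-rule flux of a point charge's field through the surface of a
    cube centered at the origin, with the outer normal on each face."""
    k = q / (4.0 * math.pi * EPS0)
    d = 2.0 * half_side / n
    flux = 0.0
    for axis in range(3):                 # faces normal to x, y, z
        for sign in (-1.0, 1.0):          # the two opposite faces
            for i in range(n):
                for j in range(n):
                    u = -half_side + (i + 0.5) * d
                    v = -half_side + (j + 0.5) * d
                    point = [u, v]
                    point.insert(axis, sign * half_side)
                    R = [p - c for p, c in zip(point, r_q)]
                    dist = math.sqrt(sum(c * c for c in R))
                    # E_n = E . n, with n = sign * (unit vector along `axis`)
                    flux += k * R[axis] / dist**3 * sign * d * d
    return flux


q = 1e-9
flux_inside = flux_through_cube(q, (0.3, -0.2, 0.1))   # charge inside the cube
flux_outside = flux_through_cube(q, (3.0, 0.0, 0.0))   # charge outside the cube
print(flux_inside * EPS0 / q, flux_outside * EPS0 / q)
```

Within the discretization error, the first ratio should be close to 1 and the second one close to 0, regardless of where exactly the charge sits inside or outside the cube.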
The real power of the Gauss law is revealed by its generalization to the case of several,
especially many charges. Since the calculation of flux is a linear operation, the linear superposition
principle (4) means that the flux created by several charges is equal to the (algebraic) sum of individual
fluxes from each charge, for which either Eq. (14) or Eq. (15) is valid, depending on whether the
charge is in or out of the volume. As a result, for the total flux, we get:
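$$\oint_S E_n\, d^2r = \frac{Q_V}{\varepsilon_0} \equiv \frac{1}{\varepsilon_0} \sum_{\mathbf{r}_j \in V} q_j = \frac{1}{\varepsilon_0} \int_V \rho(\mathbf{r}')\, d^3r', \tag{1.16}$$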
where QV is the net charge inside volume V. This is the full version of the Gauss law.8
In order to appreciate the problem-solving power of the law, let us revisit the problem shown in
Fig. 2, i.e. the field of a spherical charge distribution. Due to its symmetry, which had already been
discussed above, if we apply Eq. (16) to a sphere of a certain radius r, the electric field has to be normal
8The law is named after the famed Carl Gauss (1777-1855), even though it was first formulated earlier (in 1773)
by Joseph-Louis Lagrange who was also the father-founder of analytical mechanics – see, e.g., CM Chapter 2.
to the sphere at each point (i.e., En = E), and its magnitude has to be the same at all points: En = E(r). As
a result, the flux calculation is elementary:
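$$\oint_S E_n\, d^2r = 4\pi r^2 E(r). \tag{1.17}$$
Now, applying the Gauss law (16), we get
$$4\pi r^2 E(r) = \frac{1}{\varepsilon_0} \int_{r' < r} \rho(r')\, d^3r' = \frac{4\pi}{\varepsilon_0} \int_0^r r'^2 \rho(r')\, dr', \tag{1.18}$$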
so, finally,
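$$E(r) = \frac{1}{r^2 \varepsilon_0} \int_0^r r'^2 \rho(r')\, dr' = \frac{1}{4\pi\varepsilon_0}\, \frac{Q_r}{r^2}, \tag{1.19}$$
where Qr is the full charge inside the sphere of radius r:
$$Q_r \equiv \int_{r' < r} \rho(r')\, d^3r' = 4\pi \int_0^r \rho(r')\, r'^2\, dr'. \tag{1.20}$$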
In particular, this formula shows that the field outside of a sphere of a finite radius R is exactly
the same as if all its charge Q = Q(R) is concentrated in the sphere’s center. (Note that this important
result is only valid for a spherically symmetric charge distribution.) Inside the sphere,
finding the electric field still requires the explicit integration (20), but this 1D integral is much simpler
than the 2D integral (11), and in some important cases may be readily worked out analytically. For
example, if the charge Q is uniformly distributed inside a sphere of radius R,
$$\rho(r') = \rho \equiv \frac{Q}{V} = \frac{Q}{(4\pi/3) R^3}, \tag{1.21}$$
then the integration is elementary:
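$$E(r) = \frac{1}{r^2}\, \frac{\rho}{\varepsilon_0} \int_0^r r'^2\, dr' = \frac{\rho r}{3\varepsilon_0} = \frac{1}{4\pi\varepsilon_0}\, \frac{Q r}{R^3}. \tag{1.22}$$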
We see that in this case, the field is growing linearly from the center to the sphere’s surface, and only at
r > R starts to decrease in agreement with Eq. (19) with constant Q(r) = Q. Note also that the electric
field is continuous for all r (including r = R) – as for all systems with finite volumic density.
In order to underline the importance of the last condition, let us consider one more elementary
but very important example of Gauss law’s application. Let a thin plane sheet (Fig. 4) be charged
uniformly, with a finite areal density σ = const. In this case, it is fruitful to use the Gauss volume in the
form of a planar “pillbox” of thickness 2z (where z is the Cartesian coordinate perpendicular to the
plane) and certain area A – see the dashed lines in Fig. 4. Due to the symmetry of the problem, it is
evident that the electric field should be: (i) directed along the z-axis, (ii) constant on each of the upper
and bottom sides of the pillbox, (iii) equal and opposite on these sides, and (iv) parallel to the side
surfaces of the box. As a result, the full electric field flux through the pillbox’s surface is just 2AE(z), so
the Gauss law (16) yields 2AE(z) = QA/ε0 ≡ σA/ε0, and we get a very simple but important formula
$$E(z) = \frac{\sigma}{2\varepsilon_0} = \text{const}. \tag{1.23}$$
Notice that, somewhat counter-intuitively, the field magnitude does not depend on the distance
from the charged plane. From the point of view of the Coulomb law (5), this result may be explained as
follows: the farther the observation point from the plane, the weaker the effect of each elementary
charge, dQ = σd²r, but the more such elementary charges give contributions to the z-component of
vector E, because they are “seen” from the observation point at relatively small angles to the z-axis.
Note also that though the magnitude E ≡ |E| of this electric field is constant, its component En
normal to the plane (for our coordinate choice, Ez) changes its sign at the plane, experiencing a
discontinuity (jump) equal to
$$\Delta E_z \equiv E_z(z = +0) - E_z(z = -0) = \frac{\sigma}{\varepsilon_0}. \tag{1.24}$$
This jump disappears if the surface is not charged. Returning for a split second to our charged sphere
problem (Fig. 2), solving it we have considered the volumic charge density ρ to be finite everywhere,
including the sphere’s surface, so on it σ = 0, and the electric field should be continuous – as it is.
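The Coulomb-law explanation of the distance independence of Eq. (23) can be illustrated numerically. The Python sketch below replaces the infinite plane with a large uniformly charged disk and sums the axial field contributions of thin concentric rings; the ring decomposition, the function name `disk_axis_field`, and all sample values are assumptions made here for illustration only.

```python
import math

EPS0 = 8.854e-12  # electric constant in SI units, as quoted in the text


def disk_axis_field(sigma, a, z, n=200_000):
    """z-component of the field at height z above the center of a uniformly
    charged disk of radius a, summed over thin rings (midpoint rule)."""
    ds = a / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds                      # ring radius
        dQ = sigma * 2.0 * math.pi * s * ds     # ring charge, dQ = sigma * (ring area)
        total += dQ * z / (s * s + z * z) ** 1.5
    return total / (4.0 * math.pi * EPS0)


sigma = 1e-6                                    # assumed areal density
E_plane = sigma / (2.0 * EPS0)                  # Eq. (23)
ratios = [disk_axis_field(sigma, a=1000.0, z=z) / E_plane for z in (0.5, 1.0, 2.0)]
print(ratios)
```

As long as the observation height is much smaller than the disk radius, the ratios stay close to 1 while z changes by a factor of four: the weaker contribution of each ring is compensated by the larger number of rings seen at small angles to the z-axis.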
Admittedly, the integral form (16) of the Gauss law is immediately useful only for highly
symmetrical geometries, such as in the two problems discussed above. However, it may be recast into an
alternative, differential form whose field of useful applications is much wider. This form may be
obtained from Eq. (16) using the divergence theorem of the vector algebra, which is valid for any space-
differentiable vector, in particular E, and for the volume V limited by any closed surface S:9
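$$\oint_S E_n\, d^2r = \int_V (\nabla \cdot \mathbf{E})\, d^3r, \tag{1.25}$$
where ∇ is the del (or “nabla”) operator of spatial differentiation.10 Combining Eq. (25) with the Gauss
law (16), we get
$$\int_V \left( \nabla \cdot \mathbf{E} - \frac{\rho}{\varepsilon_0} \right) d^3r = 0. \tag{1.26}$$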
For a given spatial distribution of electric charge (and hence of its electric field), this equation should be
valid for any choice of the volume V. This can hold only if the function under the integral vanishes at
each point, i.e. if11
9 See, e.g., MA Eq. (12.2). Note also that the scalar product under the volumic integral in Eq. (25) is nothing else
than the divergence of the vector E – see, e.g., MA Eq. (8.4), hence the theorem’s name.
10 See, e.g., MA Secs. 8-10.
11 In the Gaussian units, just as in the initial Eq. (6), ε0 has to be replaced with 1/4π, so the Maxwell
equation (27) looks like ∇·E = 4πρ, while Eq. (28) stays the same.
$$\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}. \tag{1.27}$$
Note that in sharp contrast with the integral form (16), Eq. (27) is local: it relates the electric field’s
divergence to the charge density at the same point. This equation, being the differential form of the
Gauss law, is frequently called one of the famed Maxwell equations12 – to be discussed again and again
later in this course.
In the mathematical terminology, Eq. (27) is inhomogeneous, because it has a right-hand side
independent (at least explicitly) of the field E that it describes. Another, homogeneous Maxwell
equation’s “embryo” (this one valid for the stationary case only!) may be obtained by noticing that the
curl of the point charge’s field, and hence that of any system of charges, equals zero:13
$$\nabla \times \mathbf{E} = 0. \tag{1.28}$$
(We will arrive at two other Maxwell equations, for the magnetic field, in Chapter 5, and then generalize
all the equations to their full, time-dependent form at the end of Chapter 6. However, Eq. (27) will stay
the same.)
Just to get a better gut feeling of Eq. (27), let us apply it to the same example of a uniformly
charged sphere (Fig. 2). Vector algebra tells us that the divergence of a spherically symmetric vector
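function E(r) = E(r)nr may be simply expressed in spherical coordinates:14 ∇·E = [d(r²E)/dr]/r². As a
result, Eq. (27) yields a linear ordinary differential equation for the scalar function E(r):
$$\frac{1}{r^2}\, \frac{d}{dr}\left( r^2 E \right) = \begin{cases} \rho/\varepsilon_0, & \text{for } r \le R, \\ 0, & \text{for } r \ge R, \end{cases} \tag{1.29}$$
which may be readily integrated on each of these segments:
$$E(r) = \frac{1}{\varepsilon_0}\, \frac{1}{r^2} \times \begin{cases} \rho \int r^2\, dr = \rho r^3/3 + c_1, & \text{for } r \le R, \\ c_2, & \text{for } r \ge R. \end{cases} \tag{1.30}$$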
To determine the integration constant c1, we can use the following boundary condition: E(0) = 0. (It
follows from the problem’s spherical symmetry: in the center of the sphere, the electric field has to
vanish, because otherwise, where would it be directed?) This requirement gives c1 = 0. The second
constant, c2, may be found from the continuity condition E(R – 0) = E(R + 0), which has already been
discussed above, giving c2 = ρR³/3 ≡ Q/4π. As a result, we arrive at our previous results (19) and (22).
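The result may be cross-checked by substituting it back into Eq. (29) numerically. In the Python sketch below, the radial divergence is estimated with a central finite difference; the helper names `field_uniform_sphere` and `radial_divergence`, the step size, and the sample values of Q and R are assumptions made for illustration.

```python
import math

EPS0 = 8.854e-12  # electric constant in SI units, as quoted in the text


def field_uniform_sphere(r, Q, R):
    """Radial field of a uniformly charged sphere: Eq. (22) inside, Eq. (19) outside."""
    if r <= R:
        return Q * r / (4.0 * math.pi * EPS0 * R**3)
    return Q / (4.0 * math.pi * EPS0 * r**2)


def radial_divergence(E, r, dr=1e-5):
    """Central-difference estimate of (1/r^2) d(r^2 E)/dr."""
    upper = (r + dr) ** 2 * E(r + dr)
    lower = (r - dr) ** 2 * E(r - dr)
    return (upper - lower) / (2.0 * dr) / r**2


Q, R = 1e-9, 1.0                                # assumed sample values
rho = Q / (4.0 * math.pi / 3.0 * R**3)          # Eq. (21)
E = lambda r: field_uniform_sphere(r, Q, R)

div_inside = radial_divergence(E, 0.5)          # expected to be close to rho / eps0
div_outside = radial_divergence(E, 2.0)         # expected to be close to zero
jump_at_R = E(R + 1e-9) - E(R - 1e-9)           # expected to be negligible
print(div_inside * EPS0 / rho, div_outside * EPS0 / rho, jump_at_R / E(R))
```

The three printed numbers should be close to 1, 0, and 0, respectively, reflecting the two branches of Eq. (29) and the continuity condition used to find c2.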
We can see that in this particular, highly symmetric case, using the differential form of the Gauss
law is a bit more complex than its integral form. (For our second example, shown in Fig. 4, it would be
even less natural.) However, Eq. (27) and its generalizations are more convenient for asymmetric charge
12 Named after the genius of classical electrodynamics and statistical physics, James Clerk Maxwell (1831-1879).
13 This follows, for example, from the direct application of MA Eq. (10.11) to any spherically-symmetric vector
function of type f(r) = f(r)nr (in particular, to the electric field of a point charge placed at the origin), giving fθ = fφ
= 0 and ∂fr/∂θ = ∂fr/∂φ = 0, so all components of the vector ∇×f vanish. Since nothing prevents us from placing
the reference frame’s origin at the point charge’s location, this result remains valid for any position of the charge.
14 See, e.g., MA Eq. (10.10) for the particular case ∂/∂θ = ∂/∂φ = 0.
distributions, and are invaluable in cases where the distribution ρ(r) is not known a priori and has to be
found in a self-consistent way. (We will start discussing such cases in the next chapter.)
To calculate the scalar potential, let us start from the simplest case of a single point charge q
placed at the origin. For it, Eq. (7) takes the simple form
$$\mathbf{E} = \frac{1}{4\pi\varepsilon_0}\, q\, \frac{\mathbf{r}}{r^3} \equiv \frac{1}{4\pi\varepsilon_0}\, q\, \frac{\mathbf{n}_r}{r^2}. \tag{1.34}$$
It is straightforward to verify that the last fraction in the last form of Eq. (34) is equal to –∇(1/r).17
Hence, according to the definition (33), for this particular case
$$\phi = \frac{1}{4\pi\varepsilon_0}\, \frac{q}{r}. \tag{1.35}$$
(In the Gaussian units, this result is spectacularly simple: φ = q/r.) Note that we could add an arbitrary
constant to this potential (and indeed to any other distribution of φ discussed below) without changing
the field, but it is convenient to define the potential energy so it would approach zero at infinity.
In order to justify the introduction and the forthcoming exploration of U and φ, let me
demonstrate (I hope, unnecessarily :-) how useful the notions are, on a very simple example. Let two
similar charges q be launched from afar, with the same initial speed v0 << c each, straight toward each
other (i.e. with the zero impact parameter) – see Fig. 5. Since, according to the Coulomb law, the
charges repel each other with increasing force, they will stop at some minimum distance rmin from each
other, and then fly back. We could of course find rmin directly from the Coulomb law. However, for that,
we would need to write the 2nd Newton law for each particle (actually, due to the problem symmetry,
they would be similar), then integrate them over time to find the particle velocity v as a function of
distance, and only then recover rmin from the requirement v = 0.
[Fig. 1.5: two particles (m, q) approaching each other with initial speeds v0; the minimum distance rmin is to be found.]
The notion of potential allows this problem to be solved in one line. Indeed, in the field of
potential forces, the system’s total energy E = T + U ≡ T + qφ is conserved. In our non-relativistic case v
<< c, the kinetic energy T is just mv²/2. Hence, equating the total energy of two particles at the points r
= ∞ and r = rmin, and using Eq. (35) for φ, we get
$$2\, \frac{m v_0^2}{2} + 0 = 0 + \frac{1}{4\pi\varepsilon_0}\, \frac{q^2}{r_{\min}}, \tag{1.36}$$
immediately giving us the final answer: rmin = q²/4πε0mv0². So, the notion of scalar potential is indeed
very useful.
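To attach numbers to this result, here is a short Python sketch of Eq. (36). The function name `closest_approach` and the sample values (two protons launched at 10⁵ m/s each, with rounded values of the proton charge and mass) are assumptions chosen for illustration; the launch speed is far below the speed of light, as the non-relativistic treatment requires.

```python
import math

EPS0 = 8.854e-12  # electric constant in SI units, as quoted in the text


def closest_approach(q, m, v0):
    """r_min following from Eq. (36) for two identical particles launched
    head-on from afar, each with the initial speed v0."""
    return q**2 / (4.0 * math.pi * EPS0 * m * v0**2)


# Assumed sample values: rounded proton charge (C), proton mass (kg), speed (m/s).
q_p, m_p, v0 = 1.602e-19, 1.673e-27, 1.0e5
r_min = closest_approach(q_p, m_p, v0)

kinetic_far = 2.0 * (m_p * v0**2 / 2.0)                        # total energy at r = infinity
potential_at_r_min = q_p**2 / (4.0 * math.pi * EPS0 * r_min)   # total energy at r = r_min
print(r_min, kinetic_far, potential_at_r_min)
```

The two printed energies coincide, which is just the energy balance (36) read from right to left.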
With this motivation, let us calculate φ for an arbitrary configuration of charges. For a single
charge in an arbitrary position (say, at point rk’), r ≡ |r| in Eq. (35) should be evidently replaced with
|r – rk’|. Now, the linear superposition principle (4) allows for an easy generalization of this formula to
the case of an arbitrary set of discrete charges,
$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \sum_{\mathbf{r}_{k'} \neq \mathbf{r}} \frac{q_{k'}}{|\mathbf{r} - \mathbf{r}_{k'}|}. \tag{1.37}$$
Finally, using the same arguments as in Sec. 1, we can use this result to argue that in the case of an
arbitrary continuous charge distribution
$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \int \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r'. \tag{1.38}$$
Again, Dirac’s delta function allows using the last equation to recover Eq. (37) for discrete charges as
well, so Eq. (38) may be considered as the general expression for the electrostatic potential.
For most practical calculations, using this expression and then applying Eq. (33) to the result, is
preferable to using Eq. (9), because φ is a scalar, while E is a 3D vector, mathematically equivalent to
three scalars. Still, this approach may lead to technical problems similar to those discussed in Sec. 2. For
example, applying it to the spherically symmetric distribution of charge (Fig. 2), we get the integral
$$\phi = \frac{1}{4\pi\varepsilon_0}\, 2\pi \int_0^{\pi} \sin\theta'\, d\theta' \int_0^{\infty} r'^2\, dr'\, \frac{\rho(r')}{R}, \tag{1.39}$$
which is not much simpler than Eq. (11).
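For discrete charges, the route "potential first, field second" is easy to try out in code. The Python sketch below evaluates Eq. (37), differentiates it numerically as prescribed by Eq. (33), and compares the result with the direct vector sum (7). The function names, the finite-difference step, and the sample charge configuration are assumptions made for illustration.

```python
import math

EPS0 = 8.854e-12  # electric constant in SI units, as quoted in the text
FRONT = 1.0 / (4.0 * math.pi * EPS0)


def potential(r, charges):
    """phi(r) from Eq. (37); `charges` is a list of (q, (x, y, z)) pairs."""
    return sum(FRONT * q / math.dist(r, r_q) for q, r_q in charges)


def field(r, charges):
    """E(r) from Eq. (7), used here only for comparison."""
    E = [0.0, 0.0, 0.0]
    for q, r_q in charges:
        d = [a - b for a, b in zip(r, r_q)]
        dist = math.dist(r, r_q)
        for i in range(3):
            E[i] += FRONT * q * d[i] / dist**3
    return E


def minus_gradient(r, charges, h=1e-5):
    """Central-difference estimate of -grad(phi), the right-hand side of Eq. (33)."""
    result = []
    for i in range(3):
        r_plus = list(r)
        r_minus = list(r)
        r_plus[i] += h
        r_minus[i] -= h
        slope = (potential(r_plus, charges) - potential(r_minus, charges)) / (2.0 * h)
        result.append(-slope)
    return result


# Assumed sample configuration: two unequal charges of opposite signs.
charges = [(1e-9, (0.0, 0.0, 0.0)), (-2e-9, (0.3, 0.0, 0.0))]
point = (0.1, 0.2, 0.3)
E_direct = field(point, charges)
E_from_phi = minus_gradient(point, charges)
print(E_direct, E_from_phi)
```

The scalar function needs one sum per evaluation instead of three, at the price of an additional differentiation step.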
The situation may be much improved by recasting Eq. (38) into a differential form. For that, it is
sufficient to plug the definition of φ, Eq. (33), into Eq. (27):
$$\nabla \cdot (-\nabla \phi) = \frac{\rho}{\varepsilon_0}. \tag{1.40}$$
The left-hand side of this equation is nothing else than the Laplace operator of φ (with the minus sign),
so we get the famous Poisson equation18 for the electrostatic potential:
$$\nabla^2 \phi = -\frac{\rho}{\varepsilon_0}. \tag{1.41}$$
(In the Gaussian units, the Poisson equation is ∇²φ = –4πρ.) This differential equation is so convenient
for applications that even its particular case for ρ = 0,
$$\nabla^2 \phi = 0, \tag{1.42}$$
18 Named after Siméon Denis Poisson (1781-1840), also famous for the Poisson distribution – one of the central
results of the probability theory – see, e.g., SM Sec. 5.2.
19 Named after the famous mathematician (and astronomer) Pierre-Simon Laplace (1749-1827) who, together with
Alexis Clairaut, is credited for the development of the very concept of potential.
20 See, e.g., MA Eq. (10.8) for ∂/∂θ = ∂/∂φ = 0.
Before making any judgment on the integration constant c1, let us solve the Poisson equation (in
this case, just the Laplace equation) for the range outside the sphere (r > R):
$$\frac{1}{r^2}\, \frac{d}{dr}\left( r^2 \frac{d\phi}{dr} \right) = 0. \tag{1.47}$$
Its first integral,
$$\frac{d\phi}{dr}(r) = \frac{c_2}{r^2}, \tag{1.48}$$
also gives the electric field (with the minus sign). Now using Eq. (45) and requiring the field to be
continuous at r = R, we get
$$\frac{c_2}{R^2} = -\frac{Q}{4\pi\varepsilon_0 R^2}, \quad \text{i.e.} \quad \frac{d\phi}{dr}(r) = -\frac{Q}{4\pi\varepsilon_0 r^2}, \tag{1.49}$$
in an evident agreement with Eq. (19). Integrating this result again,
$$\phi(r) = -\frac{Q}{4\pi\varepsilon_0} \int \frac{dr}{r^2} = \frac{Q}{4\pi\varepsilon_0 r} + c_3, \quad \text{for } r \ge R, \tag{1.50}$$
we can select c3 = 0, so φ(∞) = 0, in accordance with the usual (though not compulsory) convention.
Now we can finally determine the constant c1 in Eq. (46) by requiring that this equation and Eq. (50)
give the same value of φ at the boundary r = R. (According to Eq. (33), if the potential had a jump, the
electric field at that point would be infinite.) The final answer may be represented as
$$\phi(r) = \frac{Q}{4\pi\varepsilon_0 R} \left( \frac{R^2 - r^2}{2R^2} + 1 \right), \quad \text{for } r \le R. \tag{1.51}$$
This calculation shows that using the Poisson equation to find the electrostatic potential
distribution for highly symmetric problems may be a bit more cumbersome than directly finding the
electric field – say, from the Gauss law. However, we will repeatedly see below that if the electric
charge distribution is not fixed in advance, using Eq. (41) may be the only practicable way to proceed.
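The two branches of the potential, Eqs. (50) with c3 = 0 and (51), can be sanity-checked in a few lines of Python. The function name `potential_uniform_sphere`, the finite-difference step, and the sample values of Q and R are assumptions made for illustration; the comparison fields are those of Eqs. (19) and (22).

```python
import math

EPS0 = 8.854e-12  # electric constant in SI units, as quoted in the text


def potential_uniform_sphere(r, Q, R):
    """Potential of a uniformly charged sphere: Eq. (51) inside, Eq. (50) with c3 = 0 outside."""
    front = Q / (4.0 * math.pi * EPS0)
    if r >= R:
        return front / r
    return front / R * ((R**2 - r**2) / (2.0 * R**2) + 1.0)


def minus_derivative(phi, r, dr=1e-6):
    """Central-difference estimate of -d(phi)/dr, i.e. the radial field."""
    return -(phi(r + dr) - phi(r - dr)) / (2.0 * dr)


Q, R = 1e-9, 1.0                                  # assumed sample values
phi = lambda r: potential_uniform_sphere(r, Q, R)

jump = phi(R + 1e-9) - phi(R - 1e-9)              # expected to be negligible
center_to_surface = phi(0.0) / phi(R)             # Eq. (51) gives 3/2 at r = 0
E_inside = minus_derivative(phi, 0.5)             # compare with Eq. (22)
E_outside = minus_derivative(phi, 2.0)            # compare with Eq. (19)
print(jump, center_to_surface, E_inside, E_outside)
```

The potential is continuous at r = R, and its derivative reproduces the field found earlier from the Gauss law on both sides of the surface.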
Returning now to the general theory of electrostatic phenomena, let us calculate the potential
energy U of an arbitrary system of point electric charges qk. Despite the apparently simple relation (31)
between U and φ, the result is not that straightforward. Indeed, let us assume that the charge distribution
has a finite spatial extent, so at large distances from it (formally, at r = ∞) the electric field tends to zero,
so the electrostatic potential tends to a constant. Selecting this constant, for convenience, to equal zero,
we may calculate U as a sum of the energy increments ΔUk created by bringing the charges, one by one,
from infinity to their final positions rk – see Fig. 6.21 According to the integral form of Eq. (32), such a
contribution is
$$\Delta U_k = -\int_{\infty}^{\mathbf{r}_k} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = -q_k \int_{\infty}^{\mathbf{r}_k} \mathbf{E}(\mathbf{r}) \cdot d\mathbf{r} = q_k\, \phi(\mathbf{r}_k), \tag{1.52}$$
21 Indeed, by the very definition of the potential energy of a system, it should not depend on the way we are
arriving at its final configuration.
where $\mathbf{E}(\mathbf{r})$ is the total electric field, and $\phi(\mathbf{r})$ is the total electrostatic potential during this process,
excluding the field created by the very charge $q_k$ that is being moved.
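Fig. 1.6. Deriving Eqs. (55) and (60) for potential energies of a system of several point charges. (Figure labels: external charges creating the field $\mathbf{E}_{ext}(\mathbf{r})$; the charge $q_k$ being brought from infinity to the point $\mathbf{r}_k$; the charges $q_1, q_2, q_3, \ldots, q_{k'}$ at the points $\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \ldots, \mathbf{r}_{k'}$, with $k' < k$, forming the system of charges under analysis.)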
This expression shows that the increment $\Delta U_k$, and hence the total potential energy $U$, depends
on the source of the electric field $\mathbf{E}$. If the field is dominated by an external field $\mathbf{E}_{ext}$, induced by some
external charges, not being a part of the charge configuration under our analysis (whose energy we are
calculating, see Fig. 6), then the spatial distribution $\phi(\mathbf{r})$ is determined by this field, i.e. does not depend
on how many charges we have already brought in, so Eq. (52) is reduced to
$$\Delta U_k = q_k\phi_{ext}(\mathbf{r}_k), \quad \text{where } \phi_{ext}(\mathbf{r}) \equiv -\int_\infty^{\mathbf{r}}\mathbf{E}_{ext}(\mathbf{r}')\cdot d\mathbf{r}'. \tag{1.53}$$
Summing up these contributions, we get what is called the charge system’s energy in the external
field:$^{22}$
$$U_{ext} = \sum_k \Delta U_k = \sum_k q_k\phi_{ext}(\mathbf{r}_k). \tag{1.54}$$
Now repeating the argumentation that has led us to Eq. (9), we see that for a continuously distributed
charge, this sum turns into an integral:
Energy: external field
$$U_{ext} = \int\rho(\mathbf{r})\,\phi_{ext}(\mathbf{r})\,d^3r. \tag{1.55}$$
(As was discussed above, using the delta-functional representation of point charges, we may always
return from here to Eq. (54), so Eq. (55) may be considered as a final, universal result.)
The result is different in the opposite limit when the electric field $\mathbf{E}(\mathbf{r})$ is created only by the very
charges whose energy we are calculating. In this case, $\phi(\mathbf{r}_k)$ in Eq. (52) is the potential created only by
the charges with numbers $k' = 1, 2, \ldots, (k - 1)$ that are already in place when the $k$th charge is moved in
(in Fig. 6, the charges inside the dashed boundary), and we may use the linear superposition principle to
write
$$\Delta U_k = q_k\sum_{k'<k}\phi_{k'}(\mathbf{r}_k), \quad \text{so that } U = \sum_k \Delta U_k = \sum_{\substack{k,k'\\(k'<k)}} q_k\phi_{k'}(\mathbf{r}_k). \tag{1.56}$$
This result is so important that it is worthy of rewriting in several other forms. First, we may use Eq.
(35) to represent Eq. (56) in a more symmetric form:
22 An alternative, perhaps more accurate term for Uext is the energy of the system’s interaction with the external
field.
$$U = \frac{1}{4\pi\varepsilon_0}\sum_{\substack{k,k'\\(k'<k)}}\frac{q_k q_{k'}}{\left|\mathbf{r}_k - \mathbf{r}_{k'}\right|}. \tag{1.57}$$
The expression under this sum is evidently symmetric with respect to the index swap, so it may be
extended into a different form,
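$$U = \frac{1}{4\pi\varepsilon_0}\,\frac{1}{2}\sum_{\substack{k,k'\\(k'\ne k)}}\frac{q_k q_{k'}}{\left|\mathbf{r}_k - \mathbf{r}_{k'}\right|}, \tag{1.58}$$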
where the interaction between each couple of charges is described by two equal terms under the sum,
and the front coefficient ½ is used to compensate for this double-counting. The convenience of the last
form is that it may be readily generalized to the continuous case:
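$$U = \frac{1}{4\pi\varepsilon_0}\,\frac{1}{2}\int d^3r\int d^3r'\,\frac{\rho(\mathbf{r})\rho(\mathbf{r}')}{\left|\mathbf{r} - \mathbf{r}'\right|}. \tag{1.59}$$
(As before, in this case, the restriction expressed in the discrete charge case as $k \ne k'$ is not important,
because if the charge density is a continuous function, the integral (59) does not diverge at point $\mathbf{r} = \mathbf{r}'$.)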
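The equivalence of the ordered sum (57) and the symmetrized sum (58) is easy to verify numerically. The sketch below is only an illustration: the test system (three equal charges at the corners of an equilateral triangle) and all function names are assumptions of this example, not something taken from the text.

```python
import math
from itertools import combinations

EPS0 = 8.8541878128e-12              # vacuum permittivity, F/m
K = 1.0 / (4.0 * math.pi * EPS0)     # the constant 1/(4*pi*eps0)

def energy_ordered(charges):
    """Eq. (1.57): each pair (k' < k) is counted exactly once."""
    return K * sum(q1 * q2 / math.dist(r1, r2)
                   for (q1, r1), (q2, r2) in combinations(charges, 2))

def energy_symmetric(charges):
    """Eq. (1.58): all k' != k are summed, with the compensating factor 1/2."""
    total = 0.0
    for i, (qi, ri) in enumerate(charges):
        for j, (qj, rj) in enumerate(charges):
            if i != j:
                total += qi * qj / math.dist(ri, rj)
    return 0.5 * K * total

# Assumed test system: three equal charges q at the corners of an equilateral
# triangle with side a, for which the energy is simply U = 3*K*q**2/a.
q, a = 1e-9, 0.1
triangle = [(q, (0.0, 0.0, 0.0)),
            (q, (a, 0.0, 0.0)),
            (q, (a / 2, a * math.sqrt(3) / 2, 0.0))]

print(energy_ordered(triangle), energy_symmetric(triangle), 3 * K * q**2 / a)
```

All three printed numbers should coincide up to rounding errors.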
To represent this result in one more form, let us notice that according to Eq. (38), the inner
integral over $\mathbf{r}'$ in Eq. (59), divided by $4\pi\varepsilon_0$, is just the full electrostatic potential at point $\mathbf{r}$, and hence
Energy: charge interaction
$$U = \frac{1}{2}\int\rho(\mathbf{r})\,\phi(\mathbf{r})\,d^3r. \tag{1.60}$$
For the discrete charge case, this result reads
$$U = \frac{1}{2}\sum_k q_k\phi(\mathbf{r}_k), \tag{1.61}$$
but here it is important to remember that the “full” potential’s value $\phi(\mathbf{r}_k)$ should exclude the (infinite)
contribution from the point charge $k$ itself. Comparing the last two formulas with Eqs. (54) and (55), we
see that the electrostatic energy of charge interaction within the system, as expressed via the charge-by-potential
product, is half of the energy of charge interaction with a fixed (“external”)
field. This is the result of the fact that in the case of mutual interaction of the charges, the electric field $\mathbf{E}$
in the basic Eq. (52) is proportional to the charge’s magnitude, rather than constant.$^{23}$
Now we are ready to address an important conceptual question: can we locate this interaction
energy in space? This task may seem trivial: Eqs. (58)-(61) seem to imply that non-zero contributions to
U come only from the regions where the electric charges are located. However, one of the most beautiful
features of physics is that sometimes completely different interpretations of the same mathematical
result are possible. To get an alternative view of our current result, let us write Eq. (60) for a volume V
so large that the electric field on the limiting surface S is negligible, and plug into it the charge density
expressed from the Poisson equation (41):
$$U = -\frac{\varepsilon_0}{2}\int_V\phi\,\nabla^2\phi\,d^3r. \tag{1.62}$$
23 The nature of this additional factor ½ is absolutely the same as in the well-known formula $U = (½)\kappa x^2$ for the
potential energy of an elastic spring providing the returning force $F = -\kappa x$, proportional to its displacement $x$ from
the equilibrium position.
certainly invites an interpretation very much different than Eq. (60): it is natural to consider u(r) as the
spatial density of the electric field energy, which is continuously distributed over all the space where the
field exists – rather than just its part where the charges are located.
Let us have a look at how these two alternative pictures work for our testbed problem, a
uniformly charged sphere. If we start with Eq. (60), we may limit the integration to the sphere volume
($0 \le r \le R$) where $\rho \ne 0$. Using Eq. (51), and the spherical symmetry of the problem (giving $d^3r = 4\pi r^2dr$), we get
$$U = \frac{1}{2}\,4\pi\int_0^R\rho\,\phi\,r^2dr = \frac{1}{2}\,4\pi\rho\,\frac{Q}{4\pi\varepsilon_0 R}\int_0^R\left(\frac{R^2 - r^2}{2R^2} + 1\right)r^2dr = \frac{6}{5}\,\frac{1}{4\pi\varepsilon_0}\,\frac{Q^2}{2R}. \tag{1.66}$$
On the other hand, if we use Eq. (65), we need to integrate the energy density everywhere, i.e. both
inside and outside of the sphere:
$$U = 4\pi\,\frac{\varepsilon_0}{2}\left[\int_0^R E^2r^2dr + \int_R^\infty E^2r^2dr\right]. \tag{1.67}$$
Using Eqs. (19) and (22) for, respectively, the external and internal regions, we get
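$$U = 4\pi\,\frac{\varepsilon_0}{2}\left[\int_0^R\left(\frac{Qr}{4\pi\varepsilon_0R^3}\right)^2r^2dr + \int_R^\infty\left(\frac{Q}{4\pi\varepsilon_0r^2}\right)^2r^2dr\right] = \left(\frac{1}{5} + 1\right)\frac{1}{4\pi\varepsilon_0}\,\frac{Q^2}{2R}. \tag{1.68}$$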
This is (fortunately :-) the same answer as given by Eq. (66), but to some extent, Eq. (68) is more
informative because it shows how exactly the electric field’s energy is distributed between the interior
and exterior of the charged sphere.26
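The agreement between Eqs. (66) and (68) may also be checked by direct numerical integration. The following Python sketch is a minimal illustration under stated assumptions: the values of Q and R are arbitrary, the function names are hypothetical, and the infinite outer integral is handled with the substitution s = 1/r (a choice made here for convenience, not taken from the text).

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def energy_from_charge(Q, R):
    """Eq. (1.60): U = (1/2) * integral of rho*phi over the sphere's volume."""
    rho = 3 * Q / (4 * math.pi * R**3)
    def phi(r):  # Eq. (1.51)
        return Q / (4 * math.pi * EPS0 * R) * ((R**2 - r**2) / (2 * R**2) + 1)
    return 0.5 * 4 * math.pi * simpson(lambda r: rho * phi(r) * r**2, 0.0, R)

def energy_from_field(Q, R):
    """Eq. (1.67): field energy, returned separately for r < R and r > R."""
    k = Q / (4 * math.pi * EPS0)
    inside = simpson(lambda r: (k * r / R**3)**2 * r**2, 0.0, R)
    # Outside, E = k/r**2; with s = 1/r, E**2 * r**2 * dr becomes k**2 * ds
    # on the finite interval 0 <= s <= 1/R.
    outside = simpson(lambda s: k**2, 0.0, 1.0 / R)
    prefactor = 4 * math.pi * EPS0 / 2
    return prefactor * inside, prefactor * outside

Q, R = 1e-9, 0.1   # assumed illustrative values
u_in, u_out = energy_from_field(Q, R)
u_exact = (6 / 5) * Q**2 / (4 * math.pi * EPS0 * 2 * R)
print(energy_from_charge(Q, R), u_in + u_out, u_exact)
print("interior/exterior energy ratio:", u_in / u_out)
```

The last printed ratio should be close to 1/5, reproducing the split between the two terms in the parentheses of Eq. (68).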
24 This transformation follows from the divergence theorem MA Eq. (12.2) applied to the vector function $\mathbf{f} = \phi\nabla\phi$,
taking into account the differentiation rule MA Eq. (11.4a): $\nabla\cdot(\phi\nabla\phi) = (\nabla\phi)\cdot(\nabla\phi) + \phi\nabla\cdot(\nabla\phi) = (\nabla\phi)^2 + \phi\nabla^2\phi$.
25 In the Gaussian units, the standard replacement $\varepsilon_0 \to 1/4\pi$ turns the last of Eqs. (65) into $u(\mathbf{r}) = E^2/8\pi$.
26 Note that $U \to \infty$ at $R \to 0$. Such divergence appears at the application of Eq. (65) to any point charge. Since it
does not affect the force acting on the charge, the divergence does not create any technical difficulty for analysis
of charge statics or non-relativistic dynamics, but it points to a possible conceptual problem of classical
electrodynamics as a whole at describing point charges. This issue will be discussed at the very end of the course
(Sec. 10.6).
We see that, as we could expect, within the realm of electrostatics, Eqs. (60) and (65) are
equivalent. However, when we examine electrodynamics (in Chapter 6 and beyond), we will see that the
latter equation is more general and that it is more adequate to associate the electric energy with the field
itself rather than its sources – in our current case, the electric charges.
Finally, let us calculate the potential energy of a system of charges in the general case when both
the internal interaction of the charges and their interaction with an external field are important. One
might fancy that such a calculation should be very hard since, in both ultimate limits, when one of these
interactions dominates, we have gotten different results. However, once again we get help from the
almighty linear superposition principle: in the general case, for the total electric field we may write
$$\mathbf{E}(\mathbf{r}) = \mathbf{E}_{int}(\mathbf{r}) + \mathbf{E}_{ext}(\mathbf{r}), \tag{1.69}$$
where the index “int” now marks the field induced by the charge system under analysis, i.e. the variables
participating (without indices) in Eqs. (56)-(65). Now let us imagine that our system is being built up in
the following way: first, the charges are brought together at Eext = 0, giving the potential energy Uint
expressed by Eq. (60), and then Eext is slowly increased. Evidently, the energy contribution from the
latter process cannot depend on the internal interaction of the charges, and hence may be expressed in
the form (55). As a result, the total potential energy27 is the sum of these two components:
$$U = U_{int} + U_{ext} = \frac{1}{2}\int\rho(\mathbf{r})\,\phi_{int}(\mathbf{r})\,d^3r + \int\rho(\mathbf{r})\,\phi_{ext}(\mathbf{r})\,d^3r. \tag{1.70}$$
Now making the transition from the potentials to the fields, absolutely similar to that performed in Eqs.
(62)-(65), we may rewrite this expression as
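$$U = \int u(\mathbf{r})\,d^3r, \quad \text{with } u(\mathbf{r}) = \frac{\varepsilon_0}{2}\left[E_{int}^2(\mathbf{r}) + 2\mathbf{E}_{int}(\mathbf{r})\cdot\mathbf{E}_{ext}(\mathbf{r})\right]. \tag{1.71}$$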
One might think that this result, more general than Eq. (65) and perhaps less familiar to the
reader, is something entirely new; however, it is not. Indeed, let us add $E_{ext}^2(\mathbf{r})$ to, and subtract it from, the
sum in the brackets, and use Eq. (69) for the total electric field $\mathbf{E}(\mathbf{r})$; then Eq. (71) takes the form
$$U = \frac{\varepsilon_0}{2}\int E^2(\mathbf{r})\,d^3r - \frac{\varepsilon_0}{2}\int E_{ext}^2(\mathbf{r})\,d^3r. \tag{1.72}$$
Hence, in the most important case when we are using the potential energy to analyze the statics and
dynamics of a system of charges in a fixed external field, i.e. when the second term on the right-hand
side of Eq. (72) may be considered as a constant, we may still use for U an expression similar to the
familiar Eq. (65), but with the field E(r) being the sum (69) of the internal and external fields.
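The algebraic step from Eq. (71) to Eq. (72) is a point-by-point identity between the two integrands, so it may be spot-checked with arbitrary field vectors. The following Python sketch does just that; the random test vectors and the function names are assumptions of this illustration, and both integrands are expressed in units of ε0/2.

```python
import random

def dot(a, b):
    """Scalar product of two 3-component vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def integrand_eq71(e_int, e_ext):
    """Bracket of Eq. (1.71): E_int^2 + 2 * E_int . E_ext."""
    return dot(e_int, e_int) + 2 * dot(e_int, e_ext)

def integrand_eq72(e_int, e_ext):
    """Integrand of Eq. (1.72): E^2 - E_ext^2, with E = E_int + E_ext."""
    e_total = [a + b for a, b in zip(e_int, e_ext)]
    return dot(e_total, e_total) - dot(e_ext, e_ext)

random.seed(0)  # fixed seed, so that the check is reproducible
max_difference = 0.0
for trial in range(100):
    e_int = [random.uniform(-1.0, 1.0) for _ in range(3)]
    e_ext = [random.uniform(-1.0, 1.0) for _ in range(3)]
    difference = abs(integrand_eq71(e_int, e_ext) - integrand_eq72(e_int, e_ext))
    max_difference = max(max_difference, difference)

print("largest mismatch between the two integrands:", max_difference)
```

The printed mismatch should be at the level of floating-point rounding errors.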
Let us see how this works in a very simple situation. A uniform external electric field Eext is
applied normally to a very broad, plane layer that contains a very large and equal number of free electric
charges of both signs – see Fig. 7. What is the equilibrium distribution of the charges over the layer?
27 This total $U$ (or rather its part dependent on our system of charges) is sometimes called the Gibbs potential
energy of the system. (I will discuss this notion in detail in Sec. 3.5.)
Fig. 1.7. A simple model of the electric field screening in a conductor. Here (and in all figures below) the red and blue colors are used to denote the opposite charge signs. (Figure labels: $\mathbf{E}_{ext}$, $\mathbf{E}_{int}$.)
Since any area-uniform distribution of the charge inside the layer does not affect the field (and
hence its energy) outside it, and the equilibrium distribution has to minimize the total potential energy of
the system, Eq. (72) immediately gives the answer: the distribution should provide $\mathbf{E} \equiv \mathbf{E}_{int} + \mathbf{E}_{ext} = 0$
inside the whole layer – the effect called the electric field screening. The only way to ensure this
equality is to have enough free charges of opposite signs residing on the layer’s surfaces to induce a
uniform field $\mathbf{E}_{int} = -\mathbf{E}_{ext}$, exactly compensating the external field at each point inside the layer – see
Fig. 7. According to Eq. (24), the areal density of these surface charges should equal $\pm\sigma$, with $\sigma = \varepsilon_0E_{ext}$.
This is a rudimentary but reasonable model of conductors’ polarization – to be discussed in
detail in the next chapter.
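The energy-minimum argument may be illustrated with a short numerical scan. The sketch below assumes the simplest possible trial distribution: surface charges ±σ on the two faces of the layer, producing the internal field –σ/ε0 inside it and no field outside. The field strength, the layer thickness, and all names are arbitrary assumptions of this example.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m

def layer_energy_per_area(sigma, e_ext, thickness):
    """Eq. (1.72) per unit area of the layer.

    Outside the layer, E = E_ext, so the two integrands cancel; inside it,
    the trial surface charges +/-sigma give E = E_ext - sigma/eps0.
    """
    e_inside = e_ext - sigma / EPS0
    return 0.5 * EPS0 * (e_inside**2 - e_ext**2) * thickness

# Assumed illustrative values: a 100 kV/m field applied to a 1-mm-thick layer.
e_ext, thickness = 1.0e5, 1.0e-3
sigma_expected = EPS0 * e_ext   # the screening value, sigma = eps0 * E_ext

# Scan the trial densities from 0 to twice the expected value.
grid = [2 * sigma_expected * i / 200 for i in range(201)]
sigma_best = min(grid, key=lambda s: layer_energy_per_area(s, e_ext, thickness))

print("energy-minimizing sigma:", sigma_best, "C/m^2")
print("eps0 * E_ext           :", sigma_expected, "C/m^2")
```

The scan should pick the grid point at which the field inside the layer vanishes, in agreement with the screening condition.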
1.1. Calculate the electric field of a thin, long, straight filament, electrically charged with a
constant linear density $\lambda$, by using two approaches:
(i) directly from the Coulomb law, and
(ii) from the Gauss law.
1.3. Calculate the electric field of the following spherically symmetric charge distribution: $\rho(r) = \rho_0\exp\{-\lambda r\}$.
1.4. A sphere of radius $R$, whose volume had been charged with a constant density $\rho$, is split with
a very narrow planar gap passing through its center. Calculate the force of the mutual electrostatic
repulsion of the resulting two hemispheres.
1.5. A thin spherical shell of radius $R$, which had been charged with a constant areal density $\sigma$, is
split into two equal halves with a very thin planar cut passing through the sphere’s center. Calculate the
force of electrostatic repulsion between the resulting hemispheric shells, and compare the result with
that of the previous problem.
1.6. Calculate the spatial distribution of the electrostatic potential created by a straight thin
filament of a finite length $2l$, charged with a constant linear density $\lambda$, and explore the result in the limits
of very small and very large distances from the filament.
1.7. A thin planar sheet, perhaps of an irregular shape, carries an electric charge with a constant
areal density $\sigma$.
(i) Express the electric field’s component normal to the plane, at a certain distance from it, via
the solid angle $\Omega$ at which the sheet is visible from the observation point.
(ii) Use the result to calculate the field in the center of a cube with one face charged with a
constant density $\sigma$.
1.8. Can one create, in an extended region of space, electrostatic fields with the Cartesian
components proportional to the following products of the Cartesian coordinates {x, y, z}:
(i) yz, xz, xy,
(ii) xy, xy, yz?
1.9. Distant sources have been used to create different uniform electrostatic fields in two half-spaces:
$$\mathbf{E}(\mathbf{r})\Big|_{r \gg R} = \mathbf{n}_z\begin{cases}E_-, & \text{at } z < 0,\\ E_+, & \text{at } z > 0,\end{cases}$$
except for a transitional region of scale $R$ near the origin, where the field is
perturbed but still axially symmetric. (As will be discussed in the next
chapter, this may be done, for example, using a thin conducting membrane
with a round hole of radius $R$ in it – see the figure on the right.) Prove that
such a field may serve as an electrostatic lens for charged particles flying along the z-axis, at distances
$\rho \ll R$ from it, and calculate the focal distance $f$ of this lens. Spell out the conditions of validity of your
result.
1.10. Eight equal point charges $q$ are located in the corners of a cube of
side $a$. Calculate all Cartesian components $E_j$ of the electric field, and their spatial
derivatives $\partial E_j/\partial r_{j'}$, in the cube’s center, where $r_j$ are the Cartesian coordinates
oriented along the cube’s sides – see the figure on the right. Are all of your results
valid for the center of a planar square, with four equal charges at its corners?
1.11. By a direct calculation, find the average electric potential of a spherical surface of radius R,
created by a point charge q located at a distance r > R from the sphere’s center. Use the result to prove
the following general mean value theorem: the electric potential at any point is always equal to its
average value on any spherical surface with the center at that point while containing no electric charges
inside it.
1.14. A thin, flat, rectangular sheet of size $a\times b$ is electrically charged with a constant areal
density $\sigma$. Without an explicit calculation of the spatial distribution $\phi(\mathbf{r})$ of the electrostatic potential
induced by this charge, find the ratio of its values in the center and in the corners of the rectangle.
Hint: Consider partitioning the rectangle into several similar parts and using the linear
superposition principle.
1.15. Calculate the electrostatic energy per unit area of the system of two thin, parallel planes
with equal and opposite charges of a constant areal density $\sigma$, separated by distance $d$.
1.16. The system analyzed in the previous problem (two thin,
parallel, oppositely charged planes) is now placed into an external,
uniform, normal electric field $E_{ext} = \sigma/\varepsilon_0$ – see the figure on the right. Find
the force (per unit area) acting on each plane, by two methods:
(i) directly from the electric field distribution, and
(ii) from the potential energy of the system.
1.17. Explore the relationship between the Laplace equation (42) and the minimum of the
electrostatic field energy (65).
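1.18. Prove the following reciprocity theorem of electrostatics:$^{28}$ if two spatially-confined charge
distributions $\rho_1(\mathbf{r})$ and $\rho_2(\mathbf{r})$ create, respectively, electrostatic potentials $\phi_1(\mathbf{r})$ and $\phi_2(\mathbf{r})$, then
$$\int\rho_1(\mathbf{r})\,\phi_2(\mathbf{r})\,d^3r = \int\rho_2(\mathbf{r})\,\phi_1(\mathbf{r})\,d^3r.$$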
1.19. Calculate the energy of the electrostatic interaction of two spheres, of radii $R_1$ and $R_2$, each
with a spherically symmetric charge distribution, separated by distance $d > R_1 + R_2$.
1.20. Calculate the electrostatic energy $U$ of a (generally, thick) spherical shell,
with charge $Q$ uniformly distributed through its volume – see the figure on the right.
Interpret the dependence of $U$ on the inner cavity’s radius $R_1$, at fixed $Q$ and $R_2$.
28 This is only the simplest one of several reciprocity theorems in electromagnetism – see, e.g., Sec. 6.8 below.
Fig. 2.1. Two typical electrostatic situations involving conductors: (a) polarization by an external field, and (b) re-distribution of the conductor’s own charge over its surface – schematically. Here and below, the red and blue points denote charges of opposite signs.
The full solution of such problems should satisfy not only the fundamental Eq. (1.7) but also the
so-called constitutive relations between the macroscopic variables describing the sample’s material.
Due to the atomic character of real materials, such relations may be very involved. In this part of my
series, I will have time to address these relations, for various materials, only rather superficially,1
focusing on their simple approximations. Fortunately, in most practical cases such approximations work
very well.
In particular, for the polarization of good conductors, a very reasonable approximation is given
by the so-called macroscopic model, in which the free charges in the conductor are treated as a charged
continuum that is free to move under the effect of the force F = qE exerted by the macroscopic electric
field E, i.e. the field averaged over space on the atomic scale – see also the discussion at the end of Sec.
1A more detailed discussion of the electrostatic field screening may be found, e.g., in SM Sec. 6.4. (Alternatively,
see either Sec. 13.5 of J. Hook and H. Hall, Solid State Physics, 2nd ed., Wiley, 1991; or Chapter 17 of N.
Ashcroft and N. Mermin, Solid State Physics, Brooks Cole, 1976.)
1.1. In electrostatics (which excludes the case of dc currents, to be discussed in Chapter 4 below), there
should be no such motion, so everywhere inside the conductor the macroscopic electric field should
vanish:
Conductor: macroscopic model
$$\mathbf{E} = 0. \tag{2.1a}$$
This is the electric field screening$^{2}$ effect, meaning, in particular, that conductors’ polarization in an
external electric field has the extreme form shown (rather schematically) in Fig. 1a, with the field of the
induced surface charges completely compensating the external field in the conductor’s bulk. Note that
Eq. (1a) may be rewritten in another, frequently more convenient form:
$$\phi = \text{const}, \tag{2.1b}$$
where $\phi$ is the macroscopic electrostatic potential related to the macroscopic field by Eq. (1.33).$^{3}$ (If a
problem includes several unconnected conductors, the constant in Eq. (1b) may be specific for each of
them.)
Now let us examine what we can say about the electric field in free space just outside a
conductor, within the same macroscopic model. At close proximity, any smooth surface (in our current
case, that of a conductor) looks planar. Let us integrate Eq. (1.28) over a narrow ($d \ll l$) rectangular
loop $C$ encircling a part of such a plane conductor’s surface (see the dashed line in Fig. 2a), and apply to
the electric field vector $\mathbf{E}$ the well-known vector algebra equality, the Stokes theorem:$^{4}$
$$\int_S\left(\nabla\times\mathbf{E}\right)_n d^2r = \oint_C\mathbf{E}\cdot d\mathbf{r}. \tag{2.2}$$
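Fig. 2.2. (a) The surface charge layer at a conductor’s surface, and (b) the electric field lines and equipotential surfaces near it. (Figure labels: free space with the field $\mathbf{E}$; the conductor, with $\mathbf{E} = 0$ and $\phi$ = const; the rectangular contour $C$ of length $l$ and width $d$.)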
In our current case, the contour is dominated by two straight lines of length $l$, so if $l$ is much
smaller than the characteristic spatial scale of the field’s changes but much larger than the interatomic
distances, the right-hand side of Eq. (2) may be well approximated as $[(E_\tau)_{in} - (E_\tau)_{out}]l$, where $E_\tau$ is the
tangential component of the corresponding macroscopic field, parallel to the surface. On the other hand,
according to Eq. (1.28), the left-hand side of Eq. (2) equals zero. Hence, the macroscopic field’s
2 This term, used for the electric field, should not be confused with shielding – the term used for the description of
magnetic field’s reduction by magnetic materials – see Chapter 5 below.
3 Since averaging of a function over space is a linear operation, any linear relation between genuine (microscopic)
variables, including Eq. (1.33), is also valid for the corresponding macroscopic variables.
4 See, e.g., MA Eq. (12.1).
component $E_\tau$ should be continuous at the surface, and to satisfy Eq. (1a) inside the conductor, the
component has to vanish immediately outside it: $(E_\tau)_{out} = 0$. This means that the electrostatic potential
immediately outside of a conducting surface cannot change along it. In other words, the equipotential
surfaces outside a conductor should “lean” to the conductor’s surface, with their potential values
approaching the constant potential of the conductor – see Fig. 2b.
So, the electrostatic field just outside any conductor has to be normal to its surface. To find this
normal field, we may apply the universal relation (1.24) to our macroscopic field $\mathbf{E}$. Since in our current
case $E_n = 0$ inside the conductor, we get
Surface charge density
$$\sigma = \varepsilon_0\left(E_n\right)_{out} = -\varepsilon_0\left(\nabla\phi\right)_n \equiv -\varepsilon_0\frac{\partial\phi}{\partial n}, \tag{2.3}$$
where $\sigma$ is the macroscopic areal density of the conductor’s surface charge. Note that deriving this
universal relation between the normal component of the field and the surface charge density, we have
not used any cause-vs-effect arguments, so Eq. (3) is valid regardless of whether the surface charge is
induced by an externally applied field (as in the case of conductor’s polarization, shown in Fig. 1a), or
the electric field is induced by the electric charge placed on the conductor and then self-redistributed
over its surface (Fig. 1b), or it is some combination of both effects.
Before starting to use the macroscopic model for the solution of particular problems of
electrostatics, let me use the balance of this section to briefly discuss its limitations. (The reader in a
rush may skip this discussion and proceed to Sec. 2; however, I believe that every educated physicist has
to understand when this model works, and when it does not.)
Since the argumentation which has led us to Eq. (1.24) and hence to Eq. (3) is valid for any
thickness d of the Gauss pillbox, within the macroscopic model, the whole surface charge is located
within an infinitely thin surface layer. This is of course impossible physically: for one, this would
require an infinite volumic density of the charge. In reality, the charged layer (and hence the region of
the electric field’s crossover from the finite value (3) to zero) has a nonzero thickness $\lambda$. At least three
effects contribute to $\lambda$.
(i) Atomic structure of matter. Within each atom, and frequently between the adjacent atoms as
well, the genuine (“microscopic”) electric field is highly non-uniform. Thus, as was already stated
above, Eq. (1) is valid only for the macroscopic field, i.e. the field averaged over distances of the order
of the atomic size scale $a_0 \sim 10^{-10}$ m,$^{5}$ and cannot be applied to the field changes on that scale. As a
result, the surface layer of charges cannot be much thinner than $a_0$.
(ii) Thermal excitation. According to Eq. (1.9), in the whole field-free bulk of a conductor, the
net charge density, $\rho = e(n - n_e)$,$^{6}$ has to vanish, so the numbers of protons in atomic nuclei ($n$) and
electrons ($n_e$) per unit volume have to be balanced. However, if an external electric field penetrates a
conductor, free electrons can shift in or out of its affected part, depending on the field’s contribution to
their potential energy, $U = q_e\phi = -e\phi$. (Here the arbitrary constant in $\phi$ is chosen to give $\phi = 0$ well
inside the conductor.) In classical statistics, this change is described by the Boltzmann distribution:$^{7}$
5 This scale originates from the quantum-mechanical effects of electron motion, characterized by the Bohr radius
$r_B \equiv \hbar^2/m_e(e^2/4\pi\varepsilon_0) \approx 0.53\times10^{-10}$ m – see, e.g., QM Eq. (1.10). It also defines the scale $E_B = e/4\pi\varepsilon_0r_B^2 \sim 10^{12}$ SI
units (V/m) of the microscopic electric fields inside atoms. (Please note how large these fields are.)
6 In this series, $e$ denotes the fundamental charge, $e \approx 1.6\times10^{-19}$ C > 0, so that the electron’s charge equals $(-e)$.
7 See, e.g., SM Sec. 3.1.
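$$n_e(\mathbf{r}) = n\exp\left\{-\frac{U(\mathbf{r})}{k_BT}\right\}, \tag{2.4}$$
where $T$ is the absolute temperature in kelvins (K), and $k_B \approx 1.38\times10^{-23}$ J/K is the Boltzmann constant.
As a result, the net charge density is
$$\rho(\mathbf{r}) = en\left(1 - \exp\left\{\frac{e\phi(\mathbf{r})}{k_BT}\right\}\right). \tag{2.5}$$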
The penetrating electric field polarizes the atoms as well. As will be discussed in the next chapter, such
polarization results in the reduction of the electric field by a material-specific dimensionless factor $\kappa$
(larger, but typically not too much larger than 1), called the dielectric constant. As a result, the Poisson
equation (1.41) takes the so-called Poisson-Boltzmann form,$^{8}$
$$\frac{d^2\phi}{dz^2} = -\frac{\rho}{\kappa\varepsilon_0} = \frac{en}{\kappa\varepsilon_0}\left(\exp\left\{\frac{e\phi}{k_BT}\right\} - 1\right), \tag{2.6}$$
where we have taken advantage of the 1D geometry of the system to simplify the Laplace operator, with
the z-axis normal to the surface.
Even with this simplification, Eq. (6) is a nonlinear differential equation allowing an analytical
but rather bulky solution. Since our current goal is just to estimate the field penetration depth $\lambda$, let us
simplify the equation further by considering the low-field limit: $e|\phi| \sim e|E|\lambda \ll k_BT$. In this limit, we
may expand the exponential function into the Taylor series, and keep only two leading terms (of which the first one
cancels with the following unity). As a result, Eq. (6) becomes linear,
$$\frac{d^2\phi}{dz^2} = \frac{en}{\kappa\varepsilon_0}\,\frac{e\phi}{k_BT}, \quad \text{i.e.} \quad \frac{d^2\phi}{dz^2} = \frac{1}{\lambda^2}\,\phi, \tag{2.7}$$
where the constant $\lambda$, in this case, is called the Debye (or “Debye-Hückel”) screening length $\lambda_D$:
Debye screening length
$$\lambda_D^2 \equiv \frac{\kappa\varepsilon_0k_BT}{e^2n}. \tag{2.8}$$
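To get a feeling of how restrictive the low-field condition behind Eq. (7) is, one may compare the exact and the linearized right-hand sides of Eq. (6) for several values of the dimensionless potential x = eφ/kBT. The following Python sketch is just such an arithmetic illustration; the sample values of x and the function names are assumptions of this example.

```python
import math

def rhs_exact(x):
    """Dimensionless right-hand side of Eq. (2.6): exp(x) - 1."""
    return math.expm1(x)

def rhs_linear(x):
    """Its low-field (Taylor) approximation used in Eq. (2.7): just x."""
    return x

def relative_error(x):
    """Relative error of the linearization at a given x = e*phi/(k_B*T)."""
    return abs(rhs_linear(x) - rhs_exact(x)) / abs(rhs_exact(x))

for x in (0.01, 0.1, 0.5, 1.0):
    print(f"x = {x:4.2f}: relative error of linearization = {relative_error(x):.3f}")
```

For small x, the error is close to x/2, so the linear Eq. (7) becomes a poor approximation as soon as eφ approaches kBT.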
As the reader certainly knows, Eq. (7) describes an exponential decrease of the electric potential,
with the characteristic length $\lambda_D$: $\phi \propto \exp\{-z/\lambda_D\}$, where the z-axis is directed into the conductor.
Plugging the involved fundamental constants into Eq. (8), we get the following estimate: $\lambda_D[\text{m}] \approx 70\,(\kappa T[\text{K}]/n[\text{m}^{-3}])^{1/2}$.
According to this formula, in semiconductors at room temperature, the Debye
length may be rather substantial. For example, in silicon ($\kappa \approx 12$) doped to the free charge carrier
concentration $n = 3\times10^{18}$ cm$^{-3}$ (the value typical for modern integrated circuits),$^{9}$ $\lambda_D \approx 2$ nm, still well
8 This equation and/or its straightforward generalization to the case of charged particles (ions) of several kinds is
also (especially in the theories of electrolytes and plasmas) called the Debye-Hückel equation.
9 There is a good reason for making an estimate of $\lambda_D$ for this case: the electric field created by the gate electrode
of a field-effect transistor, penetrating into doped silicon by a depth ~$\lambda_D$, controls the electric current in this most
important electronic device – on whose back all our information technology rides. Because of that, $\lambda_D$ establishes
the possible scale of semiconductor circuit shrinking, which is the basis of the well-known Moore’s law.
(Practically, the scale is determined by integrated circuit patterning techniques, and Eq. (8) may be used to find
the proper charge carrier density $n$ and hence the necessary level of silicon doping – see, e.g., SM Sec. 6.4.)
above the atomic size scale $a_0$, thus justifying the estimate. However, for typical good metals ($n \sim 10^{29}$
m$^{-3}$, $\kappa \sim 10$) the same formula gives $\lambda_D \sim 10^{-11}$ m, less than $a_0$. In this case, Eq. (8) should not be taken
literally, because it is based on the assumption of a continuous charge distribution.
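The numerical estimates above are easy to reproduce. The following Python sketch evaluates Eq. (8) directly in SI units; the room temperature is taken as 300 K (an assumption of this example), and the function and variable names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # fundamental charge, C

def debye_length(kappa, temperature, density):
    """Eq. (2.8): lambda_D = sqrt(kappa*eps0*k_B*T / (e**2 * n)), all in SI units."""
    return math.sqrt(kappa * EPS0 * K_B * temperature / (E_CHARGE**2 * density))

T_ROOM = 300.0  # K; an assumed value for "room temperature"

# Numerical prefactor of the estimate lambda_D[m] ~ 70*(kappa*T[K]/n[m^-3])**0.5
prefactor = math.sqrt(EPS0 * K_B) / E_CHARGE

# Doped silicon: kappa ~ 12, n = 3e18 cm^-3 = 3e24 m^-3
lambda_silicon = debye_length(12.0, T_ROOM, 3e18 * 1e6)

# Typical good metal: kappa ~ 10, n ~ 1e29 m^-3
lambda_metal = debye_length(10.0, T_ROOM, 1e29)

print(f"prefactor       : {prefactor:.1f}")
print(f"silicon lambda_D: {lambda_silicon * 1e9:.2f} nm")
print(f"metal lambda_D  : {lambda_metal * 1e12:.1f} pm")
```

With these inputs, the silicon value comes out in the nanometer range and the metal value in the range of ten picometers, in line with the estimates quoted in the text.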
(iii) Quantum statistics. Actually, the last estimate is not valid for good metals (and highly doped
semiconductors) for one more reason: their free electrons obey the quantum (Fermi-Dirac) statistics
rather than the Boltzmann distribution (4).$^{10}$ As a result, at all realistic temperatures, the electrons form a
degenerate quantum gas, occupying all available energy states below some energy level $E_F \gg k_BT$,
called the Fermi energy. In these conditions, the screening of a relatively low electric field may be
described by replacing Eq. (5) with
2.2. Capacitance
Let us start using the macroscopic model from systems consisting of charged conductors only,
with no so-called stand-alone charges in the free space outside them.12 Our goal here is to calculate the
10 See, e.g., SM Sec. 2.8. For a more detailed derivation of Eq. (10), see SM Chapter 3.
11 See, e.g., SM Sec. 3.3.
distributions of the electric field $\mathbf{E}$ and potential $\phi$ in space, and the distribution of the surface charge
density $\sigma$ over the conductor surfaces. However, before doing that for particular situations, let us see if
there are any integral measures of these distributions, which should be our primary focus.
The simplest case is of course a single conductor in the otherwise free space. According to Eq.
(1b), all its volume should have the same electrostatic potential $\phi$, evidently providing one convenient
global measure of the situation. Another integral measure is provided by the total charge
$$Q = \int_V\rho\,d^3r = \oint_S\sigma\,d^2r, \tag{2.11}$$
where the last integral is extended over the whole surface $S$ of the conductor. In the general case, what
can we tell about the relation between $Q$ and $\phi$? At $Q = 0$, there is no electric field in the system, and it is
natural (though not absolutely necessary) to select the arbitrary constant in the electrostatic potential to
have $\phi = 0$ everywhere. Then, if the conductor is charged with a non-zero $Q$, according to the linear Eq.
(1.7), the electric field at any point of space has to be proportional to that charge. Hence the electrostatic
potential at all points, including its value inside the conductor, is also proportional to $Q$:
$$\phi = pQ. \tag{2.12}$$
The proportionality coefficient $p$, which depends on the conductor’s size and shape, but on neither $\phi$ nor
$Q$, is called its reciprocal capacitance (or, not too often, “electric elastance”). Usually, Eq. (12) is
rewritten in a different form,
Self-capacitance
$$Q = C\phi, \quad \text{with } C \equiv \frac{1}{p}, \tag{2.13}$$
where C is called self-capacitance. (Frequently, C is called just capacitance, but as we will see very
soon, for more complex situations the latter term may be ambiguous.)
Before calculating C for particular geometries, let us have a look at the electrostatic energy U of
a single conductor. To calculate it, of the several relations discussed in Chapter 1, Eq. (1.61) is most
convenient, because all elementary charges qk are now parts of the conductor charge, and hence reside at
the same potential – see Eq. (1b) again. As a result, the equality becomes very simple:
$$U = \frac{1}{2}\,\phi\sum_k q_k = \frac{1}{2}\,\phi Q. \tag{2.14}$$
Moreover, using the linear relation (13), the same result may be re-written in two more forms:
Electrostatic energy
$$U = \frac{Q^2}{2C} = \frac{C}{2}\,\phi^2. \tag{2.15}$$
We will discuss several ways to calculate $C$ in the next sections, and right now will have a quick
look at just the simplest example for which we have calculated everything necessary in the previous
chapter: a conducting sphere of radius R. Indeed, we already know the electric field distribution:
according to Eq. (1), E = 0 inside the sphere, while Eq. (1.19), with Q(r) = Q, describes the field
distribution outside it, because of the evident spherical symmetry of the surface charge distribution.
12 In some texts, these charges are called “free”. This term is somewhat misleading, because they may well be
bound, i.e. unable to move freely.
Moreover, since the latter formula is exactly the same as for the point charge placed in the sphere’s
center, the potential’s distribution in space may be obtained from Eq. (1.35) by replacing q with the
sphere’s full charge Q. Hence, on the surface of the sphere (and, according to Eq. (1b), through its
interior),
$$\phi = \frac{1}{4\pi\varepsilon_0}\,\frac{Q}{R} . \qquad (2.16)$$
Comparing this result with the definition (13), for the sphere’s self-capacitance we obtain a very simple
formula13
C of a sphere:
$$C = 4\pi\varepsilon_0 R . \qquad (2.17)$$
This formula, which should be well familiar to the reader, is convenient for getting a feeling of
how large the SI unit of capacitance (1 farad, abbreviated as F) is: the self-capacitance of Earth (R_E ≈
6.37×10⁶ m) is below 1 mF! Another important note is that while Eq. (17) is not exactly valid for a
conductor of arbitrary shape, it implies an important general estimate
$$C \sim 2\pi\varepsilon_0 a \qquad (2.18)$$
where a is the scale of the linear size of any conductor.14
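The scale set by Eq. (17) is easy to check numerically. The following is a minimal sketch, not part of the original derivation: the function name and the rounded value of the Earth's mean radius (6.37×10⁶ m) are assumptions made for this illustration only.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def sphere_self_capacitance(radius_m):
    """Self-capacitance C = 4*pi*eps0*R of a conducting sphere, Eq. (2.17)."""
    return 4.0 * math.pi * EPS0 * radius_m


C_earth = sphere_self_capacitance(6.37e6)  # assumed mean radius of the Earth, m
C_ball = sphere_self_capacitance(1.0e-2)   # metallic ball with a 1-cm radius

print(f"Earth:     {C_earth * 1e3:.2f} mF")   # Earth:     0.71 mF
print(f"1-cm ball: {C_ball * 1e12:.2f} pF")   # 1-cm ball: 1.11 pF
```

The second number shows why the picofarad is a natural unit for human-scale conductors.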
Now proceeding to a system of two arbitrary conductors, we immediately see why we should be
careful with the capacitance definition: one constant C is insufficient to describe all electrostatic
properties of such a system. Indeed, here we have two, generally different conductor potentials, φ1 and
φ2, that may depend on both conductor charges, Q1 and Q2. Using the same arguments as for the
single-conductor case, we may conclude that the dependence is always linear:
$$\begin{aligned} \phi_1 &= p_{11}Q_1 + p_{12}Q_2 ,\\ \phi_2 &= p_{21}Q_1 + p_{22}Q_2 , \end{aligned} \qquad (2.19)$$
but now has to be described by more than one coefficient. Actually, it turns out that there are three
rather than four different coefficients in these relations, because
$$p_{12} = p_{21} . \qquad (2.20)$$
This equality may be proved in several ways, for example, using the general reciprocity theorem of
electrostatics (whose proof was the subject of Problem 1.17):
$$\int \rho_1(\mathbf{r})\,\phi^{(2)}(\mathbf{r})\,d^3r = \int \rho_2(\mathbf{r})\,\phi^{(1)}(\mathbf{r})\,d^3r , \qquad (2.21)$$
13 In the Gaussian units, using the standard replacement 4πε0 → 1, this relation takes an even simpler form: C =
R, very easy to remember. Generally, in the Gaussian units (but not in the SI system!) the capacitance has the
dimensionality of length, i.e. is measured in centimeters. Note also that a fractional SI unit, 1 picofarad (10⁻¹² F),
is very close to the Gaussian unit: 1 pF = [(1×10⁻¹²)/(4πε0×10⁻²)] cm ≈ 0.8988 cm. So, 1 pF is close to the
capacitance of a metallic ball with a 1-cm radius, making this unit very convenient for human-scale systems.
14 These arguments are somewhat insufficient to say which size should be used for a in the case of narrow,
extended conductors, e.g., a thin, long wire. Very soon we will see that in such cases the electrostatic energy, and
hence C, depends mostly on the larger size of the conductor.
where φ^(1)(r) and φ^(2)(r) are the potential distributions induced, respectively, by two electric charge
distributions, ρ1(r) and ρ2(r). In our current case, each of these integrals is limited to the volume (or,
more exactly, the surface) of the corresponding conductor, where each potential is constant and may be
taken out of the integral. As a result, Eq. (21) is reduced to
$$Q_1\,\phi^{(2)}(\mathbf{r}_1) = Q_2\,\phi^{(1)}(\mathbf{r}_2) . \qquad (2.22)$$
In terms of Eq. (19), φ^(2)(r1) is just p12Q2, while φ^(1)(r2) equals p21Q1. Plugging these expressions into Eq.
(22), and canceling the product Q1Q2, we arrive at Eq. (20).
Hence the 2×2 matrix of coefficients p_jj’ (called the reciprocal capacitance matrix) is always
symmetric, and using the natural notation p11 ≡ p1, p22 ≡ p2, p12 = p21 ≡ p, we may rewrite it in a simpler
form:
$$\begin{pmatrix} p_1 & p \\ p & p_2 \end{pmatrix} . \qquad (2.23)$$
Plugging the relation (19), in this new notation, into Eq. (1.61), we see that the full electrostatic energy
of the system may be expressed as a quadratic form of its charges:
$$U = \frac{p_1}{2}Q_1^2 + pQ_1Q_2 + \frac{p_2}{2}Q_2^2 . \qquad (2.24)$$
It is evident that the middle term on the right-hand side of this equality describes the electrostatic
coupling of the conductors. (Without it, the energy would be just a sum of two independent electrostatic
energies of conductors 1 and 2.)15 Still, even with this simplification, Eqs. (19) and (20) show that in the
general case of arbitrary charges Q1 and Q2, the system of two conductors should be characterized by
three, rather than just one coefficient (“the capacitance”). This is why we may attribute a single
capacitance to the system only in some particular cases.
For practice, the most important of them is when the system as a whole is electrically neutral:
Q1 = –Q2 ≡ Q. In this case, the most important function of Q is the difference between the conductors’
potentials, called the voltage:16
Voltage (definition):
$$V \equiv \phi_1 - \phi_2 , \qquad (2.25)$$
15 This is why systems with p << p1, p2 are called weakly coupled, and may be analyzed using approximate
methods – see, e.g., Fig. 4 and its discussion below.
16 A word of caution: in condensed matter physics and electrical engineering, voltage is most commonly defined
as the difference between electrochemical rather than electrostatic potentials. These two notions coincide if the
conductors have equal workfunctions – for example, if they are made of the same material. In this course, this
condition will be implied, and the difference between the two voltages ignored – to be discussed in detail in SM
Sec. 6.3.
the electrostatic energy of the system. Indeed, plugging Eqs. (19) and (20) into Eq. (24), we see that
both forms of Eq. (15) are reproduced if φ is replaced with V, Q1 with Q, and with C meaning the mutual
capacitance:
Capacitor’s energy:
$$U = \frac{Q^2}{2C} = \frac{C}{2}V^2 . \qquad (2.27)$$
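This statement may be checked in two lines. The sketch below is an added illustration; it assumes that the definition (26) of the mutual capacitance has the form C = 1/(p1 + p2 – 2p), which is what Eqs. (19), (20), and (25) give for Q1 = –Q2 ≡ Q in the notation of Eq. (23):

```latex
\begin{aligned}
V &\equiv \phi_1 - \phi_2 = (p_1 Q - pQ) - (pQ - p_2 Q) = (p_1 + p_2 - 2p)\,Q \equiv \frac{Q}{C},\\
U &= \frac{p_1}{2}Q^2 - pQ^2 + \frac{p_2}{2}Q^2 = \frac{p_1 + p_2 - 2p}{2}\,Q^2 = \frac{Q^2}{2C} = \frac{C}{2}V^2 .
\end{aligned}
```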
The best-known system for which the mutual capacitance C may be readily calculated is the
plane (or “parallel-plate”) capacitor: a system of two conductors separated with a narrow plane gap of a
constant thickness d and an area A ~ a² >> d² – see Fig. 3.
Fig. 2.3. Plane capacitor – schematically.
Since the surface charges that contribute to the opposite charges ±Q of the conductors of this
system attract each other, in the limit d << a they sit entirely on the opposite surfaces limiting the gap,
so there is virtually no electric field outside of the gap, while (according to the discussion in Sec. 1)
inside the gap it is normal to the surfaces. According to Eq. (3), the magnitude of this field is E = σ/ε0.
Integrating this field across thickness d of the narrow gap, we get V ≡ φ1 – φ2 = Ed = σd/ε0, so σ = ε0V/d.
However, due to the constancy of the potential of each electrode, V should not depend on the position in
the gap area. As a result, σ should be also constant over all the gap area A, regardless of the external
geometry of the conductors (see Fig. 3 again), and hence Q = σA = ε0AV/d. Thus we may write V = Q/C,
with
C of a plane capacitor:
$$C = \frac{\varepsilon_0 A}{d} . \qquad (2.28)$$
Let me offer a few comments on this well-known formula. First, it is valid even if the gap is not
quite planar – for example, if it gently curves on a scale much larger than d, but retains its thickness.
Second, Eq. (28), which is valid only if A ~ a² is much larger than d², ignores the nonuniform electric
fields spreading to distances ~d beyond the gap edges. Such fringe fields result in an additional stray
capacitance C’ ~ ε0a << C ~ ε0a(a/d).17 Finally, the same condition (A >> d²) assures that C is much
larger than the self-capacitance Cj of each conductor – see Eq. (18).
The opportunities opened by the last fact for electronic engineering and experimental physics
practice are rather astonishing. For example, a very realistic 3-nm layer of high-quality aluminum oxide,
which may provide nearly perfect electric insulation between two thin conducting films, with an area of
0.1 m² (a typical area of silicon wafers used in the semiconductor industry) provides C ~ 1 mF,18 larger
than the self-capacitance of the whole planet Earth!
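For the reader who wants to reproduce this estimate, here is a minimal numerical sketch. The function name and the `kappa` parameter (the dielectric factor mentioned in the footnote, taken here as 10) are assumptions of this illustration.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def plane_capacitance(area_m2, gap_m, kappa=1.0):
    """Eq. (2.28), with an assumed dielectric factor kappa in the numerator."""
    return kappa * EPS0 * area_m2 / gap_m


# 3-nm aluminum oxide layer (kappa ~ 10, assumed) between films of area 0.1 m^2
C = plane_capacitance(area_m2=0.1, gap_m=3e-9, kappa=10.0)
print(f"C = {C * 1e3:.1f} mF")  # C = 3.0 mF
```

The result, a few millifarads, agrees with the order-of-magnitude estimate above and exceeds the self-capacitance of the Earth.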
17 The exact value of C’ depends on the shape of the conductors. In a rare case when it has been calculated
analytically, two thin round concentric disks of radius R, C’ = ε0R[ln(16πR/d) – 1].
18 Just as in Sec. 1, for the estimate to be realistic, I took into account the additional factor (for aluminum oxide,
close to 10) which should be included in the numerator of Eq. (28) to make it applicable to dielectrics – see
Chapter 3 below.
In a plane capacitor with d << a, the electrostatic coupling of the two conductors is evidently
very strong. As an opposite example of a weakly coupled system, let us consider two conducting spheres
of the same radius R, separated by a much larger distance d (Fig. 4).
Fig. 2.4. A system of two far-separated, similar conducting spheres.
In this case, the diagonal components of the matrix (23) may be approximately found from Eq.
(16), i.e. by neglecting the coupling altogether:
$$p_1 = p_2 \approx \frac{1}{4\pi\varepsilon_0 R} . \qquad (2.29)$$
Now, if we had just one sphere (say, number 1), the electric potential at distance d from its center would
be given by Eq. (16): φ = Q1/4πε0d. If we move to this point a small (R << d) sphere without its own
charge, we may expect that its potential should not be too far from this result, so φ2 ≈ Q1/4πε0d.
Comparing this expression with the second of Eqs. (19) (taken for Q2 = 0), we get
$$p \approx \frac{1}{4\pi\varepsilon_0 d} \ll p_{1,2} . \qquad (2.30)$$
From here and Eq. (26), the mutual capacitance
$$C \approx \frac{1}{p_1 + p_2} \approx 2\pi\varepsilon_0 R . \qquad (2.31)$$
We see that (somewhat counter-intuitively), in this limit C does not depend substantially on the distance
between the spheres, i.e. does not describe their electrostatic coupling. The off-diagonal coefficients of
the reciprocal capacitance matrix (23) play this role much better – see Eq. (30).
Now let us consider the case when only one conductor of the two is charged, for example, Q1 ≡
Q, while Q2 = 0. Then Eqs. (19)-(20) yield
$$\phi_1 = p_1 Q_1 . \qquad (2.32)$$
Now, we may follow Eq. (13) and define C1 ≡ 1/p1 (and C2 ≡ 1/p2), just to see that such partial
capacitances of the conductors of the system differ from its mutual capacitance C – cf. Eq. (26). For
example, in the case shown in Fig. 4, C1 = C2 ≈ 4πε0R ≈ 2C.
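The weak dependence of the mutual capacitance on distance can be seen numerically. The sketch below is an added illustration under stated assumptions: it uses the approximate coefficients (29) and (30), and assumes that Eq. (26) reads C = 1/(p1 + p2 – 2p); the function name and the numbers are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def two_sphere_capacitances(R, d):
    """Far-separated equal spheres (d >> R), using approximations (2.29)-(2.30)."""
    p1 = p2 = 1.0 / (4.0 * math.pi * EPS0 * R)  # Eq. (2.29)
    p = 1.0 / (4.0 * math.pi * EPS0 * d)        # Eq. (2.30)
    C_mutual = 1.0 / (p1 + p2 - 2.0 * p)        # assumed form of Eq. (2.26)
    C_partial = 1.0 / p1                        # C1 = C2 = 1/p1
    return C_mutual, C_partial


R = 0.01  # sphere radius, m (illustrative)
for d in (0.1, 1.0, 10.0):
    C, C1 = two_sphere_capacitances(R, d)
    ratio = C / (2.0 * math.pi * EPS0 * R)
    print(f"d = {d:5.1f} m: C/(2 pi eps0 R) = {ratio:.4f}, C1/C = {C1 / C:.4f}")
```

As d grows, the first ratio tends to 1 and the second to 2, in agreement with Eq. (31) and the estimate C1 = C2 ≈ 2C.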
Finally, let us consider one more frequent case when one of the conductors carries a certain
charge (say, Q1 = Q), but the potential of its counterpart is sustained constant, say φ2 = 0 (see footnote 19). (This
condition is especially easy to implement if the second conductor is much larger than the first one.
Indeed, as the estimate (18) shows, in this case, it would take a much larger charge Q2 to make the
potential φ2 comparable with φ1.) In this case the second of Eqs. (19), with the account of Eq. (20),
yields Q2 = – (p/p2)Q1. Plugging this relation into the first of those equations, we get
19 In electrical engineering, such a constant-potential conductor is called the ground. This term stems from the fact
that in many cases the electrostatic potential of the (weakly) conducting ground at the Earth’s surface is virtually
unaffected by laboratory-scale electric charges.
$$Q_1 = C_1^{\rm ef}\phi_1, \quad \text{with } C_1^{\rm ef} \equiv \left(p_1 - \frac{p^2}{p_2}\right)^{-1} = \frac{p_2}{p_1 p_2 - p^2} . \qquad (2.33)$$
Thus, this effective capacitance of the first conductor is generally different from both its partial
capacitance C1 and the mutual capacitance C of the system, emphasizing again how careful one should
be when using the term “capacitance” without a qualifier.
Note also that none of these capacitances is equal to any element of the matrix reciprocal to the
matrix (23):
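$$\begin{pmatrix} p_1 & p \\ p & p_2 \end{pmatrix}^{-1} = \frac{1}{p_1 p_2 - p^2}\begin{pmatrix} p_2 & -p \\ -p & p_1 \end{pmatrix} . \qquad (2.34)$$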
For this reason, this physical capacitance matrix, which expresses the vector of conductor
charges via the vector of their potentials, is less convenient for most applications than the reciprocal
capacitance matrix (23). The same conclusion is valid for multi-conductor systems, which are most
conveniently characterized by an evident generalization of Eq. (19). Indeed, in this case, even the
mutual capacitance between two selected conductors may depend on the electrostatic conditions of other
components of the system.
Logically, at this point I would need to discuss the particular, but practically very important case
when the regions where the electric field between each pair of conductors is most significant do not
overlap – such as in the example shown in Fig. 5a. In this case, the system’s properties may be discussed
using the equivalent-circuit language, representing each such region as a lumped (localized) capacitor,
with a certain mutual capacitance C, and the whole system as some connection of these capacitors by
conducting “wires”, whose length and geometry are not important – see Fig. 5b.
Fig. 2.5. (a) A simple system of conductors, with three well-localized regions of high electric
field (and hence surface charge) concentration, and (b) its representation with an equivalent
circuit of three lumped capacitors.
Since the analysis of such equivalent circuits is covered in typical introductory physics courses, I
will save time by skipping their discussion. However, since such circuits are very frequently met in
physical experiment and electrical engineering practice, I would urge the reader to self-test their
understanding of this topic by solving a couple of problems offered at the end of this chapter,20 and if
their solution presents any difficulty, review the corresponding section in an undergraduate textbook.
20 These problems have been selected to emphasize the fact that not every circuit may be reduced to the simplest
connections of the component capacitors and/or their groups in parallel and/or in series.
where Sk is the surface of the kth conductor of the system. After this boundary problem has been solved,
i.e. the spatial distribution φ(r) has been found at all points outside the conductors, it is straightforward
to use Eq. (3) to find the surface charge density σ, and finally the total charge
$$Q_k = \oint_{S_k} \sigma\, d^2r \qquad (2.36)$$
of each conductor, and hence any component of the reciprocal capacitance matrix. As an illustration, let
us implement this program for three very simple problems.
(i) Plane capacitor (Fig. 3). In this case, the easiest way to solve the Laplace equation is to use
the linear (Cartesian) coordinates with one axis (say, z) normal to the conductor surfaces – see Fig. 6.
Fig. 2.6. The plane capacitor as the system for the simplest illustration of the boundary problem
(35) and its solution.
In these coordinates, the Laplace operator is just the sum of three second derivatives.21 It is
evident that due to the problem’s translational symmetry within the [x, y] plane, deep inside the gap (i.e.
at any lateral distance from the edges much larger than d) the electrostatic potential may only depend on
the coordinate normal to the gap surfaces: φ(r) = φ(z). For such a function, the derivatives over x and y
vanish, and the boundary problem (35) is reduced to a very simple ordinary differential equation
$$\frac{d^2\phi}{dz^2}(z) = 0, \qquad (2.37)$$
with boundary conditions
$$\phi(0) = 0, \quad \phi(d) = V . \qquad (2.38)$$
(For the sake of notation simplicity, I have used the discretion of adding a constant to the potential, to
make one of the potentials vanish, and also the definition (25) of the voltage V.) The general solution of
Eq. (37) is a linear function: φ(z) = c1z + c2, whose constant coefficients c1,2 may be readily found from
the boundary conditions (38). The final solution is
$$\phi = V\,\frac{z}{d} . \qquad (2.39)$$
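The same boundary problem may also be solved numerically, which is a useful sanity check before turning to geometries without analytical solutions. The following is a minimal sketch of a Gauss-Seidel relaxation on a uniform grid; this numerical method, the function name, and the grid parameters are assumptions of this illustration, not something used in the text.

```python
def relax_1d(V=1.0, n=21, sweeps=5000):
    """Gauss-Seidel relaxation of d2(phi)/dz2 = 0 with phi(0) = 0, phi(d) = V."""
    phi = [0.0] * n
    phi[-1] = V  # boundary conditions (2.38); phi[0] stays equal to 0
    for _ in range(sweeps):
        for i in range(1, n - 1):
            # discrete Laplace equation: each interior value is the mean of its neighbors
            phi[i] = 0.5 * (phi[i - 1] + phi[i + 1])
    return phi


phi = relax_1d()
# compare with Eq. (2.39), phi = V z / d, at the grid points z_i = i d / (n - 1), for V = 1
max_dev = max(abs(p - i / (len(phi) - 1)) for i, p in enumerate(phi))
print(f"max deviation from V z/d: {max_dev:.2e}")
```

The relaxed values converge to the linear function (39) within rounding errors.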
(ii) Coaxial-cable capacitor. Coaxial cable is a system of two round cylindrical, coaxial
conductors, with the cross-section shown in Fig. 7.
[Fig. 2.7: cross-section of the coaxial cable, with the conductor radii a and b.]
Evidently, in this case, the cylindrical coordinates {ρ, ϕ, z} (with ϕ denoting the azimuthal angle, not the
potential φ), with the z-axis coinciding with the common axis of the cylinders, are most convenient.22 Due
to the axial symmetry of the problem, in these coordinates E(r) = n_ρE(ρ), φ(r) = φ(ρ), so in the general
expression for the Laplace operator23 we may take ∂/∂ϕ = ∂/∂z = 0. As a result, only the radial term of the
operator survives, and the boundary problem (35) takes the form
$$\frac{1}{\rho}\frac{d}{d\rho}\left(\rho\frac{d\phi}{d\rho}\right) = 0, \quad \phi(a) = V, \quad \phi(b) = 0 . \qquad (2.42)$$
The sequential double integration of this ordinary linear differential equation is elementary (and similar
to that of the Poisson equation in spherical coordinates, carried out in Sec. 1.3), giving
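$$\rho\frac{d\phi}{d\rho} = c_1, \quad \phi(\rho) = c_1\int_a^{\rho}\frac{d\rho''}{\rho''} + c_2 = c_1\ln\frac{\rho}{a} + c_2 . \qquad (2.43)$$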
The constants c1,2 may be found using boundary conditions (42):
22 I am sorry for using, for the 2D radius ρ, the same letter as for the volumic density of charge. (Both notations
are too common to refuse.) I do not believe this may lead to confusion, because the letter will not be used in two
different meanings during any particular discussion.
23 See, e.g., MA Eq. (10.3).
$$c_2 = V, \quad c_1\ln\frac{b}{a} + c_2 = 0 , \qquad (2.44)$$
giving c1 = –V/ln(b/a), so Eq. (43) takes the following form:
$$\phi(\rho) = V\left[1 - \frac{\ln(\rho/a)}{\ln(b/a)}\right] . \qquad (2.45)$$
Next, for our axial symmetry, the general expression for the gradient of a scalar function is
reduced to its radial derivative, so
$$E(\rho) = -\frac{d\phi}{d\rho} = \frac{V}{\rho\,\ln(b/a)} . \qquad (2.46)$$
This expression, plugged into Eq. (2), allows us to find the density of the conductors’ surface charge.
For example, for the inner electrode
$$\sigma_a = \varepsilon_0 E(a) = \frac{\varepsilon_0 V}{a\,\ln(b/a)} , \qquad (2.47)$$
so its full charge (per unit length of the system) is
$$\frac{Q}{l} = 2\pi a\,\sigma_a = \frac{2\pi\varepsilon_0 V}{\ln(b/a)} . \qquad (2.48)$$
(It is straightforward to check that the charge of the outer electrode is equal and opposite.) Hence, by
the definition of the mutual capacitance, its value per unit length is
C of a coaxial cable:
$$\frac{C}{l} \equiv \frac{Q}{lV} = \frac{2\pi\varepsilon_0}{\ln(b/a)} . \qquad (2.49)$$
This expression shows that the total capacitance C is proportional to the system’s length l (if l >>
a, b), while being only logarithmically dependent on the dimensions of its cross-section. Since the
logarithm of a large argument is an extremely slow function (sometimes called a quasi-constant), if the
external conductor is made very large (b >> a), the capacitance diverges, but very weakly. Such
logarithmic divergence may be cut by any minuscule additional effect, for example by the finite length l
of the system. This fact yields the following very useful estimate of the self-capacitance of a single
round wire of radius a:
$$C \approx \frac{2\pi\varepsilon_0 l}{\ln(l/a)}, \quad \text{for } l \gg a . \qquad (2.50)$$
On the other hand, if the gap d between the conductors is very narrow: d ≡ b – a << a, then
ln(b/a) ≡ ln(1 + d/a) may be approximated as d/a, and Eq. (49) is reduced to C ≈ 2πε0al/d, i.e. to Eq.
(28) for the plane capacitor, of the appropriate area A = 2πal.
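The narrow-gap limit is easy to verify numerically. Here is a minimal sketch comparing Eq. (49) with the plane-capacitor formula (28); the function name and the chosen radii are assumptions of this illustration.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def coax_capacitance_per_length(a, b):
    """C/l = 2*pi*eps0 / ln(b/a), Eq. (2.49)."""
    return 2.0 * math.pi * EPS0 / math.log(b / a)


a = 1.0e-3  # inner radius, m (illustrative)
for d in (1e-3, 1e-4, 1e-5):  # gap d = b - a
    exact = coax_capacitance_per_length(a, a + d)
    plane = EPS0 * 2.0 * math.pi * a / d  # Eq. (2.28) with A = 2*pi*a*l, per unit length
    print(f"d/a = {d / a:.0e}: exact/plane = {exact / plane:.4f}")
```

The ratio equals (d/a)/ln(1 + d/a) and approaches 1 as the gap narrows.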
(iii) Spherical capacitor. This is a system of two conductors, with a central cross-section similar
to that of the coaxial cable (Fig. 7), but now with spherical rather than axial symmetry. This symmetry
implies that we may be better off using spherical coordinates, so the potential depends only on one of
them: the distance r from the common center of the conductors: φ(r) = φ(r). As we already know from
Sec. 1.3, in this case the general expression for the Laplace operator is reduced to its first (radial) term,
so the Laplace equation takes the simple form (1.47). Moreover, we have already found the general
solution of this equation – see Eq. (1.50):
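$$\phi(r) = \frac{c_1}{r} + c_2 . \qquad (2.51)$$
Now acting exactly as above, i.e. determining the (only essential) constant c1 from the boundary
condition φ(a) – φ(b) = V, we get
$$c_1 = V\left(\frac{1}{a} - \frac{1}{b}\right)^{-1}, \quad \text{so that } \phi(r) = \frac{V}{r}\left(\frac{1}{a} - \frac{1}{b}\right)^{-1} + c_2 . \qquad (2.52)$$
Next, we can use the spherical symmetry to find the electric field, E(r) = n_rE(r), with
$$E(r) = -\frac{d\phi}{dr} = \frac{V}{r^2}\left(\frac{1}{a} - \frac{1}{b}\right)^{-1} , \qquad (2.53)$$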
and hence its values on conductors’ surfaces, and then the surface charge density from Eq. (3). For
example, for the inner conductor’s surface,
$$\sigma_a = \varepsilon_0 E(a) = \varepsilon_0\frac{V}{a^2}\left(\frac{1}{a} - \frac{1}{b}\right)^{-1} , \qquad (2.54)$$
so, finally, for the full charge of that conductor, we get the following result:
$$Q = 4\pi a^2\sigma_a = 4\pi\varepsilon_0\left(\frac{1}{a} - \frac{1}{b}\right)^{-1}V . \qquad (2.55)$$
(Again, the charge of the outer conductor is equal and opposite.) Now we can use the definition (26) of
the mutual capacitance to get the final result:
C of a spherical capacitor:
$$C \equiv \frac{Q}{V} = 4\pi\varepsilon_0\left(\frac{1}{a} - \frac{1}{b}\right)^{-1} = 4\pi\varepsilon_0\frac{ab}{b - a} . \qquad (2.56)$$
For b >> a, it coincides with Eq. (17) for the self-capacitance of the inner conductor. On the
other hand, if the gap d between two conductors is narrow, d ≡ b – a << a, then
$$C = 4\pi\varepsilon_0\frac{a(a + d)}{d} \approx 4\pi\varepsilon_0\frac{a^2}{d} , \qquad (2.57)$$
i.e. the capacitance approaches that of the planar capacitor of the area A = 4πa² – as it should.
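Both limits of Eq. (56) may be checked with a few lines of code. This is a minimal sketch with an assumed function name and illustrative radii:

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def spherical_capacitance(a, b):
    """C = 4*pi*eps0 * a*b / (b - a), Eq. (2.56)."""
    return 4.0 * math.pi * EPS0 * a * b / (b - a)


a = 0.01  # inner radius, m (illustrative)

# b >> a: compare with the self-capacitance (2.17) of the inner sphere
far = spherical_capacitance(a, 1e4 * a) / (4.0 * math.pi * EPS0 * a)

# d << a: compare with the plane capacitor (2.28) of area A = 4*pi*a^2
d = 1e-4 * a
near = spherical_capacitance(a, a + d) / (EPS0 * 4.0 * math.pi * a**2 / d)

print(f"b >> a: C/C_sphere = {far:.4f};  d << a: C/C_plane = {near:.4f}")
```

Both ratios differ from 1 only by small corrections, of the order of a/b and d/a, respectively.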
All this seems (and indeed is) very straightforward, but let us contemplate what was the reason
for such easy successes. In each of the cases (i)-(iii) we have managed to find such coordinates that both
the Laplace equation and the boundary conditions involved only one of them. The necessary condition
for the former fact is for the coordinates to be orthogonal. This means that the three vector components
of the local differential dr, due to small variations of the new coordinates (say, dr, dθ, and dϕ for the
spherical coordinates), are mutually perpendicular.
coordinate systems of this type. As an example, let us calculate the self-capacitance of a thin, round
conducting disk. The cylindrical or spherical coordinates would not give much help here, because while
they have the appropriate axial symmetry, they would make the boundary condition on the disk too
complicated: involving two coordinates, either ρ and z, or r and θ. Help comes from noting that the flat
disk, i.e. the area with z = 0, r < R, may be viewed as the limiting case of an axially-symmetric ellipsoid
(or “degenerate ellipsoid”, or “ellipsoid of rotation”, or “spheroid”) – the surface formed by rotation of
the usual ellipse about one of its major axes – which would be also the symmetry axis of the disk – in
Fig. 8, the z-axis.
Fig. 2.8. Solving the disk’s capacitance problem. (The cross-section of the system by the vertical
plane y = 0.)
Though this expression may look a bit intimidating, let us notice that since in our current
problem, the boundary conditions depend only on α:25
24 For solution of some problems, it is convenient to use Eqs. (59) with –∞ < α < +∞ and 0 ≤ β ≤ π/2.
25 I have called the disk’s potential V, to distinguish it from the potential φ at an arbitrary point of space.
$$\phi\big|_{\alpha = 0} = V, \quad \phi\big|_{\alpha \to \infty} \to 0, \qquad (2.61)$$
there is every reason to assume that the electrostatic potential in all space is a function of α alone; in
other words, that all ellipsoids α = const are the equipotential surfaces. Indeed, acting on such a function
φ(α) by the Laplace operator (60), we see that the two last terms in the square brackets vanish, and the
Laplace equation (35) is reduced to a simple ordinary differential equation
$$\frac{d}{d\alpha}\left(\cosh\alpha\,\frac{d\phi}{d\alpha}\right) = 0. \qquad (2.62)$$
with a singularity at the disk edge. Below we will see that such singularities are very typical for sharp
edges of conductors. Fortunately, in our current case the divergence is integrable, giving a finite disk
charge:
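$$Q = \int_{\text{disk surface}} \sigma\, d^2r = 2\int_0^R \sigma(\rho)\,2\pi\rho\, d\rho = \frac{4}{\pi}\varepsilon_0 V \int_0^R \frac{2\pi\rho\, d\rho}{(R^2 - \rho^2)^{1/2}} = 4\varepsilon_0 V R \int_0^1 \frac{d\xi}{(1 - \xi)^{1/2}} = 8\varepsilon_0 R V . \qquad (2.68)$$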
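The integrability of the edge singularity may be illustrated numerically. The sketch below is an added illustration: it assumes the surface charge density on each side of the disk in the form σ(ρ) = (2/π)ε0V/(R² – ρ²)^(1/2), which is what the integrand of Eq. (68) implies, and uses the substitution ρ = R sin θ (my choice here, not necessarily the one used in the text) to remove the singularity before applying the midpoint rule.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def disk_charge(V, R, n=2000):
    """Midpoint-rule integral of the assumed sigma over both sides of the disk.

    With rho = R sin(theta), the singular integrand becomes regular:
    sigma * 2*pi*rho * d(rho) = 4 * eps0 * V * R * sin(theta) * d(theta).
    """
    h = (math.pi / 2) / n
    one_side = sum(4.0 * EPS0 * V * R * math.sin((i + 0.5) * h) * h for i in range(n))
    return 2.0 * one_side


Q = disk_charge(V=1.0, R=0.1)
print(Q / (8.0 * EPS0 * 0.1 * 1.0))  # ratio to 8*eps0*R*V; close to 1
```

The ratio differs from 1 only by the discretization error, confirming the finite result of Eq. (68).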
26 Let me remind the reader that the term cylindrical describes any surface formed by a translation, along a
straight line, of an arbitrary curve, and hence more general than the usual circular cylinder. (In this terminology,
for example, a prism is also a cylinder of a particular type, formed by a translation of a polygon.)
27 The complex variable z should not be confused with the (real) 3rd spatial coordinate z! We are considering 2D
and similarly for v. This means that the sum of second-order partial derivatives of each of the real
functions u(x, y) and v(x, y) is zero, i.e. that both functions obey the 2D Laplace equation. This
mathematical fact opens a nice way of solving problems of electrostatics for (relatively simple) 2D
geometries. Imagine that for a particular boundary problem we have found a function w(z) for which either
u(x, y) or v(x, y) is constant on all electrode surfaces. Then all lines of constant u (or v) represent
equipotential surfaces, i.e. the problem of the potential distribution has been essentially solved.
As a simple example, let us consider a problem important for practice: the quadrupole
electrostatic lens – a system of four cylindrical electrodes with hyperbolic cross-sections, whose
boundaries are described by the following relations:
$$x^2 - y^2 = \begin{cases} +a^2, & \text{for the left and right electrodes},\\ -a^2, & \text{for the top and bottom electrodes}, \end{cases} \qquad (2.74)$$
Fig. 2.9. (a) The quadrupole electrostatic lens’ cross-section and (b) its conformal mapping.
Comparing these relations with Eqs. (72), we see that each electrode surface corresponds to a
constant value of the real part u(x, y) of the function given by Eq. (71): u = ±a². Moreover, the potentials
of both surfaces with u = +a² are equal to +V/2, while those with u = –a² are equal to –V/2. Hence we
may conjecture that the electrostatic potential at each point is a function of u alone; moreover, a simple
linear function,
$$\phi = c_1 u + c_2 = c_1(x^2 - y^2) + c_2 , \qquad (2.75)$$
is a valid (and hence the unique) solution of our boundary problem. Indeed, it does satisfy the Laplace
equation, while the constants c1,2 may be readily selected in a way to satisfy all the boundary conditions
shown in Fig. 9a:
$$\phi = \frac{V}{2}\,\frac{x^2 - y^2}{a^2} , \qquad (2.76)$$
so the boundary problem has been solved.
According to Eq. (76), all equipotential surfaces are hyperbolic cylinders, similar to those of the
electrode surfaces. What remains is to find the electric field at an arbitrary point inside the system:
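$$E_x = -\frac{\partial\phi}{\partial x} = -V\frac{x}{a^2}, \quad E_y = -\frac{\partial\phi}{\partial y} = V\frac{y}{a^2} . \qquad (2.77)$$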
These formulas show, in particular, that if charged particles (e.g., electrons in an electron-optics system)
are launched to fly ballistically through such a lens, along the z-axis, they experience a force pushing
them toward the symmetry axis and proportional to the particle’s deviation from the axis (and thus
equivalent in action to an optical lens with a positive refractive power) in one direction, and a force
pushing them out (negative refractive power) in the perpendicular direction. One can show that letting
the particles fly through several such lenses, with alternating voltage polarities, in series, enables beam
focusing.30
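The properties of the solution (76)-(77) discussed above may be verified with a short script. This is a minimal sketch; the function names, the finite-difference check of the Laplace equation, and the test points are assumptions of this illustration.

```python
def phi(x, y, V=1.0, a=1.0):
    """Potential (2.76) inside the quadrupole lens."""
    return 0.5 * V * (x * x - y * y) / (a * a)


def field(x, y, V=1.0, a=1.0):
    """Field components (2.77): E = -grad(phi)."""
    return -V * x / (a * a), V * y / (a * a)


def laplacian(f, x, y, h=1e-3):
    """Five-point finite-difference estimate of the 2D Laplacian of f."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4.0 * f(x, y)) / (h * h)


print(phi(1.25, 0.75))            # point with x^2 - y^2 = a^2 (right electrode): +V/2
print(laplacian(phi, 0.3, -0.2))  # zero, up to rounding errors
print(field(0.1, 0.0))            # E_x opposes the displacement along x
print(field(0.0, 0.1))            # E_y is directed along the displacement in y
```

For a particle with qV > 0, the force qE is thus restoring along x and repelling along y, and both components grow linearly with the deviation from the axis.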
Hence, we have reduced the 2D Laplace boundary problem to that of finding the proper analytic
function w(z). This task may be also understood as that of finding a conformal map, i.e. a
correspondence between components of any point pair, {x, y} and {u, v}, residing, respectively, on the
initial Cartesian plane z and the plane w of the new variables. For example, Eq. (71) maps the real
electrode configuration onto a plane capacitor of an infinite area (Fig. 9b), and the simplicity of Eq. (75)
is due to the fact that for the latter system the equipotential surfaces are just parallel planes u = const.
For more complex geometries, the suitable analytic function w(z) may be hard to find. However,
for conductors with piecewise-linear cross-section boundaries, substantial help may be obtained from the
following Schwarz-Christoffel integral
$$w(z) = \text{const} \times \int \frac{dz}{(z - x_1)^{k_1}(z - x_2)^{k_2}\cdots(z - x_{N-1})^{k_{N-1}}} , \qquad (2.78)$$
that provides a conformal mapping of the interior of an arbitrary N-sided polygon on the plane w = u
+ iv, onto the upper half (y > 0) of the plane z = x + iy. In Eq. (78), xj (j = 1, 2, ..., N – 1) are the points of
the y = 0 axis (i.e., of the boundary of the mapped region on plane z) to which the corresponding
polygon vertices are mapped, while kj are the exterior angles at the polygon vertices, measured in the
units of π, with –1 ≤ kj ≤ +1 – see Fig. 10.31 Of the points xj, two may be selected arbitrarily (because
their effects may be compensated by the multiplicative constant in Eq. (78), and the additive constant of
integration), while all the others have to be adjusted to provide the correct mapping.
Fig. 2.10. The Schwarz-Christoffel mapping of a polygon’s interior onto the upper half-plane.
30 See, e.g., textbook by P. Grivet, Electron Optics, 2nd ed., Pergamon, 1972.
31 The integral (78) includes only (N – 1) rather than N poles because a polygon’s shape is fully determined by (N
– 1) positions wj of its vertices and (N – 1) angles kj. In particular, since the algebraic sum of all external angles
of a polygon equals 2π, the last angle parameter kj = kN is uniquely determined by the set of the previous ones.
In the general case, the complex integral (78) may be hard to tackle. However, in some important
cases, in particular those with right angles (kj = ½) and/or with some points wj at infinity, the integrals
may be readily worked out, giving explicit analytical expressions for the mapping functions w(z). For
example, let us consider a semi-infinite strip defined by restrictions –1 ≤ u ≤ +1 and 0 ≤ v, on the
w-plane – see the left panel of Fig. 11.
[Fig. 2.11: plane w (left panel) and plane z (right panel).]
The strip may be considered as a triangle, with one vertex at the infinitely distant vertical point
w3 = 0 + i∞. Let us map the polygon onto the upper half of plane z, shown on the right panel of Fig. 11,
with the vertex w1 = –1 + i 0 mapped onto the point z1 = –1 + i 0, and the vertex w2 = +1 + i 0 mapped
onto the point z2 = +1 + i 0. Since the external angles at these vertices are equal to +π/2, and hence k1 =
k2 = +½, Eq. (78) is reduced to
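$$w(z) = \text{const} \times \int \frac{dz}{(z + 1)^{1/2}(z - 1)^{1/2}} = \text{const} \times \int \frac{dz}{(z^2 - 1)^{1/2}} \equiv \text{const} \times i\int \frac{dz}{(1 - z^2)^{1/2}} . \qquad (2.79)$$
This complex integral may be worked out, just as for real z, with the substitution z = sin ξ, giving
$$w(z) = \text{const}' \times \int^{\sin^{-1}z} d\xi = c_1\sin^{-1}z + c_2 . \qquad (2.80)$$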
Determining the constants c1,2 from the required mapping, i.e. from the conditions w(-1 + i 0) = –1 + i 0
and w(+1+ i 0)= +1+ i 0 (see the arrows in Fig. 11), we finally get32
$$w(z) = \frac{2}{\pi}\sin^{-1}z, \quad \text{i.e. } z = \sin\frac{\pi w}{2} . \qquad (2.81a)$$
Using the well-known expression for the sine of a complex argument,33 we may rewrite this elegant
result in either of the following two forms for the real and imaginary components of z and w:
$$u = \frac{2}{\pi}\sin^{-1}\frac{2x}{[(x+1)^2 + y^2]^{1/2} + [(x-1)^2 + y^2]^{1/2}}, \quad v = \frac{2}{\pi}\cosh^{-1}\frac{[(x+1)^2 + y^2]^{1/2} + [(x-1)^2 + y^2]^{1/2}}{2},$$
32 Note that this function differs only by a linear transformation of variables from the function z = c cosh w, which
is the canonical form of the definition of the so-called elliptic (not ellipsoidal!) orthogonal coordinates.
33 See, e.g., MA Eq. (3.5).
$$x = \sin\frac{\pi u}{2}\cosh\frac{\pi v}{2}, \quad y = \cos\frac{\pi u}{2}\sinh\frac{\pi v}{2} . \qquad (2.81b)$$
It is amazing how perfectly the last formula manages to keep y ≡ 0 at the different borders of our
w-region (Fig. 11): at its side borders (u = ±1, 0 ≤ v < ∞), this is performed by the first multiplier, while at
the bottom border (–1 ≤ u ≤ +1, v = 0), the equality is enforced by the second multiplier.
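These properties of the mapping are easy to confirm numerically. The following minimal sketch uses the standard `cmath` module; the function names are assumptions of this illustration, and the clamping of the arguments only guards against rounding errors on the borders.

```python
import cmath
import math


def z_of_w(w):
    """Map (2.81a): z = sin(pi * w / 2)."""
    return cmath.sin(math.pi * w / 2)


def w_of_xy(x, y):
    """Inverse map: u and v from the first form of Eqs. (2.81b)."""
    s = math.hypot(x + 1, y) + math.hypot(x - 1, y)
    u = (2 / math.pi) * math.asin(max(-1.0, min(1.0, 2 * x / s)))
    v = (2 / math.pi) * math.acosh(max(1.0, s / 2))
    return complex(u, v)


# Points on the three borders of the strip are mapped onto the real axis, y = 0
# (up to rounding errors):
for w in (complex(-1, 0.7), complex(0.4, 0.0), complex(1, 2.0)):
    print(w, "->", z_of_w(w))

# Round trip for an interior point of the strip:
w0 = complex(0.3, 0.5)
z0 = z_of_w(w0)
print(abs(w_of_xy(z0.real, z0.imag) - w0))  # zero, up to rounding errors
```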
This mapping may be used to solve several electrostatics problems with the geometry shown in
Fig. 11a; probably the most surprising of them is the following one. A straight gap of width 2t is cut in a
very thin conducting plane, and voltage V is applied between the resulting half-planes – see the bold
straight lines in Fig. 12.
Fig. 2.12. The equipotential surfaces of the electric field between two thin conducting semi-planes
(or rather their cross-sections by the plane z = const).
Selecting a Cartesian coordinate system with the z-axis directed along the cut, the y-axis normal
to the plane, and the origin in the middle of the cut (Fig. 12), we can write the boundary conditions of
this Laplace problem as
$$ \phi = \begin{cases} +V/2, & \text{for } x > +t,\ y = 0, \\ -V/2, & \text{for } x < -t,\ y = 0. \end{cases} \tag{2.82} $$
(Due to the problem’s symmetry, we may expect that in the middle of the gap, i.e. at –t < x < +t and y =
0, the electric field is parallel to the plane and hence ∂ϕ/∂y = 0.) The comparison of Figs. 11 and 12
shows that if we normalize our coordinates {x, y} to t, Eqs. (81) provide the conformal mapping of our
system on the plane z to a plane capacitor on the plane w, with the voltage V between two conducting
planes located at u = ±1. Since we already know that in that case ϕ = (V/2)u, we may immediately use
the first of Eqs. (81b) to write the final solution of the problem:34
$$ \phi = \frac{V}{2}u = \frac{V}{\pi}\sin^{-1}\frac{2x}{\left[(x+t)^2+y^2\right]^{1/2}+\left[(x-t)^2+y^2\right]^{1/2}}. \tag{2.83} $$
The thin lines in Fig. 12 show the corresponding equipotential surfaces;35 it is evident that the
electric field concentrates at the gap edges, just as it did at the edge of the thin disk (Fig. 8). Let me
34This result may be also obtained by the Green’s function method, to be discussed in Sec. 10 below.
35Another graphical representation of the electric field distribution, by field lines, is less convenient. (It is more
useful for the magnetic field, which may be represented by a scalar potential only in particular cases, so there is
no surprise that the field lines were introduced only by Michael Faraday in the 1830s.) As a reminder, the field
leave the remaining calculation of the surface charge distribution and the mutual capacitance between
the half-planes (per unit length of the system in the z-direction) for the reader’s exercise.
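The solution (83) is easy to explore numerically. The sketch below is my illustration only (the function names are hypothetical, and the defaults t = 1, V = 1 are arbitrary units). It evaluates the potential and uses a finite difference to show how the field in the gap grows toward the gap edges:

```python
import math

def split_plane_potential(x, y, t=1.0, V=1.0):
    """Eq. (2.83): potential of two half-planes at +V/2 (x > t) and -V/2 (x < -t)."""
    s = math.hypot(x + t, y) + math.hypot(x - t, y)
    ratio = max(-1.0, min(1.0, 2 * x / s))   # guard against rounding slightly beyond +-1
    return (V / math.pi) * math.asin(ratio)

def gap_field_x(x, t=1.0, V=1.0, h=1e-6):
    """E_x = -d(phi)/dx inside the gap (y = 0, |x| < t), by a central difference."""
    return -(split_plane_potential(x + h, 0.0, t, V)
             - split_plane_potential(x - h, 0.0, t, V)) / (2 * h)

phi_right = split_plane_potential(3.7, 0.0)    # on the right half-plane: +V/2
phi_left = split_plane_potential(-3.7, 0.0)    # on the left half-plane: -V/2
phi_axis = split_plane_potential(0.0, 2.5)     # on the symmetry plane x = 0: zero
field_mid = abs(gap_field_x(0.0))              # field in the middle of the gap
field_edge = abs(gap_field_x(0.99))            # field close to the gap edge

print(phi_right, phi_left, phi_axis, field_mid, field_edge)
```

At y = 0 inside the gap, Eq. (83) reduces to ϕ = (V/π) sin⁻¹(x/t), so the field magnitude there is (V/π)/(t² – x²)^{1/2}, which is what the finite-difference estimate should reproduce.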
$$ \phi = \sum_k c_k \phi_k , \tag{2.84} $$
where each function ϕ_k satisfies the Laplace equation, and then select the set of coefficients c_k to satisfy
the boundary conditions. More specifically, in the variable separation method, the partial solutions ϕ_k
are looked for in the form of a product of functions, each depending on just one spatial coordinate.
Let us discuss this approach on the classical example of a rectangular box with conducting walls
(Fig. 13), with the same potential (that I will take for zero) at all its sidewalls and the lower lid, but a
different potential V at the top lid (z = c). Moreover, to demonstrate the power of the variable separation
method, let us carry out all the calculations for a more general case when the top lid’s potential is an
arbitrary 2D function V(x, y).37
[Fig. 2.13: the rectangular box, with the potential V(x, y) applied to its top lid (z = c).]
For this geometry, it is natural to use the Cartesian coordinates {x, y, z}, representing each of the
partial solutions in Eq. (84) as the following product
line is the curve to which the field vectors are tangential at each point. Hence the electric field lines are always
normal to the equipotential surfaces, so it is always straightforward to sketch them, if desirable, from the
equipotential surface pattern – like the one shown in Fig. 12.
36 This method was already discussed in CM Sec. 6.5 and then used also in Secs. 6.6 and 8.4 of that course.
However, it is so important that I need to repeat its discussion in this part of my series, for the benefit of the
readers who have skipped the Classical Mechanics course for whatever reason.
37 Such voltage distributions may be implemented in practice, for example, using the so-called mosaic electrodes
consisting of many electrically-insulated and individually-biased panels.
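$$ \phi_k = X(x)\,Y(y)\,Z(z). \tag{2.85} $$
Plugging it into the Laplace equation expressed in the Cartesian coordinates,
$$ \frac{\partial^2\phi_k}{\partial x^2}+\frac{\partial^2\phi_k}{\partial y^2}+\frac{\partial^2\phi_k}{\partial z^2}=0, \tag{2.86} $$
and dividing the result by XYZ, we get
$$ \frac{1}{X}\frac{d^2X}{dx^2}+\frac{1}{Y}\frac{d^2Y}{dy^2}+\frac{1}{Z}\frac{d^2Z}{dz^2}=0. \tag{2.87} $$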
Here comes the punch line of the variable separation method: since the first term of this sum may
depend only on x, the second one only on y, etc., Eq. (87) may be satisfied everywhere in the volume
only if each of these terms equals a constant. In a minute we will see that for our current problem (Fig.
13), these constant x- and y-terms have to be negative; hence let us denote these variable separation
constants as (–α²) and (–β²), respectively. Now Eq. (87) shows that the constant z-term has to be
positive; denoting it as γ² we get the following relation:
$$ \alpha^2+\beta^2=\gamma^2. \tag{2.88} $$
Now the variables are separated in the sense that for the functions X(x), Y(y), and Z(z) we got separate
ordinary differential equations,
$$ \frac{d^2X}{dx^2}+\alpha^2X=0, \qquad \frac{d^2Y}{dy^2}+\beta^2Y=0, \qquad \frac{d^2Z}{dz^2}-\gamma^2Z=0, \tag{2.89} $$
which are related only by Eq. (88) for their constant parameters.
Let us start with the equation for X(x). Its general solution is the sum of functions sin αx and
cos αx, multiplied by arbitrary coefficients. Let us select these coefficients to satisfy our boundary
conditions. First, since X should vanish at the back vertical wall of the box (i.e., with the coordinate
origin choice shown in Fig. 13, at x = 0 for any y and z), the coefficient at cos αx should be zero. The
remaining coefficient (at sin αx) may be included in the general factor c_k in Eq. (84), so we may take X
in the form
$$ X = \sin\alpha x. \tag{2.90} $$
This solution satisfies the boundary condition at the opposite wall (x = a) only if the product αa is a
multiple of π, i.e. if α is equal to any of the following numbers (commonly called eigenvalues):38
$$ \alpha_n = \frac{\pi}{a}n, \quad \text{with } n = 1, 2, \ldots \tag{2.91} $$
(Terms with negative values of n would not be linearly independent of those with positive n and may
be dropped from the sum (84). The value n = 0 is formally possible but would give X = 0, i.e. ϕ_k = 0, at
38 Note that according to Eqs. (91)-(92), as the spatial dimensions a and b of the system are increased, the
distances between the adjacent eigenvalues tend to zero. This fact implies that for spatially infinite systems, the
eigenvalue spectra are continuous, so the sums of the type (84) become integrals; however, the general approach
remains the same. A few problems of this type are provided in Sec. 9 for the reader’s exercise.
any x, i.e. no contribution to sum (84), so it may be dropped as well.) Now we see that we indeed had to
take α real, i.e. α² positive – otherwise, instead of the oscillating function (90), we would have a sum of
two exponential functions, which cannot equal zero at two independent points of the x-axis.
Since the equation (89) for the function Y(y) is similar to that for X(x), and the boundary conditions
on the walls perpendicular to the y-axis (y = 0 and y = b) are similar to those for the x-walls, absolutely
similar reasoning gives
$$ Y = \sin\beta y, \qquad \beta_m = \frac{\pi}{b}m, \quad \text{with } m = 1, 2, \ldots, \tag{2.92} $$
where the integer m may be selected independently of n. Now we see that according to Eq. (88), the
separation constant γ depends on two integers n and m, so the relationship may be rewritten as
$$ \gamma_{nm} = \left[\alpha_n^2+\beta_m^2\right]^{1/2} = \pi\left[\left(\frac{n}{a}\right)^2+\left(\frac{m}{b}\right)^2\right]^{1/2}. \tag{2.93} $$
The corresponding solution of the differential equation for Z may be represented as a linear
combination of two exponents exp{±γ_nm z}, or alternatively of two hyperbolic functions, sinh γ_nm z and
cosh γ_nm z, with arbitrary coefficients. At our choice of coordinate origin, the latter option is preferable
because cosh γ_nm z cannot satisfy the zero boundary condition at the bottom lid of the box (z = 0). Hence
we may take Z in the form
$$ Z = \sinh\gamma_{nm}z, \tag{2.94} $$
which automatically satisfies that condition.
Now it is the right time to merge Eqs. (84)-(85) and (90)-(94), replacing the temporary index k
with the full set of possible eigenvalues, in our current case of two integer indices n and m:
Variable separation in Cartesian coordinates (example):
$$ \phi(x,y,z) = \sum_{n,m=1}^{\infty} c_{nm}\sin\frac{\pi nx}{a}\sin\frac{\pi my}{b}\sinh\gamma_{nm}z, \tag{2.95} $$
where γ_nm is given by Eq. (93). This solution satisfies not only the Laplace equation but also the
boundary conditions on all walls of the box, besides the top lid, for arbitrary coefficients c_nm. The only
job left is to choose these coefficients from the top-lid requirement:
$$ \phi(x,y,c) \equiv \sum_{n,m=1}^{\infty} c_{nm}\sin\frac{\pi nx}{a}\sin\frac{\pi my}{b}\sinh\gamma_{nm}c = V(x,y). \tag{2.96} $$
It may look bad to have just one equation for the infinite set of coefficients c_nm. However, the decisive
help comes from the fact that the functions of x and y that participate in Eq. (96) form full, orthogonal
sets of 1D functions. The last term means that the integrals of the products of the functions with
different integer indices over the region of interest equal zero. Indeed, direct integration gives
$$ \int_0^a \sin\frac{\pi nx}{a}\sin\frac{\pi n'x}{a}\,dx = \frac{a}{2}\delta_{nn'}, \tag{2.97} $$
where δ_nn’ is the Kronecker symbol, and similarly for y (with the evident replacements a → b, and n →
m). Hence, a fruitful way to proceed is to multiply both sides of Eq. (96) by the product of the basis
functions, with arbitrary indices n’ and m’, and integrate the result over x and y:
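$$ \sum_{n,m=1}^{\infty} c_{nm}\sinh\gamma_{nm}c \int_0^a \sin\frac{\pi nx}{a}\sin\frac{\pi n'x}{a}\,dx \int_0^b \sin\frac{\pi my}{b}\sin\frac{\pi m'y}{b}\,dy = \int_0^a dx\int_0^b dy\,V(x,y)\sin\frac{\pi n'x}{a}\sin\frac{\pi m'y}{b}. \tag{2.98} $$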
Due to Eq. (97), all terms on the left-hand side of the last equation, besides those with n = n’ and m =
m’, vanish, and (replacing n’ with n, and m’ with m, for notation brevity) we finally get
$$ c_{nm} = \frac{4}{ab\sinh\gamma_{nm}c}\int_0^a dx\int_0^b dy\,V(x,y)\sin\frac{\pi nx}{a}\sin\frac{\pi my}{b}. \tag{2.99} $$
The relations (93), (95), and (99) give the complete solution of the posed boundary problem; we
can see both good and bad news here. The first bit of bad news is that in the general case, we still need
to work out the integrals (99) – formally, the infinite number of them. In some cases, it is possible to do
this analytically, in one shot. For example, if the top lid in our problem is a single conductor, i.e. has a
constant potential V0, we may take V(x,y) = V0 = const, and both 1D integrations are elementary; for
example
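$$ \int_0^a \sin\frac{\pi nx}{a}\,dx = \frac{a}{\pi n}\int_0^{\pi n}\sin\xi\,d\xi = \frac{a}{\pi n}\times\begin{cases} 2, & \text{for } n \text{ odd}, \\ 0, & \text{for } n \text{ even}, \end{cases} \tag{2.100} $$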
which means that each elementary segment of the flat box behaves just as a plane capacitor. Only near
the sidewalls, the higher terms in the series (95) are important, producing some deviations from Eq.
(102). (For the general problem with an arbitrary function V(x,y), this is also true in all regions where
this function changes sharply.)
Fig. 2.14. The electrostatic potential’s distribution ϕ(x, y, z)/V0 inside a cubic box (a = b = c) with a constant voltage V0
on the top lid (Fig. 13), calculated numerically from Eqs. (93), (95), and (101), and plotted at y = b/2 as a function of x/a
(left panel) and of z/c (right panel). The dashed line on the left panel shows the contribution of the main term of the
series (with n = m = 1) to the full result, for z/c = 0.5.
In the opposite limit (a, b << c), Eq. (93) shows that on the contrary, γ_nm c >> 1 for all n and m.
Moreover, the ratio sinh γ_nm z/sinh γ_nm c drops sharply if either n or m is increased, provided that z is not
too close to c. Hence in this case a very good approximation may be obtained by keeping just the
leading term, with n = m = 1, in Eq. (95), so the challenge of summation disappears. (As was discussed
above, this approximation works reasonably well even for a cubic box.) In particular, for the constant
potential of the upper lid, we can use Eq. (101) and the exponential asymptotic for both sinh functions,
to get a very simple formula:
$$ \phi \approx \frac{16V_0}{\pi^2}\sin\frac{\pi x}{a}\sin\frac{\pi y}{b}\exp\left\{-\pi\frac{\left(a^2+b^2\right)^{1/2}}{ab}(c-z)\right\}. \tag{2.103} $$
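The series (95) is straightforward to sum numerically. The following sketch is an illustration under my own assumptions (the function names and the truncation nmax are arbitrary choices). For a constant lid potential V0, it uses the coefficients that follow from Eqs. (99) and (100), c_nm sinh γ_nm c = 16V0/π²nm for odd n and m (and zero otherwise), and compares the full sum with the leading-term formula (103):

```python
import math

def box_potential(x, y, z, a=1.0, b=1.0, c=1.0, V0=1.0, nmax=59):
    """Series (2.95) for a constant top-lid potential V0, truncated at n, m <= nmax."""
    total = 0.0
    for n in range(1, nmax + 1, 2):            # even n and m give zero, see Eq. (2.100)
        for m in range(1, nmax + 1, 2):
            g = math.pi * math.hypot(n / a, m / b)          # gamma_nm, Eq. (2.93)
            # sinh(g*z)/sinh(g*c), written via decaying exponents to avoid overflow
            ratio = (math.exp(-g * (c - z)) * (1 - math.exp(-2 * g * z))
                     / (1 - math.exp(-2 * g * c)))
            total += (16 * V0 / (math.pi ** 2 * n * m)
                      * math.sin(math.pi * n * x / a)
                      * math.sin(math.pi * m * y / b) * ratio)
    return total

def leading_term(x, y, z, a=1.0, b=1.0, c=1.0, V0=1.0):
    """Leading-term approximation, Eq. (2.103)."""
    g11 = math.pi * math.hypot(a, b) / (a * b)
    return (16 * V0 / math.pi ** 2 * math.sin(math.pi * x / a)
            * math.sin(math.pi * y / b) * math.exp(-g11 * (c - z)))

cube_center = box_potential(0.5, 0.5, 0.5)             # cubic box, a = b = c = 1
tall_full = box_potential(0.5, 0.5, 2.5, c=5.0)        # tall box, a = b << c
tall_approx = leading_term(0.5, 0.5, 2.5, c=5.0)

print(cube_center, tall_full, tall_approx)
```

A useful independent check: in the center of a cubic box, the potential has to equal V0/6, because six such solutions (each with a different wall at potential V0) add up, by linear superposition, to the trivial solution ϕ = V0, and by symmetry contribute equally at the center.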
These results may be readily generalized to some other problems. For example, if all walls of the
box shown in Fig. 13 have an arbitrary potential distribution, we may use the linear superposition
principle to represent the electrostatic potential distribution as the sum of six partial solutions of the type
of Eq. (95), each with one wall biased by the corresponding voltage, and all others grounded (ϕ = 0).
To summarize, the results given by the variable separation method in the Cartesian coordinates
are closer to what we could call a genuinely analytical solution than to a purely numerical solution.
Now, let us explore the issues that arise when this method is applied in other orthogonal coordinate
systems.
In a full analogy with Eq. (85), let us represent each particular solution ϕ_k as a product R(ρ)F(φ).
Plugging this expression into Eq. (104) and then dividing all its parts by RF/ρ², we get
$$ \frac{\rho}{R}\frac{d}{d\rho}\left(\rho\frac{dR}{d\rho}\right)+\frac{1}{F}\frac{d^2F}{d\varphi^2}=0. \tag{2.105} $$
Following the same reasoning as for the Cartesian coordinates, we get two separated ordinary
differential equations
$$ \rho\frac{d}{d\rho}\left(\rho\frac{dR}{d\rho}\right)=\nu^2R, \tag{2.106} $$
$$ \frac{d^2F}{d\varphi^2}+\nu^2F=0, \tag{2.107} $$
with any constant coefficients a and b, is also a solution of Eq. (106). Moreover, the general theory of
linear ordinary differential equations tells us that the solution of a second-order equation like Eq. (106)
may only depend on just two constant factors that scale two linearly independent functions. Hence, for
all values ν² ≠ 0, Eq. (108) presents the general solution of that equation. The case when ν = 0, in which
the functions ρ^{+ν} and ρ^{–ν} are just constants and hence are not linearly independent, is special, but in
this case, the integration of Eq. (106) is straightforward,40 giving
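$$ R = a_0 + b_0\ln\rho, \quad \text{for } \nu = 0. \tag{2.109} $$
In order to specify the separation constant, let us explore Eq. (107), whose general solution is
$$ F = \begin{cases} c_\nu\cos\nu\varphi + s_\nu\sin\nu\varphi, & \text{for } \nu \neq 0, \\ c_0 + s_0\varphi, & \text{for } \nu = 0. \end{cases} \tag{2.110} $$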
There are two possible cases here. In many boundary problems solvable in cylindrical coordinates, the
free-space region, in which the Laplace equation is valid, extends continuously around the origin point
ρ = 0. In this region, the potential has to be continuous and uniquely defined, so F has to be a 2π-periodic
function of φ. For that, one needs the product ν(φ + 2π) to equal νφ + 2πn, with n being an integer,
immediately giving us a discrete spectrum of possible values of the variable separation constant:
$$ \nu = n = 0, \pm 1, \pm 2, \ldots \tag{2.111} $$
In this case, both functions R and F may be labeled with the integer index n. Taking into account that
the terms with negative values of n may be summed up with those with positive n, and that s_0 has to
equal zero (otherwise the 2π-periodicity of the function F would be violated), we see that the general
solution of the 2D Laplace equation for such geometries may be represented as
Variable separation in polar coordinates:
$$ \phi(\rho,\varphi) = a_0 + b_0\ln\rho + \sum_{n=1}^{\infty}\left(a_n\rho^n+\frac{b_n}{\rho^n}\right)\left(c_n\cos n\varphi+s_n\sin n\varphi\right). \tag{2.112} $$
Let us see how all this machinery works on the famous problem of a round cylindrical conductor
placed into an electric field that is uniform and perpendicular to the cylinder’s axis at large distances
(see Fig. 15a), as if it is created by a large plane capacitor. First of all, let us explore the effect of the
system’s symmetries on the coefficients in Eq. (112). Selecting the coordinate system as shown in Fig.
15a, and taking the cylinder’s potential for zero, we immediately get a0 = 0.
Fig. 2.15. A conducting cylinder inserted into an initially uniform electric field perpendicular to its
axis: (a) the problem’s geometry, and (b) the equipotential surfaces given by Eq. (117).
Moreover, due to the mirror symmetry about the plane [x, z], the solution has to be an even
function of the angle φ, and hence all coefficients s_n should also equal zero. Also, at large distances (ρ
>> R) from the cylinder, its effect on the electric field should vanish, and the potential should approach
that of the uniform external field E = E0 n_x:
$$ \phi \to -E_0x \equiv -E_0\rho\cos\varphi, \quad \text{for } \rho\to\infty. \tag{2.113} $$
This is only possible if in Eq. (112), b_0 = 0, and also all coefficients a_n with n ≠ 1 vanish, while the
product a_1c_1 should be equal to (–E0). Thus the solution is reduced to the following form
$$ \phi(\rho,\varphi) = -E_0\rho\cos\varphi + \sum_{n=1}^{\infty}\frac{B_n}{\rho^n}\cos n\varphi, \tag{2.114} $$
in which the coefficients B_n ≡ b_n c_n should be found from the boundary condition at ρ = R:
$$ \phi(R,\varphi) = 0. \tag{2.115} $$
This requirement yields the following equation,
$$ \left(-E_0R+\frac{B_1}{R}\right)\cos\varphi + \sum_{n=2}^{\infty}\frac{B_n}{R^n}\cos n\varphi = 0, \tag{2.116} $$
which should be satisfied for all φ. This equality, read backward, may be considered as an expansion of
a function identically equal to zero into a series over mutually orthogonal functions cos nφ. It is
evidently valid if all coefficients of the expansion, including (–E0R + B1/R), and all B_n for n ≥ 2 are
equal to zero. Moreover, mathematics tells us that such expansions are unique, so this is the only
possible solution of Eq. (116). So, B1 = E0R², and our final answer (valid only outside of the cylinder,
i.e. for ρ ≥ R), is
$$ \phi(\rho,\varphi) = -E_0\left(\rho-\frac{R^2}{\rho}\right)\cos\varphi \equiv -E_0\left(1-\frac{R^2}{x^2+y^2}\right)x. \tag{2.117} $$
This result, which may be graphically represented with the equipotential surfaces shown in Fig.
15b, shows a smooth transition between the uniform field (113) far from the cylinder, to the
equipotential surface of the cylinder (with ϕ = 0). Such smoothening is very typical for Laplace
equation solutions. Indeed, as we know from Chapter 1, these solutions correspond to the lowest integral
of the potential gradient’s square, i.e. to the lowest potential energy (1.65) possible at the given
boundary conditions.
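To complete the problem, let us use Eq. (3) to calculate the distribution of the surface charge
density over the cylinder’s cross-section:
$$ \sigma = \varepsilon_0E_n\Big|_{\text{surface}} = -\varepsilon_0\frac{\partial\phi}{\partial\rho}\Big|_{\rho=R} = \varepsilon_0E_0\cos\varphi\,\frac{\partial}{\partial\rho}\left(\rho-\frac{R^2}{\rho}\right)\Big|_{\rho=R} = 2\varepsilon_0E_0\cos\varphi. \tag{2.118} $$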
This very simple formula shows that with the field direction shown in Fig. 15a (E0 > 0), the surface
charge is positive on the right-hand side of the cylinder and negative on its left-hand side, thus creating a
field directed from the right to the left, which exactly compensates the external field inside the
conductor, where the net field is zero. (Please take one more look at the schematic Fig. 1a.) Note also
that the net electric charge of the cylinder is zero, in correspondence with the problem symmetry.
Another useful by-product of the calculation (118) is that the surface electric field equals 2E0cos φ, and
hence its largest magnitude is twice the field far from the cylinder. Such electric field concentration is
very typical for all convex conducting surfaces.
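These properties of the solution (117) may be verified numerically. The sketch below is only an illustration (the function names are hypothetical, and R = 1, E0 = 1 are arbitrary units). It checks that the potential vanishes on the cylinder’s surface, that it satisfies the 2D Laplace equation outside of it, and that the surface field equals 2E0cos φ:

```python
import math

def cylinder_potential(x, y, R=1.0, E0=1.0):
    """Eq. (2.117); physically meaningful only outside the cylinder."""
    return -E0 * (1 - R ** 2 / (x ** 2 + y ** 2)) * x

def radial_field(rho, angle, R=1.0, E0=1.0, h=1e-6):
    """E_rho = -d(phi)/d(rho), estimated with a central difference.

    At rho = R the difference formally uses the analytical expression slightly
    inside the cylinder, which is fine for this purely mathematical check.
    """
    def phi(r):
        return cylinder_potential(r * math.cos(angle), r * math.sin(angle), R, E0)
    return -(phi(rho + h) - phi(rho - h)) / (2 * h)

def laplacian(x, y, h=1e-3):
    """Five-point finite-difference estimate of the 2D Laplacian of the potential."""
    p = cylinder_potential
    return (p(x + h, y) + p(x - h, y) + p(x, y + h) + p(x, y - h) - 4 * p(x, y)) / h ** 2

angle = 0.4
surface_potential = cylinder_potential(math.cos(angle), math.sin(angle))  # should be ~0
surface_field = radial_field(1.0, angle)          # should be close to 2*E0*cos(angle)
laplace_residual = laplacian(1.5, 0.7)            # should be ~0

print(surface_potential, surface_field, 2 * math.cos(angle), laplace_residual)
```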
The last observation gets additional confirmation from the second possible topology when Eq.
(110) is used to describe problems with no angular periodicity. A typical example of this situation is a
cylindrical conductor with a cross-section that features a corner limited by two straight-line segments
(Fig. 16). Indeed, we may argue that at ρ < R (where R is the radial extension of the planar sides of the
corner, see Fig. 16), the Laplace equation may be satisfied by a sum of partial solutions R(ρ)F(φ), if the
angular components of the products satisfy the boundary conditions on the corner sides. Taking (just for
the simplicity of notation) the conductor’s potential to be zero, and one of the corner’s sides as the x-axis
(φ = 0), these boundary conditions are
$$ F(0) = F(\beta) = 0, \tag{2.119} $$
where the angle β may be anywhere between 0 and 2π – see Fig. 16.
Fig. 2.16. The cross-sections of cylindrical conductors with (a) a corner and (b) a wedge.
Comparing this condition with Eq. (110), we see that it requires s_0 and all c_ν to vanish, and ν to
take one of the values of the following discrete spectrum:
$$ \nu_m\beta = \pi m, \quad \text{with } m = 1, 2, \ldots \tag{2.120} $$
Hence the full solution of the Laplace equation for this geometry takes the form
$$ \phi = \sum_{m=1}^{\infty}a_m\rho^{\pi m/\beta}\sin\frac{\pi m\varphi}{\beta}, \quad \text{for } \rho < R,\ 0 \le \varphi \le \beta, \tag{2.121} $$
where the constants s_ν have been incorporated into a_m. The set of coefficients a_m cannot be universally
determined, because it depends on the exact shape of the conductor outside the corner, and the
externally applied electric field. However, whatever the set is, in the limit ρ → 0, the solution (121) is
almost41 always dominated by the term with the lowest m = 1:
$$ \phi \to a_1\rho^{\pi/\beta}\sin\frac{\pi\varphi}{\beta}, \tag{2.122} $$
because the higher terms tend to zero faster. This potential distribution corresponds to the surface charge
density
41 Exceptions are possible only for highly symmetric configurations when the external field is specially crafted to
make a_1 = 0. In this case, the solution at ρ → 0 is dominated by the first nonzero term of the series (121).
$$ \sigma = \varepsilon_0E_n\Big|_{\text{surface}} = -\varepsilon_0\frac{\partial\phi}{\partial(\rho\varphi)}\Big|_{\rho=\text{const},\ \varphi\to+0} = -\frac{\pi\varepsilon_0a_1}{\beta}\rho^{(\pi/\beta-1)}. \tag{2.123} $$
(It is similar, with the opposite sign, on the opposite face of the angle.)
The result (123) shows that if we are dealing with a concave corner (β < π, see Fig. 16a), the
charge density (and the surface electric field) tends to zero. On the other hand, at a “convex corner” with
β > π (actually, a wedge – see Fig. 16b), both the charge and the field’s strength concentrate, formally
diverging at ρ → 0. (So, do not sit on a roof’s ridge during a thunderstorm; rather hide in a ditch!) We
have already seen qualitatively similar effects for the thin round disk and the split plane.
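To get a feeling of the numbers, the exponent of ρ in Eq. (123) may be tabulated for a few angles β. The following short sketch is just an illustration (the function names and the selected angles are my choices):

```python
import math

def charge_exponent(beta):
    """Exponent of rho in Eq. (2.123): sigma is proportional to rho**(pi/beta - 1)."""
    if not 0 < beta <= 2 * math.pi:
        raise ValueError("the angle beta must be between 0 and 2*pi")
    return math.pi / beta - 1

def charge_ratio(beta, rho_near, rho_far):
    """sigma(rho_near)/sigma(rho_far) for the leading term of the series (2.121)."""
    return (rho_near / rho_far) ** charge_exponent(beta)

cases = [("concave right-angle corner", math.pi / 2),
         ("flat surface", math.pi),
         ("right-angle wedge", 3 * math.pi / 2),
         ("edge of a thin plane", 2 * math.pi)]
for name, beta in cases:
    print(name, charge_exponent(beta), charge_ratio(beta, 0.01, 1.0))
```

For β = 2π the exponent is –1/2, i.e. the charge density diverges as the inverse square root of the distance from the edge, in agreement with the behavior of the field near the edges of the split plane discussed above.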
[Fig. 2.17: the cylindrical volume of length l, with the potential V(ρ, φ) applied to its top lid (z = l).]
Following the main idea of the variable separation method, let us require that each partial
function ϕ_k in Eq. (84) satisfies the Laplace equation, now in the full cylindrical coordinates {ρ, φ, z}:42
$$ \frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial\phi_k}{\partial\rho}\right)+\frac{1}{\rho^2}\frac{\partial^2\phi_k}{\partial\varphi^2}+\frac{\partial^2\phi_k}{\partial z^2}=0. \tag{2.124} $$
Plugging ϕ_k in the form of the product R(ρ)F(φ)Z(z) into Eq. (124) and then dividing all resulting terms
by this product, we get
$$ \frac{1}{\rho R}\frac{d}{d\rho}\left(\rho\frac{dR}{d\rho}\right)+\frac{1}{\rho^2F}\frac{d^2F}{d\varphi^2}+\frac{1}{Z}\frac{d^2Z}{dz^2}=0. \tag{2.125} $$
Since the first two terms of Eq. (125) can only depend on the polar variables ρ and φ, while the third
term, only on z, at least that term should equal a constant. Denoting it (just like we did in the
rectangular box problem) by γ², we get the following set of two equations:
$$ \frac{d^2Z}{dz^2}=\gamma^2Z, \tag{2.126} $$
$$ \frac{1}{\rho R}\frac{d}{d\rho}\left(\rho\frac{dR}{d\rho}\right)+\frac{1}{\rho^2F}\frac{d^2F}{d\varphi^2}+\gamma^2=0. \tag{2.127} $$
Now, multiplying all the terms of Eq. (127) by ρ², we see that the angular term of the result, (d²F/dφ²)/F,
may depend only on φ, and thus should equal a constant. Calling that constant (–ν²), just as in Sec. 6
above, we separate Eq. (127) into an angular equation,
$$ \frac{d^2F}{d\varphi^2}+\nu^2F=0, \tag{2.128} $$
enabling us to limit our discussion to the functions with n ≥ 0. Figure 18 shows four of these functions
with the lowest non-negative n.
43 Note that this normalization is specific for each value of the variable separation parameter γ. Also, please notice
that the normalization is meaningless for γ = 0, i.e. for the case Z(z) = const. However, if we need partial
solutions for this particular value of γ, we can always use Eqs. (108)-(109).
44 For a more complete discussion of these functions, see the literature listed in MA Sec. 16, for example, Chapter
6 (written by F. Olver) in the famous collection compiled and edited by Abramowitz and Stegun, available online.
[Fig. 2.18: the Bessel functions of the first kind, J_n(ξ), for n = 0, 1, 2, and 3.]
which is formally valid for any ξ, and may even serve as an alternative definition of the functions J_n(ξ).
However, the series converges fast only at small arguments, ξ << n, where its leading term is
$$ J_n(\xi)\Big|_{\xi\to0} \to \frac{1}{n!}\left(\frac{\xi}{2}\right)^n. \tag{2.133} $$
At ξ ≈ n + 0.81n^{1/3}, the Bessel function reaches its maximum45
$$ \max_\xi\left[J_n(\xi)\right] \approx \frac{0.675}{n^{1/3}}, \tag{2.134} $$
and then starts to oscillate with a period gradually approaching 2π, a phase shift that increases by π/2
with each unit increment of n, and an amplitude that decreases as ξ^{–1/2}. All these features are described
by the following asymptotic formula:
$$ J_n(\xi) \to \left(\frac{2}{\pi\xi}\right)^{1/2}\cos\left(\xi-\frac{\pi}{4}-\frac{n\pi}{2}\right), \quad \text{for } \xi\to\infty, \tag{2.135} $$
which starts to give a reasonable approximation soon after the function peaks – see Fig. 18.46
45 These two approximations for the Bessel function peak are strictly valid for n >> 1, but may be used for
reasonable estimates starting already from n = 1. For example, max [J_1(ξ)] is close to 0.58 and is reached at
ξ ≈ 1.84, within about 16% of the values given by the asymptotic formulas. (The first zero of the function
is located farther, at ξ ≈ n + 1.86n^{1/3} for n >> 1.)
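The small-argument and large-argument approximations (133) and (135) are easy to compare with the function itself. The following sketch is an illustration only: its hypothetical helper bessel_j sums the Taylor series of J_n(ξ) directly, which in double precision is adequate for ξ up to about 20, but is not how professional libraries compute these functions.

```python
import math

def bessel_j(n, xi, terms=60):
    """J_n(xi) for integer n >= 0, summed from its Taylor series (moderate xi only)."""
    half = xi / 2
    return sum((-1) ** k / (math.factorial(k) * math.factorial(n + k)) * half ** (n + 2 * k)
               for k in range(terms))

def bessel_j_small(n, xi):
    """Leading term at small arguments, Eq. (2.133)."""
    return (xi / 2) ** n / math.factorial(n)

def bessel_j_large(n, xi):
    """Asymptotic formula for large arguments, Eq. (2.135)."""
    return math.sqrt(2 / (math.pi * xi)) * math.cos(xi - math.pi / 4 - n * math.pi / 2)

for xi in (0.1, 2.0, 5.0, 10.0, 15.0):
    print(xi, bessel_j(1, xi), bessel_j_small(1, xi), bessel_j_large(1, xi))
```

The printout should show Eq. (133) working well only at ξ << 1, and Eq. (135) becoming a fair approximation beyond the first peak of the function.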
Now we are ready for our case study (Fig. 17). Since the functions Z(z) have to satisfy not
only Eq. (126) but also the bottom-lid boundary condition Z(0) = 0, they are proportional to sinh γz – cf.
Eq. (94). Then Eq. (84) becomes
$$ \phi = \sum_\gamma\sum_{n=0}^{\infty}J_n(\gamma\rho)\left(c_n\cos n\varphi+s_n\sin n\varphi\right)\sinh\gamma z. \tag{2.136} $$
Next, we need to satisfy the zero boundary condition at the cylinder’s side wall (ρ = R). This may be
ensured by taking
$$ J_n(\gamma R) = 0. \tag{2.137} $$
Since each function J_n(x) has an infinite number of positive zeros (see Fig. 18 again), which may be
numbered by an integer index m = 1, 2, …, Eq. (137) may be satisfied with an infinite number of
discrete values of the parameter γ:
$$ \gamma_{nm} = \frac{\xi_{nm}}{R}, \tag{2.138} $$
where ξ_nm is the m-th zero of the function J_n(x) – see the top numbers in the cells of Table 1. (Very soon
we will see what we need the bottom numbers for.)
Table 2.1. Approximate values of a few first zeros, ξ_nm, of a few lowest-order Bessel functions J_n(ξ)
(the top number in each cell), and the values of dJ_n(ξ)/dξ at these points (the bottom number).

         m = 1       2           3           4           5           6
n = 0    2.40482     5.52008     8.65372     11.79153    14.93091    18.07106
         -0.51914    +0.34026    -0.27145    +0.23245    -0.20654    +0.18773
n = 1    3.83171     7.01559     10.17347    13.32369    16.47063    19.61586
         -0.40276    +0.30012    -0.24970    +0.21836    -0.19647    +0.18006
n = 2    5.13562     8.41724     11.61984    14.79595    17.95982    21.11700
         -0.33967    +0.27138    -0.23244    +0.20654    -0.18773    +0.17326
n = 3    6.38016     9.76102     13.01520    16.22347    19.40942    22.58273
         -0.29827    +0.24942    -0.21828    +0.19644    -0.18005    +0.16718
n = 4    7.58834     11.06471    14.37254    17.61597    20.82693    24.01902
         -0.26836    +0.23188    -0.20636    +0.18766    -0.17323    +0.16168
n = 5    8.77148     12.33860    15.70017    18.98013    22.21780    25.43034
         -0.24543    +0.21743    -0.19615    +0.17993    -0.16712    +0.15669
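The entries of such a table may be reproduced with very modest numerical means. The sketch below is an illustration only (the names bessel_j, bessel_zero, and bessel_j_prime are hypothetical): it sums the Taylor series of J_n(ξ), finds a zero by bisection on an interval that brackets it, and gets the derivative from the standard relation dJ_n/dξ = (J_{n–1} – J_{n+1})/2, with dJ_0/dξ = –J_1.

```python
import math

def bessel_j(n, xi, terms=80):
    """J_n(xi) for integer n >= 0, summed from its Taylor series (moderate xi only)."""
    half = xi / 2
    return sum((-1) ** k / (math.factorial(k) * math.factorial(n + k)) * half ** (n + 2 * k)
               for k in range(terms))

def bessel_zero(n, lo, hi, tol=1e-10):
    """Bisection for a zero of J_n that is bracketed by the interval [lo, hi]."""
    f_lo = bessel_j(n, lo)
    if f_lo * bessel_j(n, hi) > 0:
        raise ValueError("the interval does not bracket a zero")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = bessel_j(n, mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def bessel_j_prime(n, xi):
    """dJ_n/dxi from the recurrence relation; for n = 0 it reduces to -J_1."""
    if n == 0:
        return -bessel_j(1, xi)
    return 0.5 * (bessel_j(n - 1, xi) - bessel_j(n + 1, xi))

xi_01 = bessel_zero(0, 2.0, 3.0)       # first zero of J_0
xi_04 = bessel_zero(0, 11.0, 12.5)     # fourth zero of J_0
slope_01 = bessel_j_prime(0, xi_01)    # the bottom number of the first cell

print(xi_01, slope_01, xi_04)
```

The first two printed numbers should agree, within the rounding, with the top-left cell of Table 1.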
46 Eq. (135) and Fig. 18 clearly show the close analogy between the Bessel functions and the usual trigonometric
functions, sine and cosine. To emphasize this similarity, and help the reader to develop more gut feeling of the
Bessel functions, let me mention one result of the elasticity theory: while the sinusoidal functions describe, in
particular, transverse standing waves on a guitar string, the functions J_n(ξ) describe, in particular, transverse
standing waves on an elastic round membrane (say, a round drum), with J_0(ξ) describing their lowest
(fundamental) mode – the only mode with a nonzero amplitude of the membrane center’s oscillations.
Here the coefficients c_nm and s_nm have to be selected to satisfy the only remaining boundary condition –
that on the top lid:
$$ \phi(\rho,\varphi,l) \equiv \sum_{n=0}^{\infty}\sum_{m=1}^{\infty}J_n\!\left(\frac{\xi_{nm}\rho}{R}\right)\left(c_{nm}\cos n\varphi+s_{nm}\sin n\varphi\right)\sinh\frac{\xi_{nm}l}{R} = V(\rho,\varphi). \tag{2.140} $$
To use it, let us multiply both sides of Eq. (140) by the product J_n(ξ_nm’ρ/R) cos n’φ, integrate the result
over the lid area, and use the following property of the Bessel functions:
$$ \int_0^1 J_n(\xi_{nm}s)\,J_n(\xi_{nm'}s)\,s\,ds = \frac{1}{2}\left[J_{n+1}(\xi_{nm})\right]^2\delta_{mm'}. \tag{2.141} $$
As a small but important detour, the last relation expresses a very specific (“2D”) orthogonality
of the Bessel functions with different indices m – do not confuse them with the function order indices n,
please!47 Since it relates two Bessel functions of the same order n, it is natural to ask why its right-hand
side contains the function with a different order (n + 1). Some gut feeling of that may come from one
more very important property of the Bessel functions, the so-called recurrence relations:48
$$ J_{n-1}(\xi)+J_{n+1}(\xi) = \frac{2nJ_n(\xi)}{\xi}, \tag{2.142a} $$
$$ J_{n-1}(\xi)-J_{n+1}(\xi) = 2\frac{dJ_n(\xi)}{d\xi}, \tag{2.142b} $$
which in particular yield the following formula (convenient for working out some Bessel function
integrals):
$$ \frac{d}{d\xi}\left[\xi^nJ_n(\xi)\right] = \xi^nJ_{n-1}(\xi). \tag{2.143} $$
Let us apply the recurrence relations at the special points ξ_nm. At these points, J_n vanishes, and the
system of two equations (142) may be readily solved to get, in particular,
$$ J_{n+1}(\xi_{nm}) = -\frac{dJ_n}{d\xi}(\xi_{nm}), \tag{2.144} $$
so the square bracket on the right-hand side of Eq. (141) is just (dJ_n/dξ)² at ξ = ξ_nm. Thus the values of
the Bessel function derivatives at the zero points of the function, given by the lower numbers in the cells
of Table 1, are as important for boundary problem solutions as the zeros themselves.
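Both the orthogonality relation (141) and the relation (144) may be checked by a direct numerical integration. The sketch below is an illustration under my own assumptions: the helper names are hypothetical, the Bessel functions are summed from their Taylor series, the integral is taken by the composite Simpson rule, and the two zeros of J_0 are typed in with more digits than in Table 1.

```python
import math

def bessel_j(n, xi, terms=40):
    """J_n(xi) for integer n >= 0, summed from its Taylor series (moderate xi only)."""
    half = xi / 2
    return sum((-1) ** k / (math.factorial(k) * math.factorial(n + k)) * half ** (n + 2 * k)
               for k in range(terms))

def simpson(f, a, b, steps=400):
    """Composite Simpson rule with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

XI_01, XI_02 = 2.404825557695773, 5.520078110286311   # the first two zeros of J_0

def overlap(n, xi_a, xi_b):
    """Left-hand side of Eq. (2.141)."""
    return simpson(lambda s: bessel_j(n, xi_a * s) * bessel_j(n, xi_b * s) * s, 0.0, 1.0)

same = overlap(0, XI_01, XI_01)             # m = m': should equal rhs
cross = overlap(0, XI_01, XI_02)            # m != m': should vanish
rhs = 0.5 * bessel_j(1, XI_01) ** 2         # right-hand side of Eq. (2.141) for n = 0

h = 1e-5                                    # step for the finite-difference derivative
slope = (bessel_j(0, XI_01 + h) - bessel_j(0, XI_01 - h)) / (2 * h)

print(same, rhs, cross, slope, -bessel_j(1, XI_01))
```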
Now returning to our problem: since the angular functions cos nφ are also orthogonal – both to
each other,
47 The Bessel functions of the same argument but different orders are also orthogonal, but differently:
$$ \int_0^\infty J_n(\xi)\,J_{n'}(\xi)\,\frac{d\xi}{\xi} = \frac{1}{n+n'}\delta_{nn'}. $$
48 These relations provide, in particular, a convenient way for numerical computation of all J_n(ξ) – after J_0(ξ) has
been computed. (The latter task is usually performed using Eq. (132) for smaller ξ and an extension of Eq. (135)
for larger ξ.) Note that most mathematical software packages, including all those listed in MA Sec. 16(iv), include
ready subroutines for calculation of the functions J_n(ξ) and other special functions used in this lecture series. In
this sense, the conditional line separating these “special functions” from “elementary functions” is rather fine.
$$ \int_0^{2\pi}\cos(n\varphi)\cos(n'\varphi)\,d\varphi = \pi\delta_{nn'}, \tag{2.145} $$
and to all functions sin nφ, the integration over the lid area kills all terms of both series in Eq. (140),
besides just one term proportional to c_n’m’, and hence gives an explicit expression for that coefficient.
The counterpart coefficients s_n’m’ may be found by repeating the same procedure with the replacement of
cos n’φ by sin n’φ. This evaluation (left for the reader’s exercise) completes the solution of our problem
for an arbitrary lid potential V(ρ, φ).
Still, before leaving the Bessel functions behind (for a while only :-), let me address two
important issues. First, we have seen that in our cylinder problem (Fig. 17), the set of functions
J_n(ξ_nm ρ/R) with different indices m (which characterize the degree of the Bessel function’s stretch along
the axis ρ) plays a role similar to that of the functions sin(πnx/a) in the rectangular box problem shown in Fig.
13. In this context, what is the analog of the functions cos(πnx/a) – which may be important for some
boundary problems? In a more formal language, are there any functions of the same argument
ξ ≡ ξ_nm ρ/R, that would be linearly independent of the Bessel functions of the first kind, while satisfying the
same Bessel equation (130)?
The answer is yes. For the definition of such functions, we first need to generalize our prior
formulas for J_n(ξ), and in particular Eq. (132), to the case of arbitrary, not necessarily real order ν.
Mathematics says that the generalization may be performed in the following way:
$$ J_\nu(\xi) = \left(\frac{\xi}{2}\right)^\nu\sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,\Gamma(\nu+k+1)}\left(\frac{\xi}{2}\right)^{2k}, \tag{2.146} $$
where Γ(s) is the so-called gamma function that may be defined as49
$$ \Gamma(s) \equiv \int_0^\infty \xi^{s-1}e^{-\xi}\,d\xi. \tag{2.147} $$
The simplest, and the most important property of the gamma function is that for integer values of its
argument, it gives the factorial of the number smaller by one:
$$ \Gamma(n+1) = n!. \tag{2.148} $$
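As a quick numerical illustration of this factorial property and of the series (146), here is a minimal sketch based on the gamma function from the standard Python library. The function name bessel_j_nu is hypothetical, and the restriction to real ν ≥ 0 and ξ > 0 is my simplifying assumption. For the test, it uses the well-known closed form of the half-integer-order function, J_{1/2}(ξ) = (2/πξ)^{1/2} sin ξ.

```python
import math

def bessel_j_nu(nu, xi, terms=40):
    """J_nu(xi) for real nu >= 0 and xi > 0, summed directly from Eq. (2.146)."""
    half = xi / 2
    series = sum((-1) ** k / (math.factorial(k) * math.gamma(nu + k + 1)) * half ** (2 * k)
                 for k in range(terms))
    return half ** nu * series

# Gamma(n + 1) versus n! for a few integers n
factorial_check = [(n, math.gamma(n + 1), math.factorial(n)) for n in range(6)]

j_half = bessel_j_nu(0.5, 1.3)
j_half_exact = math.sqrt(2 / (math.pi * 1.3)) * math.sin(1.3)   # closed form for nu = 1/2

print(factorial_check)
print(j_half, j_half_exact)
```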
called the Bessel functions of the second kind, or more often the Weber functions,50 and then to follow
the limit ν → n. At this, both the numerator and denominator of the right-hand side of Eq. (149) tend to
zero, but their ratio tends to a finite value called Y_n(x). It may be shown that the resulting functions are
still the solutions of the Bessel equation and are linearly independent of J_n(x), though are related just as
those functions if the sign of n changes:
$$ Y_{-n}(\xi) = (-1)^nY_n(\xi). \tag{2.150} $$
[Fig. 2.19: the Bessel functions of the second kind, Y_n(ξ), for n = 0, 1, 2, and 3.]
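$$ Y_n(\xi) \to \begin{cases} \dfrac{2}{\pi}\left(\ln\dfrac{\xi}{2}+\gamma\right), & \text{for } n = 0, \\[6pt] -\dfrac{(n-1)!}{\pi}\left(\dfrac{\xi}{2}\right)^{-n}, & \text{for } n > 0, \end{cases} \tag{2.152} $$
where γ is the so-called Euler constant, defined as follows: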
50 Sometimes, they are called the Neumann functions and denoted as N_ν(ξ).
$$ \gamma \equiv \lim_{n\to\infty}\left(1+\frac{1}{2}+\frac{1}{3}+\ldots+\frac{1}{n}-\ln n\right) \approx 0.5772157\ldots \tag{2.153} $$
As Eqs. (152) and Fig. 19 show, the functions Y_n(ξ) diverge at ξ → 0 and hence cannot describe the
behavior of any physical variable, in particular the electrostatic potential.
One may wonder: if this is true, when do we need these functions in physics? Figure 20 shows an
example of a simple boundary problem of electrostatics, whose solution by the variable separation
method involves both functions J_n(ξ) and Y_n(ξ).
Here two round, conducting coaxial cylindrical tubes are kept at the same (say, zero) potential,
but at least one of two lids has a different potential. The problem is almost completely similar to that
discussed above (Fig. 17), but now we need to find the potential distribution in the free space between
the tubes, i.e. for R1 < ρ < R2. If we use the same variable separation as in the simpler counterpart
problem, we need the radial functions R(ρ) to satisfy two zero boundary conditions: at ρ = R1 and ρ =
R2. With the Bessel functions of just the first kind, J_n(γρ), it is impossible to do, because the two
boundaries would impose two independent (and generally incompatible) conditions, J_n(γR1) = 0, and
J_n(γR2) = 0, on one “stretching parameter” γ. The existence of the Bessel functions of the second kind
immediately saves the day, because if the radial function solution is represented as a linear combination,
$$ R = c_JJ_n(\gamma\rho)+c_YY_n(\gamma\rho), \tag{2.154} $$
two zero boundary conditions give two equations for γ and the ratio c ≡ c_Y/c_J.51 (Due to the oscillating
character of both Bessel functions, these conditions would be typically satisfied by an infinite set of
discrete pairs {γ, c}.) Note, however, that generally none of these pairs would correspond to zeros of
either J_n or Y_n, so having an analog of Table 1 for the latter function would not help much. Hence, even
the simplest problems of this kind (like the one shown in Fig. 20) typically require the numerical
solution of transcendental equations.
51 A pair of linearly independent functions, used for the representation of the general solution of the Bessel
equation, may also be chosen differently, using the so-called Hankel functions
$$H_n^{(1,2)}(\xi) \equiv J_n(\xi) \pm iY_n(\xi).$$
For representing the general solution of Eq. (130), this alternative is completely similar, for example, to using the
pair of complex functions $\exp\{\pm ix\} \equiv \cos x \pm i\sin x$ instead of the pair of real functions $\{\cos x, \sin x\}$ for
the representation of the general solution of Eq. (89) for X(x).
In order to complete the discussion of variable separation in the cylindrical coordinates, one
more issue to address is the so-called modified Bessel functions: of the first kind, $I_\nu(\xi)$, and of the second
kind, $K_\nu(\xi)$. They are two linearly independent solutions of the modified Bessel equation,
$$\frac{d^2R}{d\xi^2} + \frac{1}{\xi}\frac{dR}{d\xi} - \left(1 + \frac{\nu^2}{\xi^2}\right)R = 0, \qquad (2.155)$$
which differs from Eq. (130) “only” by the sign of one of its terms. Figure 21 shows a simple problem
that leads (among many others) to this equation: a round thin conducting cylindrical pipe is sliced,
perpendicular to its axis, into rings of equal height h, which are kept at equal but sign-alternating
potentials.
[Fig. 2.21. A typical boundary problem whose solution may be conveniently described in terms of the modified Bessel functions. Drawing labels: z, h, t, and ring potentials $\pm V/2$.]
If the system is very long (formally, infinite) in the z-direction, we may use the variable
separation method for the solution of this problem, but now evidently need periodic (rather than
exponential) solutions along the z-axis, i.e. linear combinations of sin kz and cos kz with various real
values of the constant k. Separating the variables, we arrive at a differential equation similar to Eq.
(129), but with the negative sign before the separation constant:
$$\frac{d^2R}{d\rho^2} + \frac{1}{\rho}\frac{dR}{d\rho} - \left(k^2 + \frac{\nu^2}{\rho^2}\right)R = 0. \qquad (2.156)$$
The same radial coordinate’s normalization, $\xi \equiv k\rho$, immediately leads us to Eq. (155), and hence (for
$\nu = n$) to the modified Bessel functions $I_n(\xi)$ and $K_n(\xi)$.
Figure 22 shows the behavior of such functions, of a few lowest orders. One can see that at
$\xi \to 0$ the behavior is virtually similar to that of the “usual” Bessel functions – cf. Eqs. (132) and (152), with
$K_n(\xi)$ multiplied (for purely historical reasons) by an additional coefficient, $\pi/2$:
$$I_n(\xi) \to \frac{1}{n!}\left(\frac{\xi}{2}\right)^{n}, \qquad K_n(\xi) \to \begin{cases} -\left[\ln\dfrac{\xi}{2} + \gamma\right], & \text{for } n = 0, \\[2mm] \dfrac{(n-1)!}{2}\left(\dfrac{\xi}{2}\right)^{-n}, & \text{for } n \neq 0. \end{cases} \qquad (2.157)$$
However, the asymptotic behavior of the modified functions is very much different, with $I_n(\xi)$
exponentially growing, and $K_n(\xi)$ exponentially dropping at $\xi \to \infty$:
$$I_n(\xi) \to \left(\frac{1}{2\pi\xi}\right)^{1/2} e^{\xi}, \qquad K_n(\xi) \to \left(\frac{\pi}{2\xi}\right)^{1/2} e^{-\xi}. \qquad (2.158)$$
[Fig. 2.22 (plot not reproduced): the modified Bessel functions $I_n(\xi)$ and $K_n(\xi)$ for n = 0, 1, 2.]
52 These complex functions still obey the general relations (143) and (146), with $\xi$ replaced with z.
53 In the QM part of this series we will run into the so-called spherical Bessel functions $j_n(\xi)$ and $y_n(\xi)$, which may
be expressed via the Bessel functions of half-integer orders. Surprisingly enough, these functions turn out to be
simpler than $J_n(\xi)$ and $Y_n(\xi)$.
Let us look for a solution of this equation in the following variable-separated form:
$$\phi_k = \frac{R(r)}{r}\,P(\cos\theta)\,F(\varphi), \qquad (2.162)$$
Separating the variables one by one, starting from $\varphi$, just as has been done in cylindrical
coordinates, we get the following equations for the partial functions participating in this solution:
$$\frac{d^2R}{dr^2} - \frac{l(l+1)}{r^2}R = 0, \qquad (2.163)$$
$$\frac{d}{d\xi}\left[(1-\xi^2)\frac{dP}{d\xi}\right] + \left[l(l+1) - \frac{\nu^2}{1-\xi^2}\right]P = 0, \qquad (2.164)$$
$$\frac{d^2F}{d\varphi^2} + \nu^2 F = 0, \qquad (2.165)$$
where $\xi \equiv \cos\theta$ is a new variable used in lieu of $\theta$ (so $-1 \le \xi \le +1$), while $\nu^2$ and $l(l+1)$ are the
separation constants. (The reason for the selection of the latter one in this form will be clear in a
minute.)
One can see that Eq. (165) is very simple, and is absolutely similar to Eq. (107) that we have got
for the polar and cylindrical coordinates. Moreover, the equation for the radial functions is simpler than
in the cylindrical coordinates. Indeed, let us look for its partial solution in the form $cr^{\alpha}$ – just as we have
done with Eq. (106). Plugging this solution into Eq. (163), we immediately get the following condition
on the parameter $\alpha$:
$$\alpha(\alpha - 1) = l(l+1). \qquad (2.166)$$
This quadratic equation has two roots, $\alpha = l + 1$ and $\alpha = -l$, so the general solution of Eq. (163) is
$$R = a_l r^{l+1} + \frac{b_l}{r^{l}}. \qquad (2.167)$$
However, the general solution of Eq. (164) (called either the general or associated Legendre
equation) cannot be expressed via what is usually called elementary functions.55 Let us start its
discussion from the axially-symmetric case when $\partial/\partial\varphi = 0$. This means $F(\varphi)$ = const, and thus $\nu = 0$, so
Eq. (164) is reduced to the so-called Legendre differential equation:
$$\frac{d}{d\xi}\left[(1-\xi^2)\frac{dP}{d\xi}\right] + l(l+1)P = 0. \qquad (2.168)$$
One can readily verify that the solutions of this equation for integer values of l are specific (Legendre)
polynomials56 that may be described by the following Rodrigues’ formula:
$$P_l(\xi) = \frac{1}{2^l\, l!}\frac{d^l}{d\xi^l}(\xi^2 - 1)^l, \quad \text{with } l = 0, 1, 2, \ldots \qquad (2.169)$$
According to this formula, the first few Legendre polynomials are pretty simple:
$$P_0(\xi) = 1, \quad P_1(\xi) = \xi, \quad P_2(\xi) = \frac{1}{2}\left(3\xi^2 - 1\right), \quad P_3(\xi) = \frac{1}{2}\left(5\xi^3 - 3\xi\right), \quad P_4(\xi) = \frac{1}{8}\left(35\xi^4 - 30\xi^2 + 3\right), \ldots, \qquad (2.170)$$
though such explicit expressions become more and more bulky as l is increased. As Fig. 23 shows, all
these polynomials, which are defined on the [–1, +1] segment, end at the same point: $P_l(+1) = +1$, while
starting either at the same point or at the opposite point: $P_l(-1) = (-1)^l$. Between these two endpoints, the
lth Legendre polynomial has l zeros. It is straightforward to use Eq. (169) to prove that these
polynomials form a complete, orthogonal set of functions, with the following normalization rule:
$$\int_{-1}^{+1} P_l(\xi)P_{l'}(\xi)\,d\xi = \frac{2}{2l+1}\,\delta_{ll'}, \qquad (2.171)$$
so any function $f(\xi)$ defined on the segment [–1, +1] may be represented as a unique series over the
polynomials.57
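Both Eq. (170) and Eq. (171) may be checked by a direct implementation of the Rodrigues’ formula (169). The following Python sketch is one possible way to do this, under the assumption that a polynomial is stored as a list of its coefficients (entry k multiplies $\xi^k$); the function names `legendre_coeffs` and `inner_product` are introduced here just for illustration. Exact rational arithmetic (`fractions.Fraction`) is used so that the orthogonality check is free of rounding errors.

```python
from fractions import Fraction
from math import comb, factorial

def legendre_coeffs(l):
    """Coefficients c[k] of xi**k in P_l(xi), built from Rodrigues' formula (2.169)."""
    # Binomial expansion: (xi^2 - 1)^l = sum_j C(l, j) * (-1)^(l - j) * xi^(2j)
    poly = [Fraction(0)] * (2 * l + 1)
    for j in range(l + 1):
        poly[2 * j] = Fraction(comb(l, j) * (-1) ** (l - j))
    for _ in range(l):  # differentiate l times
        poly = [k * c for k, c in enumerate(poly)][1:]
    norm = Fraction(1, 2 ** l * factorial(l))
    return [norm * c for c in poly]

def inner_product(l1, l2):
    """Exact integral of P_l1 * P_l2 over [-1, +1], the left-hand side of Eq. (2.171)."""
    total = Fraction(0)
    for i, a in enumerate(legendre_coeffs(l1)):
        for j, b in enumerate(legendre_coeffs(l2)):
            if (i + j) % 2 == 0:  # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total

print(legendre_coeffs(4))    # coefficients 3/8, 0, -15/4, 0, 35/8, as in Eq. (2.170)
print(inner_product(3, 3))   # 2/7, i.e. 2/(2l + 1) for l = 3
print(inner_product(1, 3))   # 0
```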
Thus, taking into account the additional division by r in Eq. (162), the general solution of any
axially symmetric Laplace problem may be represented as
Variable separation in spherical coordinates (for axial symmetry):
$$\phi(r, \theta) = \sum_{l=0}^{\infty}\left(a_l r^{l} + \frac{b_l}{r^{l+1}}\right)P_l(\cos\theta). \qquad (2.172)$$
Note a strong similarity between this solution and Eq. (112) for the 2D Laplace problem in the polar
coordinates. However, besides the difference in the angular functions, there is also a difference (by one)
in the power of the second radial function, and this difference immediately shows up in problem
solutions.
56 Just for reference: if l is not an integer, the general solution of Eq. (2.168) may be represented as a linear
combination of the so-called Legendre functions (not polynomials!) of the first and second kind, $P_l(\xi)$ and $Q_l(\xi)$.
57 This is why, at least for the purposes of this course, there is no good reason for pursuing (more complicated)
solutions to Eq. (168) for non-integer values of l, mentioned in the previous footnote.
[Fig. 2.23. A few lowest Legendre polynomials $P_l(\xi)$, for l = 0, 1, 2, 3, 4, on the segment $-1 \le \xi \le +1$.]
Indeed, let us solve a problem similar to that shown in Fig. 15: find the electric field around a
conducting sphere of radius R, placed into an initially uniform external field E0 (whose direction I will
now take for the z-axis) – see Fig. 24a.
[Fig. 2.24. Conducting sphere in a uniform electric field: (a) the problem’s geometry, and (b) the
equipotential surface pattern given by Eq. (176). The pattern is qualitatively similar but
quantitatively different from that for the conducting cylinder in a perpendicular field – cf. Fig. 15.]
If we select the arbitrary constant in the electrostatic potential so that $\phi|_{z=0} = 0$, then in Eq. (172)
we should take $a_0 = b_0 = 0$. Now, just as has been argued for the cylindrical case, at r >> R the potential
should approach that of the uniform field:
$$\phi \to -E_0 z = -E_0 r\cos\theta, \qquad (2.173)$$
so in Eq. (172), only one of the coefficients $a_l$ survives: $a_l = -E_0\delta_{l,1}$. As a result, from the boundary
condition on the surface, $\phi(R, \theta) = 0$, we get the following equation for the coefficients $b_l$:
$$\left(-E_0 R + \frac{b_1}{R^2}\right)\cos\theta + \sum_{l \ge 2}\frac{b_l}{R^{l+1}}P_l(\cos\theta) = 0. \qquad (2.174)$$
Now repeating the argumentation that led to Eq. (117), we may conclude that Eq. (174) is satisfied if
$$b_l = E_0 R^3\delta_{l,1}, \qquad (2.175)$$
so, finally, Eq. (172) is reduced to
$$\phi = -E_0\left(r - \frac{R^3}{r^2}\right)\cos\theta. \qquad (2.176)$$
This distribution, shown in Fig. 24b, is very similar to Eq. (117) for the cylindrical case (cf. Fig. 15b,
with the account for a different plot orientation), but with a different power of the radius in the second
term. This difference leads to a quantitatively different distribution of the surface electric field:
$$E_n = -\frac{\partial\phi}{\partial r}\bigg|_{r=R} = 3E_0\cos\theta, \qquad (2.177)$$
so its maximal value is a factor of 3 (rather than 2) larger than the external field.
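The factor of 3 in Eq. (177) may be double-checked numerically. In the following minimal Python sketch, the function names and the finite-difference step `h` are assumptions of this example; the potential is that of Eq. (176), and the normal field is estimated by a simple one-sided difference rather than by analytical differentiation.

```python
import math

def phi_sphere(r, theta, E0=1.0, R=1.0):
    """Potential (2.176) outside a conducting sphere placed in a uniform field E0."""
    return -E0 * (r - R**3 / r**2) * math.cos(theta)

def surface_field(theta, E0=1.0, R=1.0, h=1e-6):
    """Normal field E_n = -d(phi)/dr at r = R, via a one-sided finite difference."""
    return -(phi_sphere(R + h, theta, E0, R) - phi_sphere(R, theta, E0, R)) / h

for theta in (0.0, math.pi / 3, math.pi / 2):
    # Columns: angle, potential on the surface (zero), E_n/E0, and 3*cos(theta)
    print(theta, phi_sphere(1.0, theta), surface_field(theta), 3 * math.cos(theta))
```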
Now let me briefly (mostly just for the reader’s reference) mention the Laplace equation
solutions in the general case – with no axial symmetry. If the conductor-free space surrounds the origin
from all sides, the solutions to Eq. (165) have to be $2\pi$-periodic, and hence $\nu = n = 0, \pm 1, \pm 2, \ldots$
Mathematics says that Eq. (164) with integer $\nu = n$ and a fixed integer l has a solution only for a
limited range of n:58
$$-l \le n \le +l. \qquad (2.178)$$
These solutions are called associated Legendre functions (generally, they are not polynomials). For $n \ge 0$,
these functions may be defined via the Legendre polynomials, using the following formula:59
$$P_l^n(\xi) = (-1)^n (1 - \xi^2)^{n/2}\frac{d^n}{d\xi^n}P_l(\xi). \qquad (2.179)$$
On the segment [–1, +1], each set of the associated Legendre functions with a fixed index n and non-
negative values of l forms a complete, orthogonal set, with the normalization relation,
$$\int_{-1}^{+1} P_l^n(\xi)P_{l'}^n(\xi)\,d\xi = \frac{2}{2l+1}\frac{(l+n)!}{(l-n)!}\,\delta_{ll'}, \qquad (2.180)$$
that is evidently a generalization of Eq. (171).
Since these relations may seem a bit intimidating, let me write down explicit expressions for a
few $P_l^n(\cos\theta)$ with the three lowest values of l and $n \ge 0$, which are most important for applications.
$$l = 0: \quad P_0^0(\cos\theta) = 1; \qquad (2.181)$$
58 In quantum mechanics, the letter n is typically reserved for the “principal quantum number”, while the
azimuthal functions are numbered by index m. However, here I will keep using n as their index because, for this
course’s purposes, this seems more logical, in view of the similarity of the spherical and cylindrical functions.
59 Note that some texts use different choices for the front factor (called the Condon-Shortley phase) in the
functions $P_l^m$, which do not affect the final results for the spherical harmonics $Y_l^m$.
[Plot not reproduced: the associated Legendre functions $P_l^n$ for l = 0, 1, 2, 3, 4.]
Using the associated Legendre functions, the general solution (162) to the Laplace equation in
the spherical coordinates may be expressed as
Variable separation in spherical coordinates (general case):
$$\phi(r, \theta, \varphi) = \sum_{l=0}^{\infty}\left(a_l r^{l} + \frac{b_l}{r^{l+1}}\right)\sum_{n=0}^{l} P_l^n(\cos\theta)F_n(\varphi), \qquad F_n(\varphi) = c_n\cos n\varphi + s_n\sin n\varphi. \qquad (2.184)$$
Since the difference between the angles $\theta$ and $\varphi$ is somewhat artificial, physicists prefer to think not in
terms of the functions P and F in separation, but directly about their products that participate in this
solution.60
60 In quantum mechanics, it is more convenient to use a slightly different alternative set of basic functions of the
same problem, namely the following complex functions called the spherical harmonics:
$$Y_l^n(\theta, \varphi) \equiv \left[\frac{2l+1}{4\pi}\frac{(l-n)!}{(l+n)!}\right]^{1/2} P_l^n(\cos\theta)\,e^{in\varphi},$$
which are defined for both positive and negative n (within the limits $-l \le n \le +l$) – see, e.g., QM Secs. 3.6 and 5.6.
(Note again that in that field, our index n is traditionally denoted as m, and called the magnetic quantum number.)
As a rare exception for my courses, to save time I will skip giving an example of using the
associated Legendre functions in electrostatics, because quite a few examples of these functions’
applications will be given in the quantum mechanics part of this series.
[Fig. 2.26. The simplest problem readily solvable by the charge image method. The points’ colors are used, as
before, to denote the charges of the original (red) and opposite (blue) sign.]
Let us prove that its solution, above the conductor’s surface ($z \ge 0$), may be represented as:
$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\left(\frac{q}{r_1} - \frac{q}{r_2}\right) = \frac{q}{4\pi\varepsilon_0}\left(\frac{1}{|\mathbf{r} - \mathbf{r}'|} - \frac{1}{|\mathbf{r} - \mathbf{r}''|}\right), \qquad (2.185)$$
or in a more explicit form, using the cylindrical coordinates shown in Fig. 26:
$$\phi(\mathbf{r}) = \frac{q}{4\pi\varepsilon_0}\left[\frac{1}{\left[\rho^2 + (z-d)^2\right]^{1/2}} - \frac{1}{\left[\rho^2 + (z+d)^2\right]^{1/2}}\right], \qquad (2.186)$$
where $\rho$ is the distance of the field observation point r from the “vertical” line on which the charge is
located. Indeed, this solution satisfies both the boundary condition $\phi = 0$ at the surface of the conductor
(z = 0), and the Poisson equation (1.41), with the single $\delta$-functional source at point r’ = {0, 0, +d} on
its right-hand side, because the second singularity of the solution, at point r” = {0, 0, –d}, is outside the
region of the solution’s validity ($z \ge 0$). Physically, this solution may be interpreted as the sum of the
fields of the actual charge (+q) at point r’, and an equal but opposite charge (–q) at the “mirror image”
point r” (Fig. 26). This is the basic idea of the charge image method.
Before moving on to more complex problems, let us discuss the situation shown in Fig. 26 in a
little bit more detail, due to its fundamental importance. First, we can use Eqs. (3) and (186) to calculate
the surface charge density:
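$$\sigma = -\varepsilon_0\frac{\partial\phi}{\partial z}\bigg|_{z=0} = -\frac{q}{4\pi}\frac{\partial}{\partial z}\left[\frac{1}{\left[\rho^2 + (z-d)^2\right]^{1/2}} - \frac{1}{\left[\rho^2 + (z+d)^2\right]^{1/2}}\right]_{z=0} = -\frac{q}{4\pi}\frac{2d}{\left(\rho^2 + d^2\right)^{3/2}}. \qquad (2.187)$$
The integral of this density over the whole surface, $Q = 2\pi\int_0^{\infty}\sigma(\rho)\rho\,d\rho$, may be easily worked out using the substitution $\xi \equiv \rho^2/d^2$ (giving $d\xi = 2\rho\,d\rho/d^2$):
$$Q = -\frac{q}{2}\int_0^{\infty}\frac{d\xi}{(\xi + 1)^{3/2}} = -q. \qquad (2.189)$$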
This result is very natural: the conductor brings as much surface charge from its interior to the surface as
necessary to fully compensate for the initial charge (+q) and hence kill the electric field at large
distances as efficiently as possible, hence reducing the total electrostatic energy (1.65) to the lowest
possible value.
For a better feeling of this polarization charge of the surface, let us take our calculations to the
extreme – to the q equal to one elementary charge e, and place a particle with this charge (for example,
a proton) at a macroscopic distance – say 1 m – from the conductor’s surface. Then, according to Eq.
(189), the total polarization charge of the surface equals that of an electron, and according to Eq. (187),
its spatial extent is of the order of $d^2 = 1\ \mathrm{m}^2$. This means that if we consider a much smaller part of the
surface, $\Delta A \ll d^2$, its polarization charge magnitude $\Delta Q = \sigma\Delta A$ is much less than one electron! For
example, Eq. (187) shows that the polarization charge of quite a macroscopic area $\Delta A = 1\ \mathrm{cm}^2$ right
under the initial charge ($\rho = 0$) is $e\Delta A/2\pi d^2 \approx 1.6\times10^{-5}\,e$. Can this be true, or is our theory somehow
limited to the charges q much larger than e? (After all, the theory is substantially based on the
approximate macroscopic model (1); maybe it is the culprit?)
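Both numbers quoted above (the total induced charge and the tiny charge of a 1-cm² patch) are easy to reproduce. The Python sketch below is only an illustration: the function names, the midpoint integration rule, and its step count are assumptions of this example. It integrates the surface charge density (187) over disks of increasing radius, and then evaluates the charge of a small area right under the point charge.

```python
import math

def sigma(rho, q=1.0, d=1.0):
    """Induced surface charge density (2.187) at distance rho from the charge's axis."""
    return -q * 2 * d / (4 * math.pi * (rho**2 + d**2) ** 1.5)

def induced_charge(rho_max, q=1.0, d=1.0, steps=200000):
    """Midpoint-rule integral of sigma * 2*pi*rho over the disk rho < rho_max."""
    h = rho_max / steps
    return sum(sigma((i + 0.5) * h, q, d) * 2 * math.pi * (i + 0.5) * h * h
               for i in range(steps))

# The charge within radius rho_max tends to -q as rho_max grows, cf. Eq. (2.189)
for rho_max in (1.0, 10.0, 100.0):
    print(rho_max, induced_charge(rho_max))

# Magnitude (in units of q) of the charge induced on 1 cm^2 right under the charge, for d = 1 m
area, d = 1e-4, 1.0
print(abs(sigma(0.0, 1.0, d)) * area)   # about 1.6e-5
```

For a finite disk, the result may be compared with the closed form $-q\,[1 - d/(\rho_{max}^2 + d^2)^{1/2}]$, which follows from the same substitution as in Eq. (189).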
Surprisingly enough, the answer to this question has become clear (at least to some physicists :-)
only as late as in the mid-1980s when several experiments demonstrated, and theorists accepted (some
of them rather grudgingly) that the usual polarization charge formulas are valid for elementary charges
as well, i.e., such polarization charge $\Delta Q$ of a macroscopic surface area may differ from a multiple
of e. The underlying reason for this paradox is the physical nature of the polarization charge of a
conductor’s surface: as was discussed in Sec. 1, it is due not to new charged particles brought into the
conductor (such charge would be in fact a multiple of e), but to a shift of the free charges of a
conductor by a very small distance from their equilibrium positions that they had in the absence of the
external field induced by charge q. This shift is not quantized, at least on the scale relevant to our
problem, and hence neither is $\Delta Q$.
This understanding has paved the way for the invention and experimental demonstration of
several new devices including so-called single-electron transistors,61 which may be used, in particular,
for ultrasensitive measurement of polarization charges as small as $\sim 10^{-6}\,e$. Another important class of
single-electron devices is the dc and ac current standards based on the fundamental relation I = –ef,
61 Actually, this term (for which the author of these notes may be blamed :-) is misleading: the operation of the
“single-electron transistor” is based on the interplay of discrete charges (multiples of e) transferred between
conductors, and sub-single-electron polarization charges – see, e.g., K. Likharev, Proc. IEEE 87, 606 (1999).
where I is the dc current carried by electrons transferred with the frequency f. The experimentally
achieved62 relative accuracy of such standards is of the order of $10^{-7}$, and is not too far from that
provided by the competing approach based on a combination of the Josephson effect and the quantum
Hall effect.63
Second, let us find the potential energy U of the charge-to-surface interaction. For that, we may
use the value of the electrostatic potential (185) at the point of the charge itself (r = r’), of course
ignoring the infinite potential created by the real charge, so the remaining potential is that of the image
charge
$$\phi_{\rm image}(\mathbf{r}') = -\frac{1}{4\pi\varepsilon_0}\frac{q}{2d}. \qquad (2.190)$$
Looking at the electrostatic potential’s definition given by Eq. (1.31), it may be tempting to immediately
write $U = q\phi_{\rm image} = -(1/4\pi\varepsilon_0)(q^2/2d)$ [WRONG!], but this would be incorrect. The reason is that the
potential $\phi_{\rm image}$ is not independent of q, but is actually induced by this charge. This is why the correct
approach is to calculate U from Eq. (1.61), with just one term:
$$U = \frac{1}{2}q\phi_{\rm image} = -\frac{1}{4\pi\varepsilon_0}\frac{q^2}{4d}, \qquad (2.191)$$
giving an energy magnitude twice smaller than the wrong result cited above. To double-check Eq. (191), and also get a
better feeling of the factor ½ that distinguishes it from the wrong guess, we can calculate U as the
integral of the force exerted on the charge by the conductor’s surface charge (i.e., in our formalism, by
the image charge):
$$U = -\int_{\infty}^{d} F(z)\,dz = \frac{1}{4\pi\varepsilon_0}\int_{\infty}^{d}\frac{q^2}{(2z)^2}\,dz = -\frac{1}{4\pi\varepsilon_0}\frac{q^2}{4d}. \qquad (2.192)$$
This calculation clearly accounts for the gradual build-up of the force F, as the real charge is being
brought from afar (where we have opted for U = 0) toward the surface.
This result has several important applications. For example, let us plot the electrostatic energy U
of an electron, i.e. a particle with charge q = –e, near a metallic surface, as a function of d. For that, we
may use Eq. (191) until our macroscopic model (1) becomes invalid, and U transitions to some negative
constant value ($-\psi$) inside the conductor – see Fig. 27a. Since our calculation was for an electron with
zero potential energy at infinity, at relatively low temperatures, $k_BT \ll \psi$, electrons in metals may
occupy only the states with energies below $-\psi$ (the so-called Fermi level64). The positive constant $\psi$ is
called the workfunction because it describes the smallest work needed to remove the electron from a
metal. As was discussed in Sec. 1, in good metals the electric field screening takes place at interatomic
distances $a_0 \sim 10^{-10}$ m. Plugging $d = 1\times10^{-10}$ m and $q = -e \approx -1.6\times10^{-19}$ C into Eq. (191), we get
$\psi \approx 5.8\times10^{-19}$ J $\approx$ 3.6 eV. This crude estimate is in surprisingly good agreement with the experimental values
of the workfunction, ranging between 4 and 5 eV for most metals.65
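The estimate above, and the equivalence of Eqs. (191) and (192), may be reproduced with a few lines of Python. In this hedged sketch, the function names and the substitution z = d/t (used to map the infinite integration range onto a finite one) are choices of this example; the constants are the standard SI values of $\varepsilon_0$ and e.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
E_CHARGE = 1.602176634e-19   # elementary charge, C

def image_energy(q, d):
    """Interaction energy (2.191) of a charge q at distance d from a conducting plane."""
    return -q**2 / (4 * math.pi * EPS0 * 4 * d)

def energy_from_force(q, d, steps=1000):
    """Eq. (2.192): the image force integrated from infinity to d, with z = d/t, 0 < t <= 1."""
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps
        z = d / t
        force = -q**2 / (4 * math.pi * EPS0 * (2 * z) ** 2)   # attraction to the image
        total += force * (d / t**2) / steps                   # |dz| = (d / t^2) dt
    return total

d = 1e-10  # m, the interatomic-scale distance used in the estimate
print(image_energy(-E_CHARGE, d), energy_from_force(-E_CHARGE, d))
print(-image_energy(-E_CHARGE, d) / E_CHARGE)   # workfunction estimate in eV, about 3.6
```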
62 See, e.g., M. Keller et al., Appl. Phys. Lett. 69, 1804 (1996) ; F. Stein et al., Metrologia 54, 1 (2017).
63 J. Brun-Pickard et al., Phys. Rev. X 6, 041051 (2016).
64 More discussion of these states may be found in SM Secs. 3.3 and 6.3.
65 More discussion of the workfunction, and its effect on the electrons’ kinetics, is given in SM Sec. 6.3.
[Fig. 2.27. (a) The origin of the workfunction, and (b) the field emission of electrons (schematically). Both panels show U as a function of d.]
Next, let us consider the effect of an additional uniform external electric field $E_0$ applied
normally to a metallic surface, on this potential profile. For that, we may add the potential energy that the
field gives to the electron at distance d from the surface, $U_{\rm ext} = -eE_0d$, to that created by the image
charge. (As we know from Eq. (1.53), since the field $E_0$ is independent of the electron’s position, its
recalculation into the potential energy does not require the coefficient ½.) As a result, the potential
energy of an electron near the surface becomes
$$U(d) = -eE_0d - \frac{1}{4\pi\varepsilon_0}\frac{e^2}{4d}, \quad \text{for } d \gg a_0, \qquad (2.193)$$
with a similar crossover to $U = -\psi$ inside the conductor – see Fig. 27b. One can see that at the
appropriate sign, and a sufficient magnitude of the applied field, it lowers the potential barrier that
prevents electrons from leaving the conductor. At $E_0 \sim \psi/ea_0$ (for metals, $\sim10^{10}$ V/m), this suppression
becomes so strong that electrons with energies at, and just below the Fermi level start quantum-
mechanical tunneling through the remaining thin barrier. This is the field electron emission (or just
“field emission”) effect, which is used in vacuum electronics to provide efficient cathodes that do not
require heating to high temperatures.66
Returning to the basic electrostatics, let us find some other conductor geometries where the
method of charge images may be effectively applied. First, let us consider a right-angle corner (Fig.
28a). Reflecting the initial charge in the vertical plane, we get the image shown in the top left corner of
that panel. This image makes the boundary condition $\phi$ = const satisfied on the vertical surface of the
corner. However, for the same to be true on the horizontal surface, we have to reflect both the initial
charge and the image charge in the horizontal plane, flipping their signs. The final configuration of four
charges, shown in Fig. 28a, satisfies all boundary conditions. The resulting potential distribution may be
readily written as an evident generalization of Eq. (185). From it, the electric field and electric charge
distributions, and the potential energy and forces acting on the charge may be calculated exactly as
above – an easy exercise left for the reader.
Next, consider a corner with the angle $\pi/4$ (Fig. 28b). Here we need to repeat the reflection
operation not two but four times before we arrive at the final pattern of eight positive and negative
charges. (Any attempt to continue this process would lead to overlap with the already existing charges.)
66 The practical use of such “cold” cathodes is affected by the fact that, as it follows from our discussion in Sec. 4,
any nanoscale irregularity of a conducting surface (a protrusion, an atomic cluster, or even a single “adatom”
stuck to it) may cause a strong increase of the local field well above the applied uniform field $E_0$, making the
electron emission reproducibility and stability in time significant challenges. In addition, the impact-ionization
effects may lead to avalanche-type electric breakdown at dc fields as low as $\sim3\times10^{6}$ V/m.
This reasoning may be readily extended to corners with angles $\pi/n$, with any integer n, which require
2n charges (including the initial one) to satisfy all the boundary conditions.
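The 2n-charge rule may be illustrated by a short Python sketch. Its conventions are assumptions of this example rather than notation of these notes: the corner is formed by two half-planes at the polar angles 0 and $\pi/n$, the real charge sits at the polar position (rho, alpha) with 0 < alpha < $\pi/n$, and the potential is measured in units of $1/4\pi\varepsilon_0$ within the plane containing all the charges.

```python
import math

def wedge_images(n, rho, alpha, q=1.0):
    """Return 2n charges (q_i, x_i, y_i): the real one (first) plus its 2n - 1 images."""
    charges = []
    for k in range(n):
        base = 2 * math.pi * k / n
        # Each rotation by 2*pi/n carries a copy +q at base + alpha and -q at base - alpha
        for sign, angle in ((+1, base + alpha), (-1, base - alpha)):
            charges.append((sign * q, rho * math.cos(angle), rho * math.sin(angle)))
    return charges

def potential(charges, x, y):
    """Potential (in units of 1/(4*pi*eps0)) at the point (x, y)."""
    return sum(qi / math.hypot(x - xi, y - yi) for qi, xi, yi in charges)

n = 4                                   # corner angle pi/4, as in Fig. 28b
charges = wedge_images(n, rho=1.0, alpha=0.3)
print(len(charges))                     # 8 charges
print(potential(charges, 0.7, 0.0))     # on the plane at angle 0: zero up to rounding
s, beta = 1.3, math.pi / n
print(potential(charges, s * math.cos(beta), s * math.sin(beta)))  # on the plane at pi/n
```

Setting n = 2 in the same sketch gives the four charges of the right-angle corner.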
[Fig. 2.28. The charge images for (a, b) the corners with angles $\pi/2$ and $\pi/4$, (c) a plane capacitor,
and (d) a rectangular box; (e) typical equipotential surfaces for the last system.]
Some configurations require an infinite number of images but are still tractable. The most
important of them is a system of two parallel conducting surfaces, i.e. an unbiased plane capacitor of
infinite area (Fig. 28c). Here the repeated reflection leads to an infinite system of charges $\pm q$ at points
$$x_j^{\pm} = 2aj \pm d, \qquad (2.194)$$
where d (with 0 < d < a) is the position of the initial charge, and j is an arbitrary integer. The resulting
infinite sum for the potential of the real charge q, created by the field of its images,
$$\phi(d) = \frac{1}{4\pi\varepsilon_0}\left[-\frac{q}{2d} + \sum_{j \neq 0}\left(\frac{q}{|d - x_j^{+}|} - \frac{q}{|d - x_j^{-}|}\right)\right] = -\frac{q}{4\pi\varepsilon_0}\left[\frac{1}{2d} + \frac{d^2}{a^3}\sum_{j=1}^{\infty}\frac{1}{j\left[j^2 - (d/a)^2\right]}\right], \qquad (2.195)$$
converges (in its last form) very fast. For example, the exact value, $\phi(a/2) = -2\ln 2\,(q/4\pi\varepsilon_0 a)$, differs
by less than 5% from the approximation using just the first term of the sum.
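The fast convergence of the last form of Eq. (195) is easy to see numerically. In the following minimal Python sketch, the function name `image_potential` and the number of retained terms are assumptions of this example, and the potential is expressed in units of $1/4\pi\varepsilon_0$.

```python
import math

def image_potential(d, a, terms=10000, q=1.0):
    """Last form of Eq. (2.195), in units of 1/(4*pi*eps0), truncated after `terms` terms."""
    x = d / a
    series = math.fsum(1.0 / (j * (j**2 - x**2)) for j in range(1, terms + 1))
    return -q * (1.0 / (2 * d) + (d**2 / a**3) * series)

a = 1.0
exact = -2 * math.log(2) / a                        # the value quoted for d = a/2
print(image_potential(a / 2, a), exact)
print(image_potential(a / 2, a, terms=1) / exact)   # one-term approximation, about 0.96
```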
The same method may be applied to 2D (cylindrical) and 3D rectangular conducting boxes that
require, respectively, 2D or 3D infinite rectangular lattices of images; for example in a 3D box with
sides a, b, and c, charges $\pm q$ are located at points (Fig. 28d)
$$\mathbf{r}_{jkl}^{\pm} = 2j\mathbf{a} + 2k\mathbf{b} + 2l\mathbf{c} \pm \mathbf{r}', \qquad (2.196)$$
where r’ is the location of the initial (real) charge, and j, k, and l are arbitrary integers. Figure 28e shows
a typical result of the summation of the potentials of this charge set, including the real one, in a 2D box
(within the plane of the real charge). One can see that the equipotential surfaces, concentric near the
charge, are naturally leaning along the conducting walls of the box, which have to be equipotential.
Even more surprisingly, the image charge method works very efficiently not only for rectilinear
geometries but also for spherical ones. Indeed, let us consider a point charge q at distance d from the
center of a conducting, grounded sphere of radius R (Fig. 29a), and try to satisfy the boundary condition
$\phi = 0$ for the electrostatic potential on the sphere’s surface using an imaginary charge q’ located at some
point beyond the surface, i.e. inside the sphere.
[Fig. 2.29. Method of charge images for a conducting sphere: (a) the idea, and (b) the resulting potential distribution
in the central plane containing the charge, for the particular case d = 2R.]
From the problem’s symmetry, it is clear that the point should be at the line passing through the
real charge and the sphere’s center, at some distance d’ from the center. Then the total potential created
by the two charges at an arbitrary point of free space, i.e. at $r \ge R$ (Fig. 29a) is
$$\phi(r, \theta) = \frac{1}{4\pi\varepsilon_0}\left[\frac{q}{\left(r^2 + d^2 - 2rd\cos\theta\right)^{1/2}} + \frac{q'}{\left(r^2 + d'^2 - 2rd'\cos\theta\right)^{1/2}}\right]. \qquad (2.197)$$
This expression shows that we can make the two involved fractions equal and opposite at all points on
the sphere’s surface (i.e. for any $\theta$ at r = R) if we take67
$$d' = \frac{R^2}{d}, \qquad q' = -\frac{R}{d}\,q. \qquad (2.198)$$
Since the solution of any Poisson boundary problem is unique, Eqs. (197) and (198) give us the final
solution for this problem. Fig. 29b shows a typical equipotential pattern following from this solution. It
may be surprising how formulas that simple may describe such an elaborate field distribution.
67 In geometry, such points with $dd' = R^2$, are referred to as the result of mutual inversion in a sphere of radius R.
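The claim that the choice (198) nullifies the potential (197) on the whole surface of the sphere may be verified directly. The short Python sketch below does this for the case d = 2R; the function name and the default parameter values are assumptions of this example, and the potential is expressed in units of $1/4\pi\varepsilon_0$.

```python
import math

def sphere_potential(r, theta, q=1.0, d=2.0, R=1.0):
    """Eq. (2.197) with the image parameters (2.198), in units of 1/(4*pi*eps0)."""
    d_img = R**2 / d          # image position d'
    q_img = -q * R / d        # image charge q'
    real = q / math.sqrt(r**2 + d**2 - 2 * r * d * math.cos(theta))
    image = q_img / math.sqrt(r**2 + d_img**2 - 2 * r * d_img * math.cos(theta))
    return real + image

# On the surface r = R, the potential vanishes (up to rounding) for any polar angle
for theta in (0.0, 0.5, 1.0, 2.0, math.pi):
    print(theta, sphere_potential(1.0, theta))
```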
Now we can calculate the total charge Q on the grounded sphere’s surface, induced by the
external charge q. We could do this, as we have done for the conducting plane, by the brute-force
integration of the surface charge density $\sigma = -\varepsilon_0\,\partial\phi/\partial r|_{r=R}$. It is more elegant, however, to use the
following Gauss law argument. Equality (197) is valid (at $r \ge R$) regardless of whether we are dealing
with our real problem (charge q and the conducting sphere) or with the equivalent charge configuration
– with the point charges q and q’, but no sphere at all. Hence, according to Eq. (1.16), the Gaussian
integral over a surface with radius r = R + 0, and the total charge inside the sphere should be also the
same. Hence we immediately get
$$Q = q' = -\frac{R}{d}\,q. \qquad (2.199)$$
A similar argumentation may be used to calculate the charge-to-sphere interaction force:
$$F = qE_{\rm image}(d) = \frac{qq'}{4\pi\varepsilon_0(d - d')^2} = -\frac{q^2}{4\pi\varepsilon_0}\frac{R}{d}\frac{1}{\left(d - R^2/d\right)^2} = -\frac{q^2}{4\pi\varepsilon_0}\frac{Rd}{\left(d^2 - R^2\right)^2}. \qquad (2.200)$$
(Note that this expression is legitimate only at d > R.) At large distances, d >> R, this attractive force
decreases as $1/d^3$. This unusual dependence arises because, as Eq. (199) specifies, the induced charge of
the sphere, responsible for the force, is not constant but decreases as 1/d. In the next chapter, we will see
that such force is also typical for the interaction between a point charge and a dipole.
All previous formulas were for a sphere that is grounded to keep its potential equal to zero. But
what if we keep the sphere galvanically insulated, so its net charge is fixed, for example, equal to zero?
Instead of solving this problem from scratch, let us use (again!) the almighty linear superposition
principle. For that, we may add to the sphere of the previous problem an additional charge, equal to –Q = –q’,
and argue that this addition gives, at all points, an additional, spherically symmetric potential
that does not depend on the potential induced by the external charge q, and was calculated in Sec. 1.2 –
see Eq. (1.19). For the interaction force, such addition yields
$$F = \frac{qq'}{4\pi\varepsilon_0(d - d')^2} - \frac{qq'}{4\pi\varepsilon_0 d^2} = -\frac{q^2}{4\pi\varepsilon_0}\left[\frac{Rd}{\left(d^2 - R^2\right)^2} - \frac{R}{d^3}\right]. \qquad (2.201)$$
At large distances, the two terms proportional to $1/d^3$ cancel each other, giving $F \propto 1/d^5$, so the
potential energy of such interaction behaves as $U \propto 1/d^4$. Such a rapid force decay is due to the fact that
the field of the uncharged sphere is equivalent to that of two (equal and opposite) induced charges +q’
and –q’, and the distance between them ($d' = R^2/d$) tends to zero at $d \to \infty$.
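The two asymptotic laws may be checked with a few lines of Python. In this sketch the function names are assumptions of this example, forces are measured in units of $1/4\pi\varepsilon_0$, and negative values correspond to attraction. The leading large-distance terms used for comparison, $-q^2R/d^3$ and $-2q^2R^3/d^5$, follow from expanding Eqs. (200) and (201) in the small ratio $R^2/d^2$.

```python
def force_grounded(d, R=1.0, q=1.0):
    """Eq. (2.200), in units of 1/(4*pi*eps0); valid only for d > R."""
    return -q**2 * R * d / (d**2 - R**2) ** 2

def force_insulated(d, R=1.0, q=1.0):
    """Eq. (2.201): the grounded-sphere force plus the repulsion by the charge -q' at the center."""
    return force_grounded(d, R, q) + q**2 * R / d**3

for d in (10.0, 100.0, 1000.0):
    # Ratios to the leading terms -R/d^3 and -2R^3/d^5 (for q = R = 1); both tend to 1
    print(d, force_grounded(d) * d**3 / -1.0, force_insulated(d) * d**5 / -2.0)
```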
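$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\sum_j q_j G(\mathbf{r}, \mathbf{r}_j) \to \frac{1}{4\pi\varepsilon_0}\int\rho(\mathbf{r}')G(\mathbf{r}, \mathbf{r}')\,d^3r'. \qquad (2.203)$$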
The function G(r, r’) is called the (spatial) Green’s function – a notion that is very fruitful and hence popular
in all fields of physics.68 Evidently, as Eq. (1.35) shows, in the unlimited free space
$$G(\mathbf{r}, \mathbf{r}') = \frac{1}{|\mathbf{r} - \mathbf{r}'|}, \qquad (2.204)$$
i.e. the Green’s function depends only on one scalar argument – the distance between the field-
observation point r and the field-source (charge) point r’. However, as soon as there are conductors
around, the situation changes. In this course, I will only discuss Green’s functions defined to vanish as
soon as the radius-vector r points to the surface (S) of any conductor:69
$$G(\mathbf{r}, \mathbf{r}')\big|_{\mathbf{r} \in S} = 0. \qquad (2.205)$$
With this definition, it is straightforward to deduce the Green’s functions for the solutions of the
last section’s problems in which conductors were grounded, i.e. had potential = 0. For example, for a
semi-space z 0 limited by a grounded conducting plane z = 0 (Fig. 26), Eq. (185) yields
1 1
G , with ρ" ρ' and z" z' , (2.206)
r r' r r"
where is the 2D radius vector. We see that in the presence of conductors (and, as we will see later, any
other polarizable media), Green’s function may depend not only on the difference r – r’, but on each of
these two arguments in a specific way.
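The two defining properties of the half-space Green’s function (2.206), its vanishing on the plane z = 0 and its symmetry with respect to the swap of r and r’, are easy to verify numerically. Below is a minimal sketch; the function names are illustrative, and points are plain (x, y, z) tuples.

```python
import math


def green_free(r, rp):
    """Free-space Green's function, Eq. (2.204): 1 / |r - r'|."""
    return 1.0 / math.dist(r, rp)


def green_half_space(r, rp):
    """Eq. (2.206) for the half-space z >= 0 above a grounded plane z = 0:
    the free-space term minus the term of the mirror-image point r''."""
    x, y, z = rp
    image = (x, y, -z)  # rho'' = rho', z'' = -z'
    return green_free(r, rp) - green_free(r, image)
```

Evaluating `green_half_space` at any observation point with z = 0 returns zero to rounding accuracy, and swapping the two arguments leaves the value unchanged, although the function is no longer a function of r – r’ alone.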
So far, this is just re-naming our old results. The really non-trivial result of the Green’s function
formalism in electrostatics is that, somewhat counter-intuitively, the knowledge of this function for a
system with grounded conductors (Fig. 30a) enables the calculation of the field created by voltage-
biased conductors (Fig. 30b), with the same geometry. To show this, let us use the so-called Green’s
theorem of the vector calculus.70 The theorem states that for any two scalar, differentiable functions f(r)
and g(r), and any volume V,
$$\int_V \left(f\nabla^2 g - g\nabla^2 f\right)d^3r = \oint_S \left(f\nabla g - g\nabla f\right)_n d^2r, \tag{2.207}$$
where S is the surface limiting the volume. Applying the theorem to the electrostatic potential φ(r) and
the Green’s function G (also considered as a function of r), let us use the Poisson equation (1.41) to
replace ∇²φ with (–ρ/ε₀), and notice that G, considered as a function of r, obeys the Poisson equation
with the δ-functional source:
$$\nabla^2 G(\mathbf{r},\mathbf{r}') = -4\pi\,\delta(\mathbf{r}-\mathbf{r}'). \tag{2.208}$$
68 See, e.g., CM Sec. 5.1, QM Secs. 2.2 and 7.4, and SM Sec. 5.5. Note that the numerical coefficient in Eq. (202)
(and hence all resulting formulas) is the matter of convention; this choice does not affect the final results.
69 G so defined is sometimes called the Dirichlet function.
70 See, e.g., MA Eq. (12.3). Actually, this theorem is a ready corollary of the better-known divergence (“Gauss”)
theorem, MA Eq. (12.2).
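(Indeed, according to its definition (202), this function may be formally considered as the field of a
point charge q = 4πε₀.) Now swapping the notation of the radius-vectors, r ↔ r’, and using the Green’s
function symmetry, G(r, r’) = G(r’, r),71 we get
$$-4\pi\phi(\mathbf{r}) + \frac{1}{\varepsilon_0}\int_V \rho(\mathbf{r}')\,G(\mathbf{r},\mathbf{r}')\,d^3r' = \oint_S\left[\phi(\mathbf{r}')\frac{\partial G(\mathbf{r},\mathbf{r}')}{\partial n'} - G(\mathbf{r},\mathbf{r}')\frac{\partial \phi(\mathbf{r}')}{\partial n'}\right]d^2r'. \tag{2.209}$$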
Fig. 2.30. Green’s function method allows the solution of a simpler boundary problem (a) to be
used for the solution of a more complex problem (b), for the same conductor geometry.
Let us apply this relation to the volume V of free space between the conductors, and the
boundary S drawn immediately outside of their surfaces. In this case, by its definition, Green’s function
G(r, r’) vanishes at the conductor surface, i.e. at r ∈ S – see Eq. (205). Now changing the sign of ∂n’ (so
it would be the outer normal for conductors, rather than free space volume V), dividing all terms by 4π,
and partitioning the total surface S into the parts (numbered by index k) corresponding to different
conductors (possibly, kept at different potentials φk), we finally arrive at the famous result:72
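$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int_V \rho(\mathbf{r}')\,G(\mathbf{r},\mathbf{r}')\,d^3r' + \frac{1}{4\pi}\sum_k \phi_k \oint_{S_k}\frac{\partial G(\mathbf{r},\mathbf{r}')}{\partial n'}\,d^2r'. \tag{2.210}$$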
While the first term on the right-hand side of this relation is a direct and evident expression of
the superposition principle, given by Eq. (203), the second term is highly non-trivial: it describes the
effect of conductors with arbitrary potentials φk (Fig. 30b), using the Green’s function calculated for the
similar system with grounded conductors, i.e. with all φk = 0 (Fig. 30a). Let me emphasize that since our
volume V excludes conductors, the first term on the right-hand side of Eq. (210) includes only the stand-
alone charges in the system (in Fig. 30, marked q1, q2, etc.), but not the surface charges of the conductors
– which are taken into account, indirectly, by the second term.
In order to illustrate what a powerful tool Eq. (210) is, let us use it to calculate the electrostatic
field in two systems. In the first of them, a plane, circular, conducting disk of radius R, separated with a
very thin cut from the remaining conducting plane, is biased with potential φ = V, while the rest of the
plane is grounded – see Fig. 31.
71 This symmetry, evident for the particular cases (204) and (206), may be readily proved for the general case by
applying Eq. (207) to the functions f(r) ≡ G(r, r’) and g(r) ≡ G(r, r”). With this substitution, the left-hand side of
that equality becomes equal to –4π[G(r”, r’) – G(r’, r”)], while the right-hand side is zero, due to Eq. (205).
72 In some textbooks, the sign before the surface integral is negative, because their authors use the outer normal to
the free-space region V rather than to the regions occupied by conductors, as I do.
Fig. 2.31. A voltage-biased conducting disk separated from the rest of a conducting plane.
If the width of the gap between the disk and the rest of the plane is negligible, we may apply Eq.
(210) without stand-alone charges, ρ(r’) = 0, and the Green’s function for the uncut plane – see Eq.
(206).73 In the cylindrical coordinates, with the origin at the disk’s center (Fig. 31), the function is
$$G(\mathbf{r},\mathbf{r}') = \frac{1}{\left[\rho^2+\rho'^2-2\rho\rho'\cos(\varphi-\varphi')+(z-z')^2\right]^{1/2}} - \frac{1}{\left[\rho^2+\rho'^2-2\rho\rho'\cos(\varphi-\varphi')+(z+z')^2\right]^{1/2}}. \tag{2.211}$$
(The sum of the first three terms under each square root in Eq. (211) is just the squared distance between
the horizontal projections ρ and ρ’ of the vectors r and r’ (or r”) correspondingly, while the last terms
are the squares of their vertical displacements.)
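Now we can readily calculate the derivative participating in Eq. (210), for z ≥ 0:
$$\left.\frac{\partial G}{\partial n'}\right|_S = \left.\frac{\partial G}{\partial z'}\right|_{z'=+0} = \frac{2z}{\left[\rho^2+\rho'^2-2\rho\rho'\cos(\varphi-\varphi')+z^2\right]^{3/2}}. \tag{2.212}$$
Due to the axial symmetry of the system, we may take ϕ for zero. With this, Eqs. (210) and (212) yield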
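$$\phi = \frac{V}{4\pi}\oint_S \frac{\partial G(\mathbf{r},\mathbf{r}')}{\partial n'}\,d^2r' = \frac{Vz}{2\pi}\int_0^{2\pi}d\varphi'\int_0^R \frac{\rho'\,d\rho'}{\left(\rho^2+\rho'^2-2\rho\rho'\cos\varphi'+z^2\right)^{3/2}}. \tag{2.213}$$
This integral is not overly pleasing, but may be readily worked out at least for points on the symmetry
axis (ρ = 0, z ≥ 0):74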
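$$\phi = Vz\int_0^R \frac{\rho'\,d\rho'}{\left(\rho'^2+z^2\right)^{3/2}} = \frac{V}{2}\int_0^{R^2/z^2}\frac{d\xi}{(\xi+1)^{3/2}} = V\left[1-\frac{z}{\left(R^2+z^2\right)^{1/2}}\right]. \tag{2.214}$$
This result shows that if z → 0, the potential tends to V (as it should), while at z >> R,
$$\phi \rightarrow V\frac{R^2}{2z^2}. \tag{2.215}$$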
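Since the integral (2.213) is awkward analytically away from the axis, it is natural to evaluate it numerically. The sketch below is an illustration only: it uses a plain midpoint rule with default grid sizes chosen here, and works in units where R = V = 1 unless other values are passed. On the axis it can be compared with the closed form (2.214).

```python
import math


def disk_potential(rho, z, R=1.0, V=1.0, n_rho=400, n_phi=200):
    """Midpoint-rule estimate of the double integral in Eq. (2.213), for z > 0."""
    d_rho = R / n_rho
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_rho):
        rp = (i + 0.5) * d_rho          # rho' at the cell center
        for j in range(n_phi):
            ph = (j + 0.5) * d_phi      # phi' at the cell center
            dist2 = rho * rho + rp * rp - 2 * rho * rp * math.cos(ph) + z * z
            total += rp / dist2 ** 1.5
    return V * z / (2 * math.pi) * total * d_rho * d_phi


def disk_potential_on_axis(z, R=1.0, V=1.0):
    """Closed-form on-axis result, Eq. (2.214)."""
    return V * (1 - z / math.sqrt(R * R + z * z))
```

For instance, `disk_potential(0.0, 0.5)` should agree with `disk_potential_on_axis(0.5)` to within the quadrature error, while `disk_potential(2.0, 0.5)` gives the potential at a point above the grounded part of the plane, where no closed form was derived above. The midpoint rule loses accuracy as z → 0, where the integrand becomes sharply peaked.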
73 Indeed, if all parts of the cut plane are grounded, a narrow cut does not change the field distribution, and hence
the Green’s function, significantly.
74 There is no need to repeat the calculation for z ≤ 0: from the symmetry of the problem, φ(–z) = φ(z).
Now let us use the same Eq. (210) to solve the (in :-)famous problem of the cut sphere (Fig. 32).
Again, if the gap between the two conducting semi-spheres is very thin (t << R), we may use Green’s
function for the grounded (and uncut) sphere. For a particular case r’ = dnz, this function follows from
Eqs. (197)-(198); generalizing the former relation for an arbitrary direction of vector r’, we get
$$G = \frac{1}{\left[r^2+r'^2-2rr'\cos\gamma\right]^{1/2}} - \frac{R/r'}{\left[r^2+(R^2/r')^2-2r(R^2/r')\cos\gamma\right]^{1/2}}, \quad \text{for } r, r' \geq R, \tag{2.216}$$
where γ is the angle between the vectors r and r’, and hence r” – see Fig. 32.
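Fig. 2.32. (Figure not reproduced: a conducting sphere of radius R, cut by a thin gap t into two halves biased at +V/2 and –V/2, with the vectors r, r’, and r”.)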
and plugging it into Eq. (210), we see that the integration is again easy only for the field on the
symmetry axis (where r = znz, and γ = θ’), giving:
$$\phi = \frac{V}{2}\left[1-\frac{z^2-R^2}{z\left(z^2+R^2\right)^{1/2}}\right], \quad \text{for } z \geq R. \tag{2.218}$$
For z → R, this relation yields φ → V/2 (as it should), while for z/R → ∞,
$$\phi \rightarrow \frac{3R^2}{4z^2}V. \tag{2.219}$$
As will be discussed in the next chapter, such a field is typical for an electric dipole.
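Both limits quoted for Eq. (2.218) can be confirmed with a few lines of code. This is a minimal sketch with illustrative function names, written for the positive half of the symmetry axis and in units where R = V = 1 by default.

```python
import math


def cut_sphere_axis_potential(z, R=1.0, V=1.0):
    """Eq. (2.218): potential on the symmetry axis of the cut sphere, for z >= R."""
    return 0.5 * V * (1 - (z * z - R * R) / (z * math.sqrt(z * z + R * R)))


def cut_sphere_far_field(z, R=1.0, V=1.0):
    """Eq. (2.219): the dipole-like asymptote 3 R^2 V / (4 z^2), valid at z >> R."""
    return 3 * R * R * V / (4 * z * z)
```

At z = R the first function returns V/2, and the ratio of the two functions tends to 1 at large z, with a relative difference of the order of R²/z².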
Though software packages offering their automatic numerical solution are abundant nowadays,75 it is
important for every educated physicist to understand “what is under the hood”, at least because most
universal programs exhibit mediocre performance in comparison with custom codes written for
particular problems, and sometimes do not converge at all, especially for fast-changing (say,
exponential) functions. The very brief discussion presented here76 is a (hopefully, useful) fast glance
under the hood, though it is certainly insufficient for professional numerical research work.
The simplest of the numerical approaches to the solution of partial differential equations, such as
the Poisson or the Laplace equations (1.41)-(1.42), is the finite-difference method,77 in which the sought
continuous scalar function f(r), such as the potential φ(r), is represented by its values in discrete points
of a rectangular grid (frequently called mesh) of the corresponding dimensionality – see Fig. 33.
Each partial second derivative of the function is approximated by the formula that readily
follows from linear approximations of the function f and then its partial derivatives – see Fig. 33a:
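$$\frac{\partial^2 f}{\partial r_j^2} \equiv \frac{\partial}{\partial r_j}\left(\frac{\partial f}{\partial r_j}\right) \approx \frac{1}{h}\left(\left.\frac{\partial f}{\partial r_j}\right|_{r_j+h/2} - \left.\frac{\partial f}{\partial r_j}\right|_{r_j-h/2}\right) \approx \frac{1}{h}\left[\frac{f_\rightarrow - f}{h} - \frac{f - f_\leftarrow}{h}\right] = \frac{f_\rightarrow + f_\leftarrow - 2f}{h^2}, \tag{2.220}$$
where f→ ≡ f(rj + h) and f← ≡ f(rj – h). (The relative error of this approximation is of the order of
h⁴∂⁴f/∂rj⁴.) As a result, the action of a 2D Laplace operator on the function f may be approximated as
$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \approx \frac{f_\rightarrow + f_\leftarrow - 2f}{h^2} + \frac{f_\uparrow + f_\downarrow - 2f}{h^2} = \frac{f_\rightarrow + f_\leftarrow + f_\uparrow + f_\downarrow - 4f}{h^2}, \tag{2.221}$$
and of the 3D operator, as
$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} \approx \frac{f_\rightarrow + f_\leftarrow + f_\uparrow + f_\downarrow + f_\nearrow + f_\swarrow - 6f}{h^2}. \tag{2.222}$$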
(The notation used in Eqs. (221)-(222) should be clear from Figs. 33b and 33c, respectively.)
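The five-point approximation (2.221) can be tried on functions whose Laplacian is known. The following minimal sketch (the function name and test functions are chosen here for illustration) applies it to an arbitrary callable f(x, y).

```python
import math


def laplacian_2d(f, x, y, h):
    """Five-point finite-difference approximation of the 2D Laplace operator, Eq. (2.221)."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4 * f(x, y)) / (h * h)


def harmonic(x, y):
    """exp(x) sin(y): a solution of the 2D Laplace equation."""
    return math.exp(x) * math.sin(y)


def paraboloid(x, y):
    """x^2 + y^2: its Laplacian equals 4 everywhere."""
    return x * x + y * y
```

For the quadratic function, the approximation is exact up to rounding, since its fourth derivatives vanish; for the harmonic function, `laplacian_2d(harmonic, 0.3, 0.7, h)` is small and decreases with the step h, until round-off errors take over at very small h.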
As a simple example, let us use this scheme to find the electrostatic potential distribution inside
a cylindrical box with conducting walls and square cross-section a×a, using an extremely coarse mesh
with step h = a/2 (Fig. 34). In this case, our function, the electrostatic potential φ(x, y), equals zero at
the side and bottom walls, and V0 at the top lid, so, according to Eq. (221), the 2D Laplace equation may
be approximated as
$$\frac{0 + 0 + V_0 + 0 - 4\phi}{(a/2)^2} = 0. \tag{2.223}$$
The resulting value for the potential in the center of the box is φ = V0/4.
Fig. 2.34. Numerically solving an internal 2D boundary problem for a conducting, cylindrical box with a square
cross-section, using a very coarse mesh (with h = a/2).
Surprisingly, this is the exact value! This may be proved either by solving this problem by the
variable separation method, just as this has been done for a similar 3D problem in Sec. 5, or just from
the following Green’s-function argument. If all four walls of our 2D volume were biased to the voltage
V0, there would be no electric field in it at all, so the middle-point potential would be equal to V0 as well.
However, from the point of view of Eq. (210) with no bulk charge, ρ(r) = 0, this result may be
legitimately viewed as the linear superposition of the four contributions of the potentials φk = V0 of each
wall. Since for this symmetric geometry, the corresponding geometrical factors are equal, the
contribution of one wall, with φk = 0 on all other walls (as in our current problem), has to equal V0/4.
For a similar 3D problem (a cubic box), with a similar 3D mesh, Eq. (222) yields
$$\frac{0 + 0 + V_0 + 0 + 0 + 0 - 6\phi}{(a/2)^2} = 0, \tag{2.226}$$
so φ = V0/6. Using the same Green’s-function argument, now for six walls of the cube, we see that this
result is also exact! (This fact also follows from our variable-separation result expressed by Eqs. (95)
and (99) with a = b = c.)
Though such exact results should be considered as a happy coincidence rather than the general
law, they still show that numerical methods, even with relatively crude meshes, may be more
computationally efficient than some “analytical” approaches, like the variable separation method with
its infinite-sum results that, in most cases, require computers anyway – at least for the result’s
comprehension and analysis.
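On finer meshes, the set of linear equations following from Eq. (2.221) is usually solved iteratively. The sketch below is one simple possibility, assumed here for illustration (Gauss-Seidel relaxation, with function name, tolerance, and mesh size chosen arbitrarily), applied to the same square box with the top lid at V0 and the other walls grounded.

```python
def solve_box(n=8, v0=1.0, tol=1e-12, max_sweeps=100_000):
    """Gauss-Seidel relaxation of the 2D Laplace equation on a square box a x a,
    discretized per Eq. (2.221) with mesh step h = a / n.
    Returns phi[i][j], with i numbering mesh points along x and j along y."""
    phi = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        phi[i][n] = v0                      # top lid at potential V0; other walls stay at 0
    for _ in range(max_sweeps):
        change = 0.0
        for i in range(1, n):
            for j in range(1, n):
                # Eq. (2.221) set to zero: each value is the average of its 4 neighbors
                new = 0.25 * (phi[i + 1][j] + phi[i - 1][j] + phi[i][j + 1] + phi[i][j - 1])
                change = max(change, abs(new - phi[i][j]))
                phi[i][j] = new
        if change < tol:                    # stop when a full sweep changes nothing noticeable
            break
    return phi
```

With n = 2, the only internal point reproduces Eq. (2.223). For any even n, the value at the central point, `solve_box(n)[n // 2][n // 2]`, converges to V0/4, because the superposition argument given above applies to the discrete equations as well; the values at other mesh points do depend on the mesh step.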
A more powerful (but also much more complex) approach is the finite-element method in which
the discrete point mesh, typically with triangular cells, is (automatically) generated in accordance with
the system geometry.78 Such mesh generators provide higher point concentration near sharp convex
parts of conductor surfaces, where the field concentrates and hence the potential changes faster, thus
ensuring a better accuracy-to-speed tradeoff than the finite-difference methods on a uniform grid. The
price to pay for this improvement is the algorithm’s complexity which makes its adjustments much
harder. Unfortunately, in this series, I do not have time for going into the details of that method and have
to refer the reader to the special literature on this subject.79
2.2. Electric charges QA and QB have been placed on two conducting
concentric spherical shells – see the figure on the right. What is the full
charge of each of the surfaces S1-S4?
2.3. Calculate the mutual capacitance between the terminals of the
lumped-capacitor circuit shown in the figure on the right. Analyze and
interpret the result for major particular cases.
2.4. Calculate the mutual capacitance between the terminals
of the semi-infinite lumped-capacitor circuit shown in the figure on
the right, and find the law of the applied voltage’s decay along the
system. Analyze and interpret the result.
2.5. A system of two thin conducting plates is located over a
ground plane as shown in the figure on the right, where A1 and A2 are
the areas of the indicated plate parts, while d’ and d” are the distances
between them. Neglecting the fringe effects, calculate:
(i) the effective capacitance of each plate, and
(ii) their mutual capacitance.
79 See, e.g., either C. Johnson, Numerical Solution of Partial Differential Equations by the Finite Element
Method, Dover, 2009, or T. Hughes, The Finite Element Method, Dover, 2000.
2.8. Use the Gauss law to calculate the mutual capacitance of the
following two-electrode systems, both with the cross-section shown in Fig. 7
(reproduced on the right):
(i) a conducting sphere in the center of a spherical cavity inside
another conductor, and
(ii) a long conducting round cylinder on the axis of a cylindrical cavity
inside another conductor, i.e. a coaxial cable. (In this case, we speak about the
capacitance per unit length).
Compare the results with those obtained in Sec. 2.2 using the Laplace
equation.
2.11. Using the results for a single thin round disk, obtained in Sec. 4,
consider a system of two such disks at a small distance d << R from each other
– see the figure on the right. In particular, calculate:
(i) the reciprocal capacitance matrix of the system,
2.13. Calculate the mutual capacitance (per unit length) between two similar, long, parallel
wires, each with a round cross-section of radius R, whose axes are separated by distance d > 2R. Explore
and interpret the result in the limits R → 0 and R → d/2.
Hint: You may like to use the 2D orthogonal bipolar coordinates {α, β} defined by the following
relations with the Cartesian coordinates {x, y}:
$$x = a\frac{\sinh\alpha}{\cosh\alpha-\cos\beta}, \quad y = a\frac{\sin\beta}{\cosh\alpha-\cos\beta}, \quad \text{with } -\infty<\alpha<+\infty, \ -\pi<\beta<+\pi.$$
In these coordinates, the Laplace operator is
$$\nabla^2 = \frac{1}{a^2}\left(\cosh\alpha-\cos\beta\right)^2\left(\frac{\partial^2}{\partial\alpha^2}+\frac{\partial^2}{\partial\beta^2}\right).$$
2.14. Formulate 2D electrostatic problems that may be simply solved using each of the following
analytic functions of the complex variable z ≡ x + iy:
(i) w = ln z,
(ii) w = z^(1/2),
(iii) w = z + 1/z,
and solve these problems.
2.15. On each side of a cylindrical volume with a rectangular cross-section a×b, with no electric
charges inside it, the electric field’s component normal to the side’s plane is constant, and also equal and
opposite to that on the opposite side. Calculate the distribution of the electric potential inside the
volume, provided that the magnitude of the normal components on the sides of length b equals E.
Suggest a practicable method to implement such potential distribution.
2.16. Complete the solution of the problem shown in Fig. 12, by calculating the distribution of
the surface charge on the semi-planes. Can you calculate the mutual capacitance between the semi-
planes (per unit length of the system)? If not, can you estimate it?
2.18. A gap of constant width w between two grounded conducting semi-spaces
is closed, from one side, with a conducting plunger biased with voltage V, so that the
cross-section of the system looks like the figure on the right shows. Use the variable
separation method to calculate the distribution of the electrostatic potential within the
gap.
2.20. Solve Problem 17(i) by using the variable separation method, and compare the results.
2.21. Use the variable separation method to calculate the potential distribution above the plane
surface of a conductor, with a strip of width w separated by very thin cuts, and biased with voltage V –
see the figure below.
2.22. The previous problem is now modified: the cut-out and voltage-biased part of the
conducting plane is now not a strip, but a square with side w. Calculate the potential distribution above
the conductor’s surface.
2.23. Each electrode of a large plane capacitor is cut into
parallel long strips of equal width w, with very narrow gaps
between them. These strips are kept at alternating potentials ±V/2, as
shown in the figure on the right. Use the variable separation
method to calculate the electrostatic potential distribution in space,
and explore the limit w << d.
2.24. Complete the cylinder problem started in Sec. 7 (see Fig. 17), for the cases when the top
lid’s voltage is fixed as follows:
(i) V = V0 J1(ξ11ρ/R) sin ϕ, where ξ11 ≈ 3.832 is the first root of the Bessel function J1(ξ);
(ii) V = V0 = const.
For both cases, calculate the electric field at the centers of the lower and upper lids. (For Task
(ii), an answer including series and/or integrals is acceptable.)
2.25. For the infinitely long periodic system sketched in Fig. 21, assuming that t << h, R:
(i) calculate and sketch the electrostatic potential’s distribution inside the system for various
values of the ratio R/h, and
(ii) simplify the results for the limit R/h → 0.
2.27. Use the variable separation method to find the potential distribution inside and outside of a
thin spherical shell of radius R, with a fixed potential distribution on it: φ(R, θ, ϕ) = V0 sin θ cos ϕ.
2.28. A thin spherical shell carries an electric charge with areal density σ = σ0 cos θ. Calculate the
spatial distribution of the electrostatic potential and the electric field, both inside and outside the shell.
2.29. Use the variable separation method to solve the problem already considered in Sec. 10:
calculate the potential distribution both inside and outside of a thin spherical shell of radius R, separated
with a very thin cut along the central plane z = 0 into two halves, with voltage V applied between them –
see Fig. 32. Analyze the solution; in particular, compare the field at the z-axis, for z > R, with Eq. (218).
Hint: You may like to use the following integral of a Legendre polynomial with an odd index l =
1, 3, 5…= 2n – 1:80
80 As a reminder, the double factorial (also called “semifactorial”) operator (!!) is similar to the usual factorial
operator (!), but with the product limited to numbers of the same parity as its argument – in our particular case, of
the odd numbers in the numerator and even numbers in the denominator.
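$$I_n \equiv \int_0^1 P_{2n-1}(\xi)\,d\xi = \frac{1}{n!}\left(\frac{1}{2}\right)\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)\cdots\left(\frac{3}{2}-n\right) \equiv (-1)^{n-1}\frac{(2n-3)!!}{2n\,(2n-2)!!}.$$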
2.30. Calculate, up to terms O(1/r²), the long-range electric field induced by
a split and voltage-biased conducting sphere – similar to that discussed in Sec. 10
(see Fig. 32) and in the previous problem, but with the cut’s plane at an arbitrary
distance d < R from the center – see the figure on the right.
2.31. Calculate the field distribution in the simple electrostatic lens that was the subject of
Problem 1.9, provided that the separation of the two field regions is provided by a thin conducting
membrane, with a round hole of radius R.
Hint: You may like to use the fact that the general axially symmetric solution of the Laplace
equation in the oblate ellipsoidal coordinates (59) may be represented in the following variable-
separation form:
$$\phi = \sum_{n=0}^{\infty}\left[p_n P_n(i\sinh\alpha) + q_n Q_n(i\sinh\alpha)\right]P_n(\cos\beta),$$
where pn and qn are constants, Pn are the Legendre polynomials (2.169), which are sometimes called the
Legendre functions of the first kind, while Qn are the Legendre functions of the second kind (briefly
mentioned, in a different context, in Sec. 2.8) that may be defined by the following recurrence relations:
$$Q_0(\xi) = \frac{1}{2}\ln\frac{1+\xi}{1-\xi}, \quad Q_1(\xi) = P_1(\xi)Q_0(\xi) - 1, \quad Q_{n\geq 2}(\xi) = \frac{2n-1}{n}\,\xi\,Q_{n-1}(\xi) - \frac{n-1}{n}\,Q_{n-2}(\xi).$$
81 Strictly speaking, this statement, implying negligible quantum-mechanical coherence of the tunneling events, is
correct only if the junction transparency is so low that its effective electric resistance is much higher than the
fundamental quantum unit of resistance, RQ ≡ πħ/2e² ≈ 6.5 kΩ (see, e.g., QM Sec. 3.2). However, this condition
is satisfied in most experimental tunnel junctions.
2.33. The system discussed in the previous problem is now
generalized as the figure on the right shows. If the voltage V’ applied between
the two bottom electrodes is sufficiently large, electrons can successively
tunnel through two junctions of this system (called the single-electron
transistor), carrying dc current between these electrodes. Neglecting thermal
excitations, calculate the region of voltages V and V’ where such a current is
fully suppressed (Coulomb-blocked).
2.34. Use the charge image method to calculate the full surface charges induced in the plates of a
very broad, voltage-unbiased plane capacitor of thickness D by a point charge q separated from one of
the electrodes by distance d. Suggest at least one alternative method to obtain the same result.
2.35. Use the charge image method to calculate the potential energy of the electrostatic
interaction between a point charge placed in the center of a spherical cavity that had been carved inside
a grounded conductor, and the cavity’s walls. Looking at the result, could it be obtained in a simpler
way (or ways)?
2.37.* Use the spherical inversion expressed by Eq. (198), to develop an iterative method for a
more and more precise calculation of the mutual capacitance between two similar conducting spheres of
radius R, with their centers separated by distance d > 2R.
2.38.* A conducting sphere of radius R1, carrying an electric charge Q, is
placed inside a spherical cavity of radius R2 > R1, carved inside another bulk
conductor. Calculate the electric force exerted on the sphere if its center is
displaced by a small distance δ << R1, R2 – R1 from that of the cavity – see the
figure on the right.
2.39. Within the simple models of the electric field screening in conductors, discussed in Sec.
2.1, analyze the partial screening of the electric field of a point charge q by a planar conducting film of
constant thickness t << λ, where λ is (depending on charge carrier statistics) either the Debye or the
Thomas-Fermi screening length – see, respectively, Eqs. (8) or (10). Assume that the distance d between
the charge and the film is much larger than t.
2.40. Prove the following expansion of the simplest Green’s function (204) into a series over the
Legendre polynomials:
$$\frac{1}{|\mathbf{r}-\mathbf{r}'|} = \frac{1}{r_>}\sum_{l=0}^{\infty}\left(\frac{r_<}{r_>}\right)^l P_l(\cos\theta),$$
where r> is the largest of the two scalars r ≡ |r| ≥ 0 and r’ ≡ |r’| ≥ 0, while r< is the smallest of them.
2.41. Use the expansion that was the subject of the previous problem to confirm the analysis, in
Sec. 2.9 of the lecture notes, of the system shown in Fig. 29: a grounded conducting sphere of radius R,
and a point charge q located at distance d > R from its center.
2.42. Suggest a convenient definition of the Green’s function for 2D electrostatic problems, and
calculate it for:
(i) the unlimited free space, and
(ii) the free space above a conducting plane.
Use the latter result to re-solve Problem 21.
2.43. A conducting plane is separated into two parts with a very narrow straight cut, and voltage
V is applied between the resulting half-planes – see the figure below. Use the Green’s function method
to find the distribution of the electrostatic potential and the electric field everywhere in the space.
Compare the result with Eq. (83). In hindsight, could the problem be solved in an even simpler way (or
ways)?
2.44. Use the last result of Problem 42 and one of the conformal mappings discussed in Sec. 4 to
find one more solution of Problem 18.
2.47. Solve the 2D boundary problem that was discussed in Sec. 11 (Fig. 34) by using:
(i) the finite difference method with the finer square mesh h = a/3, and
(ii) the variable separation method.
Compare the results at the mesh points, and comment.
Fig. 3.1. Deriving the approximate expression for the electrostatic field of a localized system
of charges at a distant point (r >> r’ ~ a).
Then the positions of all charges of the system satisfy the following condition:
$$r' \ll r. \tag{3.1}$$
Using this condition, we can expand the general expression (1.38) for the electrostatic potential φ(r) of
the system into the Taylor series in small parameter r’. For any function of type f(r – r’), the expansion
may be represented as1
$$f(\mathbf{r}-\mathbf{r}') = f(\mathbf{r}) - \sum_{j=1}^{3} r'_j\frac{\partial f}{\partial r_j}(\mathbf{r}) + \frac{1}{2!}\sum_{j,j'=1}^{3} r'_j r'_{j'}\frac{\partial^2 f}{\partial r_j\,\partial r_{j'}}(\mathbf{r}) - \dots \tag{3.2}$$
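Applying this formula to the fraction 1/|r – r’| in Eq. (1.38) (i.e. essentially to the free-space Green’s
function), we get the so-called multipole expansion of the electrostatic potential:
$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\left[\frac{1}{r}Q + \frac{1}{r^3}\sum_{j=1}^{3} r_j p_j + \frac{1}{2r^5}\sum_{j,j'=1}^{3} r_j r_{j'}\mathcal{Q}_{jj'} + \dots\right], \tag{3.3}$$
whose r-independent parameters are defined as follows:
$$Q \equiv \int\rho(\mathbf{r}')\,d^3r', \quad p_j \equiv \int\rho(\mathbf{r}')\,r'_j\,d^3r', \quad \mathcal{Q}_{jj'} \equiv \int\rho(\mathbf{r}')\left(3r'_j r'_{j'} - r'^2\delta_{jj'}\right)d^3r'. \tag{3.4}$$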
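(Indeed, the two leading terms of the expansion (2) may be rewritten in the vector form f(r) – r’·∇f(r),
and the gradient of such a spherically-symmetric function f(r) = 1/r is just nr df/dr, so
$$\frac{1}{|\mathbf{r}-\mathbf{r}'|} \approx \frac{1}{r} - \mathbf{r}'\cdot\mathbf{n}_r\frac{d}{dr}\left(\frac{1}{r}\right) = \frac{1}{r} + \frac{\mathbf{r}'\cdot\mathbf{r}}{r^3}, \tag{3.5}$$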
immediately giving the two first terms of Eq. (3). The proof of the third, quadrupole term in Eq. (3) is
similar but a bit longer, and is left for the reader’s exercise.)
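The quality of the truncated expansion (3.3) is easy to explore for a set of point charges, for which the integrals (3.4) reduce to sums. The following is a minimal sketch under that assumption; the function names are illustrative, and the units are such that 1/4πε₀ = 1.

```python
import math


def multipole_moments(charges):
    """Eq. (3.4) for point charges given as a list of (q, (x, y, z)):
    returns the net charge Q, the dipole vector p, and the 3x3 quadrupole tensor."""
    Q = sum(q for q, _ in charges)
    p = [sum(q * r[j] for q, r in charges) for j in range(3)]
    quad = [[sum(q * (3 * r[j] * r[k] - (j == k) * sum(c * c for c in r))
                 for q, r in charges)
             for k in range(3)] for j in range(3)]
    return Q, p, quad


def potential_multipole(r, charges):
    """Eq. (3.3) truncated after the quadrupole term (units with 1/(4 pi eps0) = 1)."""
    Q, p, quad = multipole_moments(charges)
    rr = math.sqrt(sum(c * c for c in r))
    dipole = sum(r[j] * p[j] for j in range(3)) / rr ** 3
    quadrupole = sum(r[j] * r[k] * quad[j][k]
                     for j in range(3) for k in range(3)) / (2 * rr ** 5)
    return Q / rr + dipole + quadrupole


def potential_exact(r, charges):
    """Direct Coulomb sum over the point charges, in the same units."""
    return sum(q / math.dist(r, rq) for q, rq in charges)
```

For charges located within a distance a of the origin, the difference between `potential_multipole` and `potential_exact` at a distant point should scale as a³/r⁴, i.e. as the first omitted (octupole) term.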
Evidently, the scalar parameter Q in Eqs. (3)-(4) is just the total charge of the system. The
constants pj may be considered as Cartesian components of the following vector:
$$\mathbf{p} \equiv \int\rho(\mathbf{r}')\,\mathbf{r}'\,d^3r', \tag{3.6}$$
called the system’s electric dipole moment, and Qjj’ are Cartesian elements of a tensor – system’s electric
quadrupole moment. If Q ≠ 0, all higher terms on the right-hand side of Eq. (3), at large distances (1),
are just small corrections to the first one, and in many cases may be ignored. However, the net charge of
many systems is exactly zero, the most important examples being neutral atoms and molecules. For such
neutral systems, the second (dipole) term in Eq. (3) is, most frequently, the leading one. Such systems are
called electric dipoles. Due to their importance, let us rewrite the expression for the dipole term in three
other, mathematically equivalent forms:
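$$\phi_d = \frac{1}{4\pi\varepsilon_0}\frac{\mathbf{r}\cdot\mathbf{p}}{r^3} \equiv \frac{1}{4\pi\varepsilon_0}\frac{p\cos\theta}{r^2} \equiv \frac{1}{4\pi\varepsilon_0}\frac{pz}{\left(x^2+y^2+z^2\right)^{3/2}}, \tag{3.7}$$
that are more convenient for some applications. Here θ is the angle between the vectors p and r, and in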
the last (Cartesian) representation, the z-axis is directed along the vector p. Fig. 2a shows equipotential
surfaces of the dipole field – or rather their cross-sections by any plane in which the vector p resides.
Fig. 3.2. (a) The equipotential surfaces and (b) the electric field lines of a dipole. (Panel (b)
adapted from http://en.wikipedia.org/wiki/Dipole under the GNU Free Documentation License.)
The simplest example of a system whose field, at large distances, approaches the dipole field (7),
is two equal but opposite point charges (“poles”), +q and –q, with the radius vectors, respectively, r+
and r–:
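$$\rho(\mathbf{r}) = (+q)\,\delta(\mathbf{r}-\mathbf{r}_+) + (-q)\,\delta(\mathbf{r}-\mathbf{r}_-). \tag{3.8}$$
For this system (sometimes called the physical dipole), Eq. (4) yields
$$\mathbf{p} = (+q)\mathbf{r}_+ + (-q)\mathbf{r}_- = q(\mathbf{r}_+ - \mathbf{r}_-) = q\mathbf{a}, \tag{3.9}$$
where a is the vector connecting the points r– and r+. Note that in this case (and indeed for all systems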
with Q = 0), the dipole moment does not depend on the choice of the reference frame’s origin.
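The independence of the dipole moment of the origin’s choice at Q = 0 follows directly from the definition: shifting the origin by a vector r0 changes p by –Qr0. A minimal sketch (with an illustrative function name and point charges given as (q, (x, y, z)) tuples) shows this explicitly.

```python
def dipole_moment(charges, origin=(0.0, 0.0, 0.0)):
    """Dipole moment of point charges, i.e. the sum of q * (r - origin),
    the discrete analog of Eq. (3.6) with a shifted reference frame origin."""
    return tuple(sum(q * (r[j] - origin[j]) for q, r in charges) for j in range(3))
```

For a physical dipole, e.g. `[(2.0, (1.0, 2.0, 3.0)), (-2.0, (1.0, 2.0, 2.5))]`, the result equals q**a** = (0, 0, 1) for any `origin`, while for a system with a non-zero net charge the result changes with the origin.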
A less trivial example of a dipole is a conducting sphere of radius R in a uniform external electric
field E0. As a reminder, its field was calculated in Sec. 2.8, and its result is expressed by Eq. (2.176).
The first term in the parentheses of that relation describes just the external field (2.173), so the field of
the sphere itself (i.e. that of the surface charge induced by E0) is given by the second term:
$$\phi_s = \frac{E_0R^3}{r^2}\cos\theta. \tag{3.10}$$
Comparing this expression with the second form of Eq. (7), we see that the sphere has an induced dipole
moment
$$\mathbf{p} = 4\pi\varepsilon_0\mathbf{E}_0R^3. \tag{3.11}$$
This is an interesting example of a virtually pure dipole field: at all points outside the sphere (r > R), the
field has neither a quadrupole moment nor any higher moments.
Other examples of dipole fields are given by two more systems discussed in Chapter 2 – see Eqs.
(2.215) and (2.219). Those systems, however, do have higher-order multipole moments, so for them, Eq.
(7) gives only the long-distance approximation.
Now returning to the general properties of the dipole field (7), let us calculate its major
characteristics. First of all, we may use Eq. (7) to calculate the electric field of a dipole:
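$$\mathbf{E}_d = -\nabla\phi_d = -\frac{1}{4\pi\varepsilon_0}\nabla\left(\frac{\mathbf{r}\cdot\mathbf{p}}{r^3}\right) = -\frac{1}{4\pi\varepsilon_0}\nabla\left(\frac{p\cos\theta}{r^2}\right). \tag{3.12}$$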
This differentiation is easiest in the spherical coordinates, using the well-known expression for the
gradient of a scalar function in these coordinates2 and taking the z-axis parallel to the dipole moment p.
From the last form of Eq. (12), we immediately get
$$\mathbf{E}_d = \frac{p}{4\pi\varepsilon_0r^3}\left(2\mathbf{n}_r\cos\theta + \mathbf{n}_\theta\sin\theta\right) \equiv \frac{1}{4\pi\varepsilon_0}\frac{3\mathbf{r}(\mathbf{r}\cdot\mathbf{p}) - \mathbf{p}r^2}{r^5}. \tag{3.13}$$
Fig. 2b above shows the electric field lines given by Eq. (13). The most important features of this result
are a faster drop of the field’s magnitude (Ed ∝ 1/r³, rather than E ∝ 1/r² for a point charge), and the
change of the signs of its radial component as a function of the polar angle θ ∈ [0, π].
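The Cartesian form of Eq. (3.13) may be compared with the exact field of a physical dipole, i.e. of two opposite point charges, at a distance much larger than their separation. The sketch below is an illustration only, with function names chosen here, units such that 1/4πε₀ = 1, and the charges ±q placed on the z-axis at ±a/2.

```python
import math


def dipole_field_cartesian(r, p):
    """Second form of Eq. (3.13): [3 r (r.p) - p r^2] / r^5, with 1/(4 pi eps0) = 1."""
    r2 = sum(c * c for c in r)
    rp = sum(a * b for a, b in zip(r, p))
    return tuple((3 * r[j] * rp - p[j] * r2) / r2 ** 2.5 for j in range(3))


def physical_dipole_field(r, q, a):
    """Exact Coulomb field of charges +q at (0, 0, a/2) and -q at (0, 0, -a/2)."""
    field = [0.0, 0.0, 0.0]
    for sign, zc in ((+1, a / 2), (-1, -a / 2)):
        d = (r[0], r[1], r[2] - zc)                  # vector from the charge to the point
        d3 = math.sqrt(sum(c * c for c in d)) ** 3
        for j in range(3):
            field[j] += sign * q * d[j] / d3
    return tuple(field)
```

With p = (0, 0, qa), the two functions should agree up to relative corrections of the order of (a/r)², which describe the higher multipole moments of the physical dipole.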
Next, let us use Eq. (1.55) to calculate the potential energy of interaction between a dipole and
an external electric field. Assuming that this field does not change much at distances of the order of a
(Fig. 1), we may expand its potential φext(r) into the Taylor series, and keep only two leading terms:
The first term is the potential energy the system would have if it were just a point charge. If the net
charge Q is zero, that term disappears, and the leading contribution is due to the dipole moment:
where at the last step, the spatial dependence of the external field Eext(r) was again neglected. This
dependence cannot, however, be ignored at the calculation of the total force exerted by the field on the
dipole (with Q = 0). Indeed, Eq. (15) shows that if the field is constant, the dipole’s energy is
3 Several calculations of this force, using various models, are described in the QM and SM parts of this series.
independent of its spatial location and hence the net force is zero. However, if the field has a non-zero
gradient, a total force does appear; for a field-independent dipole,
$$\mathbf{F} = -\nabla U = \nabla(\mathbf{p}\cdot\mathbf{E}_{\rm ext}), \tag{3.18}$$
where the derivative has to be taken at the dipole’s position (in our notation, at r = 0). If the dipole that
is being moved in a field retains its magnitude and orientation, then the last formula is equivalent to4
$$\mathbf{F} = (\mathbf{p}\cdot\nabla)\mathbf{E}_{\rm ext}. \tag{3.19}$$
Alternatively, the last expression may be obtained similarly to Eq. (14):
Finally, let me add a note on the so-called coarse-grain model of the dipole. The dipole
approximation explored above is asymptotically correct only at large distances, r >> a. However, for
some applications (including the forthcoming discussion of the molecular field effects in Sec. 3) it is
beneficial to have an expression that might be formally used everywhere in space, though maybe
without exact details at r ~ a, giving the correct result for the space average of the electric field,
$$\bar{\mathbf{E}} \equiv \frac{1}{V}\int_V \mathbf{E}\,d^3r, \tag{3.21}$$
where V is a regularly-shaped volume much larger than a³, for example, a sphere of radius R >> a, with
the dipole at its center. For the field Ed given by Eq. (13), such an average is zero. Indeed, let us
consider the Cartesian components of that vector in a reference frame with the z-axis directed along the
vector p. Due to the axial symmetry of the field, the averages of the components Ex and Ey vanish. Let
us use Eq. (13) to spell out the “vertical” component of the field (parallel to the dipole moment vector):
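$$E_z \equiv \mathbf{E}_d\cdot\frac{\mathbf{p}}{p} = \frac{1}{4\pi\varepsilon_0 r^3}\left(2\,\mathbf{n}_r\cdot\mathbf{p}\,\cos\theta + \mathbf{n}_\theta\cdot\mathbf{p}\,\sin\theta\right) = \frac{p}{4\pi\varepsilon_0 r^3}\left(2\cos^2\theta - \sin^2\theta\right). \tag{3.22}$$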
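Integrating this expression over the whole solid angle Ω = 4π, at fixed r, using a convenient variable
substitution ξ ≡ cos θ, we get
$$\oint_{4\pi} E_z\,d\Omega = 2\pi\int_0^{\pi} E_z\sin\theta\,d\theta = \frac{p}{2\varepsilon_0 r^3}\int_0^{\pi}\left(2\cos^2\theta - \sin^2\theta\right)\sin\theta\,d\theta = \frac{p}{2\varepsilon_0 r^3}\int_{-1}^{+1}\left(3\xi^2 - 1\right)d\xi = 0. \tag{3.23}$$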
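The vanishing of this angular integral is easy to confirm numerically. The following minimal sketch (an illustration only; the function name and the number of integration steps are arbitrary choices) applies a midpoint rule to the angular factor of Eq. (22), with the common prefactor p/4πε₀r³ dropped:

```python
import math

def angular_integral(steps=100_000):
    """Midpoint-rule estimate of the integral of (2cos^2 - sin^2) over the full solid angle.

    The azimuthal integration just gives the factor 2*pi; the polar-angle
    integral of (2cos^2(theta) - sin^2(theta)) * sin(theta) is summed numerically.
    """
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += (2 * math.cos(theta) ** 2 - math.sin(theta) ** 2) * math.sin(theta) * h
    return 2 * math.pi * total

print(angular_integral())  # a number very close to zero
```

The result differs from zero only by the discretization and rounding errors, in agreement with the analytical integral over ξ.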
On the other hand, the exact electric field of an arbitrary charge distribution, with the total
dipole moment p, obeys the following equality:
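$$\int_V \mathbf{E}(\mathbf{r})\,d^3r = -\frac{\mathbf{p}}{3\varepsilon_0} \equiv -\frac{1}{4\pi\varepsilon_0}\,\frac{4\pi}{3}\,\mathbf{p}, \tag{3.24}$$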
where the integration is over any sphere containing all the charges. (A proof of this formula by using
Eqs. (1.9) and (1.22) is left for the reader’s exercise.) The origin of the difference is illustrated in Fig. 3
on the example of a physical dipole, i.e. a system of two equal but opposite charges – see Eqs. (8)-(9).
4 The equivalence may be proved, for example, by using MA Eq. (11.6) with f = p = const and g = E_ext, taking
into account that according to the general Eq. (1.28), ∇×E_ext = 0.
The zero average (23) of the dipole field (13) does not take into account the contribution from the region
between the charges where Eq. (13) is not valid, and the field is directed mostly against the dipole vector
(9).
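[Fig. 3.3 (sketch): a physical dipole of two point charges ±q with the dipole moment p, the field E between
the charges, and the averaging volume V.]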
So, in order to be used as a reasonable coarse-grain model, Eq. (13) may be modified as follows:
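$$\mathbf{E}_{cg} = \frac{1}{4\pi\varepsilon_0}\left[\frac{3\mathbf{r}(\mathbf{r}\cdot\mathbf{p}) - \mathbf{p}r^2}{r^5} - \frac{4\pi}{3}\,\mathbf{p}\,\delta(\mathbf{r})\right], \tag{3.25}$$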
with the average (21) satisfying Eq. (24). Evidently, such a modification does not change the field at
large distances r >> a, i.e. in the region where the expansion (3), and hence Eq. (13), are valid.
where the vector P(r), called the electric polarization, has the physical meaning of the net dipole
moment per unit volume. (Note that by its definition, P(r) is also a “macroscopic” field.)
Now comes a very impressive trick, which is the basis of all the theory of “macroscopic”
electrostatics (and eventually, “macroscopic” electrodynamics). Just as was done at the derivation of Eq.
(5), Eq. (27) may be rewritten in the equivalent form
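$$\phi_d(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int \mathbf{P}(\mathbf{r}')\cdot\nabla'\frac{1}{|\mathbf{r}-\mathbf{r}'|}\,d^3r', \tag{3.28}$$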
where ∇’ means the del operator (in this particular case, the gradient) acting in the “source space” of
vectors r’. The right-hand side of Eq. (28), applied to any volume V limited by a closed surface S, may
be readily integrated by parts to give5
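$$\phi_d(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\oint_S \frac{P_n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^2r' - \frac{1}{4\pi\varepsilon_0}\int_V \frac{\nabla'\cdot\mathbf{P}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^3r'. \tag{3.29}$$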
If the surface does not carry an infinitely dense (δ-functional) sheet of additional dipoles,6 or it is just
very distant, the first term on the right-hand side is negligible. Now comparing the second term with the
basic equation (1.38) for the electric potential, we see that this term may be interpreted as the field of
certain effective electric charges with density
Effective charge density:
$$\rho_{ef} = -\nabla\cdot\mathbf{P}. \tag{3.30}$$
Figure 4 illustrates the physics of this key relation for a cartoon model of a simple multi-dipole
system: a layer of uniformly distributed two-point-charge units oriented normally to the layer’s surface.
(In this case, ∇·P = dP/dx.) One can see that the ρ_ef defined by Eq. (30) may be interpreted as the
density of the uncompensated surface charges of polarized elementary dipoles.
Fig. 3.4. The spatial distributions of the polarization P and effective charges ρ_ef in a layer of
similar elementary dipoles, as functions of x (schematically).
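The one-dimensional version of Eq. (30), ρ_ef = –dP/dx, may be illustrated with a short numerical sketch. The smooth tanh-shaped polarization profile, its parameters, and all function names below are hypothetical choices made only for this illustration; the point is that each face of the layer carries an areal effective charge of magnitude P0, and that the two cancel.

```python
import math

def polarization(x, p0=1.0, half_width=1.0, edge=0.05):
    """Hypothetical smooth profile: P is close to p0 for |x| < half_width, and to 0 outside."""
    return 0.5 * p0 * (math.tanh((x + half_width) / edge)
                       - math.tanh((x - half_width) / edge))

def effective_charge(x, dx=1e-4):
    """rho_ef = -dP/dx, evaluated with a central finite difference."""
    return -(polarization(x + dx) - polarization(x - dx)) / (2 * dx)

def areal_charge(x_min, x_max, steps=20_000):
    """Midpoint-rule integral of rho_ef over the interval [x_min, x_max]."""
    h = (x_max - x_min) / steps
    return sum(effective_charge(x_min + (i + 0.5) * h) for i in range(steps)) * h

left = areal_charge(-2.0, 0.0)   # face where P rises: negative effective charge
right = areal_charge(0.0, 2.0)   # face where P drops: positive effective charge
print(left, right, left + right)
```

With p0 = 1, the two printed areal charges are close to –1 and +1, and their sum is close to zero, as expected for a layer of compensated dipoles.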
Next, from Sec. 1.2, we already know that Eq. (1.38) is equivalent to the inhomogeneous
Maxwell equation (1.27) for the electric field, so the macroscopic electric field of the dipoles (defined as
E_d = –∇φ_d, where φ_d is given by Eq. (27)) obeys a similar equation, with the effective charge density
(30).
Now let us consider a more general case when a system, besides the compensated charges of the
dipoles, also has certain stand-alone charges – not parts of the dipoles already taken into account in the
polarization P. As was discussed in Sec. 1.1, if we average this charge over the inter-point-charge
distances, i.e. approximate it with a continuous “macroscopic” density ρ(r), then its macroscopic
5 To prove this (almost evident) formula strictly, it is sufficient to apply the divergence theorem given by MA Eq.
(12.2), to the vector function f = P(r’)/|r – r’|, in the “source space” of radius-vectors r’.
6 Just like in the case of Eq. (1.9), we may always describe such a dipole sheet using the second term in Eq. (29),
by including a delta-functional part into the polarization distribution P(r’).
electric field also obeys Eq. (1.27), but with the stand-alone charge density. Due to the linear
superposition principle, for the total macroscopic field E of these charges and dipoles, we may write
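$$\nabla\cdot\mathbf{E} = \frac{1}{\varepsilon_0}\left(\rho + \rho_{ef}\right) = \frac{1}{\varepsilon_0}\left(\rho - \nabla\cdot\mathbf{P}\right). \tag{3.31}$$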
This is already the main result of the “macroscopic” electrostatics. However, it is evidently
tempting (and very convenient for applications) to rewrite Eq. (31) in a different form by carrying the
dipole-related term of this equality over to its left-hand side. The resulting formula is called the
macroscopic Maxwell equation for D:
$$\nabla\cdot\mathbf{D} = \rho, \tag{3.32}$$
where D(r) is a new “macroscopic” field, called the electric displacement (in some older texts, “electric
induction”), defined as7
$$\mathbf{D} \equiv \varepsilon_0\mathbf{E} + \mathbf{P}. \tag{3.33}$$
The comparison of Eqs. (32) and (1.27) shows that D (or more strictly, the fraction D/ε₀) may be
interpreted as the “would-be electric field” that would be created by stand-alone charges in the absence
of the dipole medium polarization. It should be distinguished from the E participating in Eqs. (31) and
(33), i.e. from the genuine electric field, if averaged over a spatial scale of the order of the distance
between elementary charges and dipoles.
In order to get an even better gut feeling of the fields E and D, let us first rewrite the
macroscopic Maxwell equation (32) in the integral form. Applying the divergence theorem to an
arbitrary volume V limited by surface S, we get the following macroscopic Gauss law:
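$$\oint_S D_n\,d^2r = \int_V \rho\,d^3r \equiv Q, \tag{3.34}$$
where Q is the net stand-alone charge inside the volume V. Applying this law to a flat pillbox straddling
an interface free of stand-alone charges (see Fig. 5), we get
Boundary condition for D:
$$(D_n)_1 = (D_n)_2, \tag{3.35}$$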
i.e. the normal component of the electric displacement has to be continuous. Note that a similar
statement for the macroscopic electric field E is generally not valid, because the polarization vector P
may have, and typically does have a leap at a sharp interface (say, due to the different polarizability of
7 Note that according to its definition (33), the dimensionality of D in the SI units is different from that of E. In
contrast, in the Gaussian units, the electric displacement is defined as D = E + 4πP, so ∇·D = 4πρ (the relation ρ_ef
= –∇·P remains the same as in the SI units), and the dimensionalities of D and E coincide. This coincidence is a
certain perceptional handicap because it is frequently convenient to consider the scalar components of E as
generalized forces, and those of D as generalized coordinates (see Sec. 5 below), and it is somewhat comforting to
have their dimensionalities different, as they are in the SI units.
the two different dielectrics), providing a surface layer of the effective charges (30) – see again the
example shown in Fig. 4.
Fig. 3.5. Deriving the boundary conditions at an interface between two dielectrics, using a Gauss pillbox
(shown as a solid-line rectangle) and a circulation contour (dashed-line rectangle). Here n and τ are the
unit vectors that are, respectively, normal and tangential to the interface. Note that due to the leap of
polarization, the field lines are generally “refracted” at the interface – see Fig. 11b for an example.
However, we still can make an important statement about the behavior of E at the interface.
Indeed, the macroscopic electric fields defined by Eqs. (29) and (31), are evidently still potential ones,
and hence obey the macroscopic Maxwell equation similar to Eq. (1.28):
$$\nabla\times\mathbf{E} = 0. \tag{3.36}$$
Integrating this equality along a narrow contour stretched along the interface (see the dashed rectangle
in Fig. 5), we get
Boundary condition for E:
$$(E_\tau)_1 = (E_\tau)_2. \tag{3.37}$$
Note that this condition is compatible with (and may be derived from) the continuity of the macroscopic
electrostatic potential φ related to the macroscopic field E by the relation similar to Eq. (1.33), E = –∇φ,
at each point of the interface: φ₁ = φ₂.
In order to see how these boundary conditions work, let us consider the simple problem shown in
Fig. 6. A very broad plane capacitor, with zero voltage between its conducting plates (as may be
enforced, for example, by their connection with an external wire), is partly filled with a material with a
uniform polarization P0,8 oriented normal to the plates. Let us calculate the spatial distribution of the
fields E and D, and also the surface charge density of each conducting plate.
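Fig. 3.6. A simple system whose analysis requires Eq. (35): a plane capacitor with a layer of thickness d1
and polarization P0, and a free-space gap of thickness d2; the z-axis is normal to the plates.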
Due to the symmetry of the system, the vectors E and D are both normal to the plates and do not
depend on the position in the capacitor’s plane, so we can limit the fields’ analysis to the calculation of
their z-components E(z) and D(z). In this case, the Maxwell equation (32) is reduced to dD/dz = 0 inside
each layer (but not at their border!), so within each of them, D is constant – say, some D1 in the layer
with P = P0, and certain D2 in the free-space layer, where P = 0. As a result, according to Eq. (33), the
(macroscopic) electric field inside each layer is also constant:
8 As will be discussed in the next section, this is a good approximation for the so-called electrets, and also for
hard ferroelectrics in not very high electric fields.
$$D_1 = \varepsilon_0 E_1 + P_0, \qquad D_2 = \varepsilon_0 E_2. \tag{3.38}$$
Since the voltage between the plates is zero, we may also require the integral of E, taken along a path
connecting the plates, to vanish. This gives us one more relation:
$$E_1 d_1 + E_2 d_2 = 0. \tag{3.39}$$
Still, the three equations (38)-(39) are insufficient to calculate the four fields in the system (E1,2 and
D1,2). The decisive help comes from the boundary condition (35):
$$D_1 = D_2. \tag{3.40}$$
(Note that it is valid because the layer interface does not carry stand-alone electric charges, even though
it has a polarization surface charge, whose areal density may be calculated by integrating Eq. (30)
across the interface: σ_ef = P0. Note also that in our simple system, Eq. (37) is identically satisfied due to
the system’s symmetry, and hence does not give any additional information.)
Now solving the resulting system of four equations (38)-(40), we readily get
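$$E_1 = -\frac{P_0}{\varepsilon_0}\,\frac{d_2}{d_1+d_2}, \qquad E_2 = \frac{P_0}{\varepsilon_0}\,\frac{d_1}{d_1+d_2}, \qquad D_1 = D_2 = D = P_0\,\frac{d_1}{d_1+d_2}. \tag{3.41}$$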
The areal densities of the electrode surface charges may now be readily calculated by the integration of
Eq. (32) across each surface:
$$\sigma_1 = -\sigma_2 = D = P_0\,\frac{d_1}{d_1+d_2}. \tag{3.42}$$
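As a quick consistency check, the closed-form solution may be plugged back into the initial relations. The sketch below is only an illustration: the numerical values of P0, d1, and d2 are hypothetical, and the function name is an arbitrary choice. It verifies that the fields of Eq. (41) satisfy Eqs. (38)-(40).

```python
EPS0 = 8.854e-12  # electric constant, F/m

def electret_capacitor(p0, d1, d2):
    """Return (E1, E2, D) as given by Eq. (41).

    p0 is the polarization of the layer of thickness d1; d2 is the free-space gap.
    """
    total = d1 + d2
    e1 = -(p0 / EPS0) * d2 / total   # field inside the polarized layer
    e2 = (p0 / EPS0) * d1 / total    # field in the free-space gap
    d = p0 * d1 / total              # displacement, the same in both layers
    return e1, e2, d

# Hypothetical illustration values: P0 in C/m^2, thicknesses in m.
p0, d1, d2 = 1.0e-3, 20e-6, 10e-6
e1, e2, d = electret_capacitor(p0, d1, d2)

print("Eq. (38), polarized layer:", EPS0 * e1 + p0 - d)  # should be ~0
print("Eq. (38), free-space gap: ", EPS0 * e2 - d)       # should be ~0
print("Eq. (39), zero voltage:   ", e1 * d1 + e2 * d2)   # should be ~0
print("plate charge density, Eq. (42):", d, "C/m^2")
```

Changing d2 in this sketch changes the plate charge density, which is the dependence on the second electrode's position noted in the text.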
Note that due to the spontaneous polarization of the lower layer’s material, the capacitor plates
are charged even in the absence of voltage between them and that this charge is a function of the second
electrode’s position (d2).9 Also notice a substantial similarity between this system (Fig. 6), and the one
whose analysis was the subject of Problem 2.6.
9 This effect is used in most modern microphones. In such a device, the sensed sound wave’s pressure bends a
thin conducting membrane playing the role of one of the capacitor’s plates, and thus modulates the thickness (in
Fig. 6, d2) of the air gap adjacent to the electret layer. This modulation produces proportional variations of the
charges (42), and hence the corresponding electric current flowing between the plates, which is picked up by
readout electronics. According to J. West (who, together with G. Sessler, invented the electret microphone in
1962), currently more than 2 billion of these devices are fabricated each year.
10 In the problem solved at the end of the previous section, the role of such relation was played by the equality
P0 = const.
(still containing many such dipoles) equals zero: P = 0 at E = 0. Moreover, if the field changes are
sufficiently slow, most materials may be characterized by a unique dependence of P on E. Then using
the Taylor expansion of function P(E), we may argue that in relatively low electric fields the function
should be well approximated by a linear dependence between these two vectors. Such dielectrics are
called linear (or “simple”). In an isotropic medium, the coefficient of proportionality should be just a
scalar. In the SI units, this scalar is defined by the following relation:
$$\mathbf{P} = \chi_e\varepsilon_0\mathbf{E}, \tag{3.43}$$
with the dimensionless constant χ_e called the electric susceptibility. However, it is much more common
to use, instead of χ_e, another dimensionless parameter,11
$$\kappa \equiv 1 + \chi_e, \tag{3.44}$$
which is sometimes called the “relative electric permittivity”, but much more often, the dielectric
constant. This parameter is very convenient, because combining Eqs. (43) and (44),
$$\mathbf{P} = (\kappa - 1)\,\varepsilon_0\mathbf{E}, \tag{3.45}$$
and then plugging the resulting relation into the general Eq. (33), we get simply
$$\mathbf{D} = \kappa\varepsilon_0\mathbf{E}, \quad\text{or}\quad \mathbf{D} = \varepsilon\mathbf{E}, \tag{3.46}$$
where another popular parameter,12
$$\varepsilon \equiv \kappa\varepsilon_0 \equiv (1 + \chi_e)\,\varepsilon_0, \tag{3.47}$$
is called the electric permittivity of the material.13 Table 1 gives the approximate values of the
dielectric constant for several representative materials.
In order to understand the range of these values, let me discuss (briefly and rather superficially14)
the two simplest mechanisms of electric polarization. The first of them is typical for liquids and gases of
polar atoms/molecules, which have their own, spontaneous dipole moments p. (A typical example is the
water molecule H2O, with the negative oxygen ion offset from the line connecting two positive
hydrogen ions, thus producing a spontaneous dipole moment p = ea, with a ≈ 0.38×10⁻¹⁰ m ~ r_B.) In the
absence of an external electric field, the orientation of such dipoles may be random, with the average
polarization P = n⟨p⟩ equal to zero – see the top panel of Fig. 7a.
11 In older physics literature, the dielectric constant is often denoted by the letter ε_r (with the index “r” meaning
“relative”), while in electrical engineering publications, its notation is frequently K.
12 The reader may be perplexed by the use of three different but uniquely related parameters (χ_e, κ ≡ 1 + χ_e, and
ε ≡ κε₀) for the description of just one scalar property. Unfortunately, such redundancy is typical for physics, whose
different sub-field communities have different, well-entrenched traditions.
13 In the Gaussian units, χ_e is defined by the following relation: P = χ_eE, while ε is defined just as in the SI units,
D = εE. Because of that, in the Gaussian units, the constant ε is dimensionless and equals (1 + 4πχ_e). As a result,
ε_Gaussian = (ε/ε₀)_SI ≡ κ_SI, so (χ_e)_Gaussian = (χ_e)_SI/4π, sometimes creating confusion between the numerical values of the
latter parameter – dimensionless in both systems.
14 While I believe this discussion is very useful, it is quantitatively valid only for relatively sparse media, with low
concentration (n << 1/a³) of elementary atomic/molecular dipoles of size scale a. Indeed, in some condensed
materials, with na³ ~ 1, even the notion of the dipole moment p with a single atomic cell is ambiguous.
Table 3.1. Dielectric constants of a few representative (and/or practically important) dielectrics

Material                                      κ
Air (at ambient conditions)                   1.00054
Teflon (polytetrafluoroethylene, [C2F4]n)     2.1
Silicon dioxide (amorphous)                   3.9
Glasses (of various compositions)             3.7–10
Castor oil                                    4.5
Silicon (a)                                   11.7
Water (at 100 °C)                             55.3
Water (at 20 °C)                              80.1
Barium titanate (BaTiO3, at 20 °C)            ~1,600

(a) Anisotropic materials, such as silicon crystals, require a susceptibility tensor to give an exact description of the
linear relation of the vectors P and E. However, most important crystals (including Si) are only weakly anisotropic, so
they may be reasonably well characterized with a scalar (angle-average) susceptibility.
Fig. 3.7. Crude cartoons of two mechanisms of the induced electrical polarization: (a) a partial ordering
of spontaneous elementary dipoles, and (b) an elementary dipole induction. The upper two panels
correspond to E = 0, and the lower two panels, to E ≠ 0.
A relatively weak external field does not change the magnitude of the dipole moments
significantly, but according to Eqs. (15a) and (17), tries to orient them along the field, creating a non-zero
vector average ⟨p⟩ directed along the vector E_m, where E_m is the microscopic field at the point of
the dipole’s location – cf. two panels of Fig. 7a. If the field is not too high (pE_m << k_BT), the induced
average polarization ⟨p⟩ is proportional to E_m. If we write this proportionality relation in the following
traditional form,
$$\langle\mathbf{p}\rangle = \alpha\mathbf{E}_m, \tag{3.48}$$
where α is called the atomic (or, sometimes, “molecular”) polarizability, this means that α is positive. If
the concentration n of such elementary dipoles is low, the contribution of their own fields into the
microscopic field acting on each dipole is negligible, and we may identify Em with the macroscopic field
E. As a result, the second of Eqs. (27) yields
$$\mathbf{P} \equiv n\langle\mathbf{p}\rangle = \alpha n\mathbf{E}. \tag{3.49}$$
$$\kappa = 1 + 4\pi R^3 n. \tag{3.51}$$
Let us use this result for a crude estimate of the dielectric constant of air at the so-called ambient
conditions, meaning the normal atmospheric pressure P = 1.013×10⁵ Pa and temperature T = 300 K. At
these conditions the molecular density n may be, with a few-percent accuracy, found from the well-known
equation of state of an ideal gas:17 n ≈ P/k_BT ≈ (1.013×10⁵)/(1.38×10⁻²³×300) ≈ 2.45×10²⁵ m⁻³.
The molecule of the air’s main component, N₂, has a van-der-Waals radius18 of 1.55×10⁻¹⁰ m. Taking
this radius for the R of our crude model, we get χ_e ≡ κ – 1 ≈ 1.15×10⁻³. Comparing this number with the
first line of Table 1, we see that the model gives a surprisingly reasonable result: to get the experimental
value, it is sufficient to decrease the effective R of the sphere by just ~20%, to ~1.2×10⁻¹⁰ m.19
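The arithmetic of this estimate may be reproduced with a few lines of code. This is a minimal sketch under the assumptions stated in the text (ideal-gas density, conducting-sphere model of Eq. (51)); the function names are arbitrary, and the value κ – 1 = 5.4×10⁻⁴ is taken from the first line of Table 1.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def number_density(pressure, temperature):
    """Ideal-gas molecular density n = P / (k_B T), in m^-3."""
    return pressure / (K_B * temperature)

def susceptibility(radius, n):
    """chi_e = kappa - 1 = 4*pi*R^3*n for the conducting-sphere model, Eq. (51)."""
    return 4 * math.pi * radius**3 * n

n = number_density(1.013e5, 300.0)        # ambient conditions
chi = susceptibility(1.55e-10, n)         # van-der-Waals radius of N2
# Radius that would reproduce the tabulated kappa - 1 = 5.4e-4 for air:
r_fit = (5.4e-4 / (4 * math.pi * n)) ** (1 / 3)

print(f"n = {n:.3e} m^-3, chi_e = {chi:.3e}, fitted R = {r_fit:.3e} m")
```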
This result may encourage us to try using Eq. (51) for a larger density n. For example, as a crude
model for a non-polar crystal, let us assume that the conducting spheres form a simple cubic lattice with
the period a = 2R (i.e., the neighboring spheres virtually touch). With this, n = 1/a³ = 1/8R³ and Eq. (51)
yields κ = 1 + 4π/8 ≈ 2.6. This estimate provides a reasonable semi-qualitative explanation for the
values of κ listed in a few middle rows of Table 1. However, at such small distances, the electrostatic
dipole-dipole interaction should be already essential, so this simple model cannot even approximately
describe the values of κ much larger than 1, listed in the last rows of the table.
Such high values may be explained by the so-called molecular field effect: each elementary
dipole is polarized not only by the external field, as Eq. (49) assumes, but by the field of neighboring
dipoles as well. Ottaviano-Fabrizio Mossotti in 1850 and (almost 30 years later) Rudolf Clausius
suggested what is now known, rather unfairly, as the Clausius-Mossotti formula,20 which describes this
effect reasonably well in many non-polar materials. In our notation, it reads21
Clausius-Mossotti formula:
$$\frac{\kappa - 1}{\kappa + 2} = \frac{\alpha n}{3\varepsilon_0}, \quad\text{so}\quad \kappa = 1 + \frac{\alpha n/\varepsilon_0}{1 - \alpha n/3\varepsilon_0}. \tag{3.52}$$
If the dipole density is low in the sense n << ε₀/α, this relation is reduced to Eq. (50) corresponding to
independent dipoles. However, at higher dipole density, κ and hence χ_e ≡ κ – 1 increase faster and tend
to infinity as the density-polarizability product αn approaches the critical value 3ε₀, i.e. as the density
approaches n_c = 3ε₀/α in the Clausius-Mossotti approximation.22 This means that the zero-polarization
state becomes unstable even in the absence of an external electric field.
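The difference between the two approximations is easy to see numerically. In the sketch below (an illustration with arbitrary function names), the dimensionless argument x stands for the product αn/ε₀; the independent-dipole result κ = 1 + x follows from combining Eqs. (45) and (49), while the second function implements Eq. (52).

```python
def kappa_independent(x):
    """Independent dipoles: kappa = 1 + x, where x = alpha*n/eps0."""
    return 1.0 + x

def kappa_clausius_mossotti(x):
    """Clausius-Mossotti formula, Eq. (52): kappa = 1 + x / (1 - x/3)."""
    if x >= 3.0:
        raise ValueError("x = alpha*n/eps0 must be below the critical value 3")
    return 1.0 + x / (1.0 - x / 3.0)

for x in (0.01, 0.5, 1.5, 2.5, 2.9):
    print(f"x = {x:4.2f}: independent {kappa_independent(x):6.3f}, "
          f"Clausius-Mossotti {kappa_clausius_mossotti(x):8.3f}")
```

At small x the two values nearly coincide, while near x = 3 the Clausius-Mossotti value grows without bound, which is the instability discussed above.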
This instability is a linear-theory (i.e. low-field) manifestation of a substantially nonlinear effect
– the formation, in some materials, of spontaneous polarization even in the absence of an external
electric field. Such materials are called ferroelectrics, and may be experimentally recognized by the
hysteretic behavior of their polarization as a function of the applied (external) electric field – see Fig. 8.
As the plots show, the polarization of a ferroelectric depends on the applied field’s history. For example,
the direction of its spontaneous remnant polarization PR may be switched by first applying, and then
removing a sufficiently high field (larger than the so-called coercive field EC – see Fig. 8) of the
opposite orientation. The physics of this switching is rather involved; the polarization vector P of a
ferroelectric material is typically constant only within each of the spontaneously formed spatial regions
(called domains), with a typical size of a few tenths of a micron, and different (frequently, opposite)
directions of the vector P in adjacent domains. The change of the applied electric field results not in the
19 As will be discussed in QM Chapter 6, for a hydrogen atom in its ground state, the low-field polarizability may
be calculated analytically: α = (9/2)4πε₀r_B³, corresponding to our metallic-ball model with a close value of the
effective radius: R = (9/2)^(1/3) r_B ≈ 1.65 r_B ≈ 0.87×10⁻¹⁰ m.
20 Applied to the high-frequency electric field, with κ replaced by the square of the refraction coefficient at the
field’s frequency (see Chapter 7), this formula is known as the Lorenz-Lorentz relation.
21 The proof of Eq. (52), by using Eq. (24) for the molecular field’s evaluation, is left for the reader’s exercise.
22 The Clausius-Mossotti formula does not give quantitatively correct results for many condensed materials,
notably including ferroelectrics. For a review of modern approaches to the theory of their polarization, see, e.g.,
the paper by R. Resta and D. Vanderbilt in the review collection by K. Rabe, C. Ahn, and J.-M. Triscone (eds.),
Physics of Ferroelectrics: A Modern Perspective, Springer, 2010.
switching of the direction of P inside each domain, but rather in a shift of the domain walls, resulting in
the change of the average polarization of the sample.
Fig. 3.8. The average polarization of soft and hard ferroelectrics as functions of the applied electric field
(schematically). The hysteresis loops P(E) are marked with the remnant polarization ±P_R and the coercive
field ±E_C.
Depending on the ferroelectric’s material, temperature, and the sample’s geometry (a solid
crystal, a ceramic material, or a thin film), the hysteretic loops may be rather different, ranging from a
rather smooth form in the so-called soft ferroelectrics (which include most ferroelectric thin films) to an
almost rectangular form in hard ferroelectrics – see Fig. 8. In low fields, soft ferroelectrics behave
essentially as linear paraelectrics, but with a very high average dielectric constant – see the bottom line
of Table 1 for such a classical material as BaTiO3 (which is a soft ferroelectric at temperatures below T_c
≈ 120 °C, and a paraelectric above this critical temperature). On the other hand, the polarization of a hard
ferroelectric in the fields below its coercive field remains virtually constant, and the analysis of its
electrostatics may be based on the condition P = P_R = const – already used in the problem discussed at
the end of the previous section.23 This condition is even more applicable to the so-called electrets –
synthetic polymers with a spontaneous polarization that remains constant even in very high electric
fields.
Some materials exhibit even more complex polarization effects, for example, antiferroelectricity,
helielectricity, and (practically very valuable) piezoelectricity. Unfortunately, I do not have time for a
discussion of these exotic phenomena in this course;24 the main reason I am mentioning them is to
emphasize again that the constitutive relation P = P(E) is material-specific rather than fundamental.
However, most insulators, in practicable fields, behave as linear dielectrics, so the next section will be
committed to the discussion of their electrostatics.
23 Due to this property, hard ferroelectrics, such as the lead zirconate titanate (PZT) and strontium bismuth
tantalate (SBT), with high remnant polarization P_R (up to ~1 C/m²), may be used in nonvolatile random-access
memories (dubbed either FRAM or FeRAM) – see, e.g., J. Scott, Ferroelectric Memories, Springer, 2000. In a
cell of such a memory, binary information is stored in the form of one of two possible directions of spontaneous
polarization at E = 0 (see Fig. 8). Unfortunately, the time of spontaneous depolarization of ferroelectric thin films
is typically well below 10 years – the industrial standard for data retention in nonvolatile memories, and this time
may be decreased even more by “fatigue” from the repeated polarization recycling at information recording. Due
to these reasons, the industrial production of FRAM is currently just a tiny fraction of the nonvolatile memory
market, which is dominated by floating-gate memories – see, e.g., Sec. 4.2 below.
24 For detailed coverage of ferroelectrics, I can recommend the encyclopedic monograph by M. Lines and A.
Glass, Principles and Applications of Ferroelectrics and Related Materials, Oxford U. Press, 2001, and the
recent review collection edited by K. Rabe et al., that was cited above.
(As a reminder, this increase of C by the factor κ has been already incorporated, without proof, into some
estimates made in Secs. 2.1 and 2.2, to make them realistic.)
If a linear dielectric is nonuniform, the situation is more complex. For example, let us consider
the case of a sharp interface between two otherwise uniform dielectrics, free of stand-alone charges. In
this case, we still may use Eq. (37) for the tangential component of the macroscopic electric field, and
also Eq. (35), with D_n = εE_n, for its normal component, getting
Boundary condition for E_n:
$$(\varepsilon E_n)_1 = (\varepsilon E_n)_2, \quad\text{i.e.}\quad \varepsilon_1\frac{\partial\phi_1}{\partial n} = \varepsilon_2\frac{\partial\phi_2}{\partial n}. \tag{3.56}$$
Let us apply these boundary conditions, first of all, to consider how carving a slit of some width
d and a much smaller thickness t << d from inside a dielectric, changes an initially uniform electric field
E0, depending on its orientation – see Fig. 9.
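Fig. 3.9. Fields inside two narrow slits cut in a linear dielectric, with the field E₀ and the displacement
D₀ = κε₀E₀ far from the slits. Slit A (normal to the field): D = D₀, E = D/ε₀ = κE₀. Slit B (parallel to the
field): E = E₀, D = ε₀E = D₀/κ.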
First of all, intuition tells us that regardless of its orientation, a slit cannot change the field far
from it; moreover, at t → 0, it cannot modify substantially even the field right outside its “major”
(broader) surfaces. This conclusion may be supported either by direct calculations (see, e.g., the problem
illustrated by Fig. 11 below), or by energy arguments: at t << d, any potential energy decrease due to the
field change inside the slit’s volume (proportional to td) cannot compensate its increase in the outer
volume proportional to d². However, it may induce some local field changes – inside the slit, and even
outside it, close to its “minor” surfaces.
To calculate the inner field for case A, with the slit’s plane normal to the applied field, we may
apply Eq. (56) to its major surfaces (shown horizontal), to prove that the vector D should be continuous.
But according to Eq. (46), this means that in the free space inside the slit, the electric field should equal
D/ε₀, and hence be κ times higher than the field E₀ = D/κε₀ far from the slit. This field, and hence D,
may be measured by a sensor placed inside the gap, so the electric displacement is not an entirely
mathematical construct.25 On the contrary, for case B, with the slit’s plane parallel to the initial field,
we may apply Eq. (37) to the major (now, vertical) interfaces of the slit, to see that now the electric field
E is continuous, while the electric displacement D = ε₀E inside the gap is a factor of κ lower than its
value in the dielectric. (Similarly to case A, any perturbations of the field uniformity, caused by the
compliance with Eq. (56) at the minor surfaces, settle down at distances ~t from them.)
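The two cases may be summarized in a very small sketch. It is only an illustration of the continuity arguments above, valid away from the minor surfaces of the slits; the function name and the sample numbers are hypothetical.

```python
EPS0 = 8.854e-12  # electric constant, F/m

def slit_fields(kappa, e0):
    """Fields inside thin free-space slits cut in a dielectric with the bulk field e0 (Fig. 9).

    Returns two dicts with the field E (V/m) and displacement D (C/m^2) inside each slit.
    """
    d0 = kappa * EPS0 * e0                   # bulk displacement, Eq. (46)
    case_a = {"E": d0 / EPS0, "D": d0}       # slit normal to the field: D is continuous
    case_b = {"E": e0, "D": EPS0 * e0}       # slit parallel to the field: E is continuous
    return case_a, case_b

case_a, case_b = slit_fields(kappa=3.9, e0=1.0e5)  # hypothetical sample values
print(case_a, case_b)
```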
For other problems with piecewise-constant ε, with more complex geometries, we may need to
apply the methods studied in Chapter 2. In particular, in the simplest cases, we can select such a set of
orthogonal coordinates that the electrostatic potential depends on just one of them. Consider, for
example, two types of filling a plane capacitor with two different dielectrics – see Fig. 10.
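Fig. 3.10. Plane capacitors filled with two different dielectrics: (a) dielectrics ε₁ and ε₂ placed side by
side between plates separated by distance d, and (b) two stacked layers, of thickness d₁ (top, ε₁) and
d₂ (bottom, ε₂).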
In case (a), the voltage V between the electrodes is the same for each part of the capacitor, telling
us that at least far from the dielectric interface, the electric field is vertical, uniform, and constant (E =
V/d). Hence the boundary condition (37) is satisfied even if such a distribution is valid near the surface
25 Superficially, this result violates the boundary condition (37) at the vertical (“minor”) surfaces of the gap. This
apparent contradiction is resolved by the fact that the thin slit can deform the field both inside and outside it, at
distances of the order of t around these interfaces, but not far beyond them, so the above relations for E and D are
valid at most of the slit area.
as well, i.e. at any point of the system. The only effect of different values of ε in the two parts is that the
electric displacement D = εE and hence electrodes’ surface charge density σ = D are different in them.
Thus we can calculate the electrode charges Q1,2 of the two parts independently, and then add up the
results to get the total mutual capacitance
$$C = \frac{Q_1 + Q_2}{V} = \frac{1}{d}\left(\varepsilon_1 A_1 + \varepsilon_2 A_2\right). \tag{3.57}$$
Note that this formula may be interpreted as the total capacitance of two separate lumped capacitors
connected (by wires) in parallel. This is natural, because we may cut the system along the dielectric
interface, without any effect on the fields in either part, and then connect the corresponding electrodes
by external wires, again without any effect on the system – besides very close vicinities of the
capacitor’s edges, where the fringe
Case (b) may be analyzed just as in the problem illustrated by Fig. 6, by applying Eq. (34) to a
Gaussian pillbox with one lid inside the (for example) bottom electrode, and the other lid inside any of
the layers. As a result, we see that D anywhere inside the system should be equal to the surface charge
density σ of the electrode, i.e. constant. Hence, according to Eq. (46), the electric field E inside each
dielectric layer is also constant: in the top layer, it is E1 = D1/ε1 = σ/ε1, while in the bottom layer, E2 = D2/ε2
= σ/ε2. Integrating the field E across the whole capacitor, we get
$$ V = \int_0^{d_1+d_2} E(z)\,dz = E_1 d_1 + E_2 d_2 = \sigma\left(\frac{d_1}{\varepsilon_1} + \frac{d_2}{\varepsilon_2}\right), \tag{3.58} $$
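Equations (57) and (58) are easy to compare numerically. The following minimal Python sketch evaluates the capacitance for both geometries. The function names and all numerical values are illustrative assumptions, and the expression used for case (b) is simply C = Q/V, with Q = σA and V taken from Eq. (58).

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def c_side_by_side(eps1, a1, eps2, a2, d):
    """Case (a), Eq. (3.57): two dielectrics side by side, sharing the gap d."""
    return (eps1 * a1 + eps2 * a2) / d


def c_layered(eps1, d1, eps2, d2, area):
    """Case (b): two layers stacked across the gap.

    Eq. (3.58) gives V = sigma * (d1/eps1 + d2/eps2), while Q = sigma * area,
    so that C = Q/V does not depend on sigma.
    """
    return area / (d1 / eps1 + d2 / eps2)


# Hypothetical parameters, chosen only for illustration
eps1, eps2 = 2.0 * EPS0, 4.0 * EPS0
area, d = 1.0e-2, 1.0e-3  # m^2 and m

c_a = c_side_by_side(eps1, area / 2, eps2, area / 2, d)
c_b = c_layered(eps1, d / 2, eps2, d / 2, area)
print(f"(a) side by side: {c_a:.3e} F")
print(f"(b) layered:      {c_b:.3e} F")
```

For equal permittivities, both functions reduce to the usual εA/d, which is a convenient sanity check.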
$$ \phi_{r \le R} = \sum_{l=1}^{\infty} a_l r^l P_l(\cos\theta). \tag{3.61} $$
Now, spelling out the boundary conditions (37) and (56) at r = R, we see that for all coefficients al and
bl with l ≥ 2, we get homogeneous linear equations (just like for the conducting sphere discussed in Sec.
2.8) that have only trivial solutions. Hence, all these terms may be dropped, while for the only surviving
terms with l = 1, proportional to the Legendre polynomial P1(cosθ) ≡ cosθ, we get two equations:
$$ -E_0 - \frac{2b_1}{R^3} = \kappa a_1, \qquad -E_0 R + \frac{b_1}{R^2} = a_1 R. \tag{3.62} $$
Solving this simple system of linear equations for a1 and b1, and plugging the result into Eqs. (60) and
(61), we get the final solution of the problem:
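$$ \phi_{r \ge R} = -E_0\left(r - \frac{\kappa-1}{\kappa+2}\,\frac{R^3}{r^2}\right)\cos\theta, \qquad \phi_{r \le R} = -E_0\,\frac{3}{\kappa+2}\,r\cos\theta. \tag{3.63} $$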
Fig. 3.11. A dielectric sphere in an initially uniform electric field: (a) the problem, and (b) the
equipotential surfaces, as given by Eq. (63), for κ = 3.
Figure 11b shows the equipotential surfaces given by this solution, for a particular value of the
dielectric constant κ. Note that according to Eq. (63), at r ≥ R the dielectric sphere, just as the
conducting sphere in a similar problem, produces (on top of the uniform external field) a pure dipole
field, with the dipole moment
$$ \mathbf{p} = 4\pi R^3\,\frac{\kappa-1}{\kappa+2}\,\varepsilon_0\mathbf{E}_0 \equiv 3V\,\frac{\kappa-1}{\kappa+2}\,\varepsilon_0\mathbf{E}_0, \quad \text{where } V = \frac{4\pi}{3}R^3. \tag{3.64} $$
This is an evident generalization of Eq. (11), to which Eq. (64) tends at κ → ∞. By the way, this
property is common: for their electrostatic properties, conductors may be adequately described as
dielectrics with κ → ∞.
Another remarkable feature of Eqs. (63) is that the electric field and polarization inside the
sphere are uniform, with R-independent values
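$$ \mathbf{E} = \frac{3}{\kappa+2}\,\mathbf{E}_0, \quad \mathbf{D} \equiv \kappa\varepsilon_0\mathbf{E} = \frac{3\kappa}{\kappa+2}\,\varepsilon_0\mathbf{E}_0, \quad \mathbf{P} \equiv \mathbf{D} - \varepsilon_0\mathbf{E} = 3\,\frac{\kappa-1}{\kappa+2}\,\varepsilon_0\mathbf{E}_0. \tag{3.65} $$
In the limit κ → 1 (for example, the “sphere made of free space”, i.e. no sphere at all), the electric field
inside it naturally tends to the external one, and its polarization vanishes. In the opposite limit κ → ∞,
the electric field inside the sphere vanishes. Curiously enough, in this limit the electric displacement
inside the sphere remains finite: D → 3ε0E0.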
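The two limits are easy to check numerically. The sketch below solves the linear system (62) by Cramer's rule and prints the internal field E = −a1 (since the internal potential is a1 r cosθ = a1 z) together with the dipole coefficient b1. The function names and the chosen values of κ are illustrative assumptions.

```python
def sphere_coefficients(kappa, e0=1.0, radius=1.0):
    """Solve Eqs. (3.62) for (a1, b1) by Cramer's rule.

    Rewritten for the unknowns (a1, b1), the two equations read
        kappa * a1 + (2 / R**3) * b1 = -E0       (continuity of D_n)
        R * a1     - (1 / R**2) * b1 = -E0 * R   (continuity of phi)
    """
    r = radius
    m11, m12, rhs1 = kappa, 2.0 / r**3, -e0
    m21, m22, rhs2 = r, -1.0 / r**2, -e0 * r
    det = m11 * m22 - m12 * m21
    a1 = (rhs1 * m22 - m12 * rhs2) / det
    b1 = (m11 * rhs2 - rhs1 * m21) / det
    return a1, b1


def inner_field(kappa, e0=1.0):
    """Uniform field inside the sphere: phi = a1 * z, hence E = -a1."""
    a1, _ = sphere_coefficients(kappa, e0)
    return -a1


for kappa in (1.0, 3.0, 1.0e6):
    a1, b1 = sphere_coefficients(kappa)
    print(f"kappa = {kappa:g}: E_in/E0 = {-a1:.6f}, b1/(E0 R^3) = {b1:.6f}")
```

For κ = 1 the sphere is invisible (E = E0, b1 = 0), while for very large κ the internal field tends to zero and b1 tends to its conducting-sphere value E0R³.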
More complex problems with piecewise-uniform dielectrics also may be addressed by the
methods discussed in Chapter 2, and hopefully, the reader will be able to use them to solve a few such
problems offered in Sec. 6, on their own. Let me discuss just one of such problems because it exhibits a
new feature of the charge image method that was discussed in Sec. 2.9 (and is the basis of the Green’s
function approach – see Sec. 2.10). Consider the system shown in Fig. 12: a point charge near a
dielectric half-space; it obviously parallels the system discussed in Sec. 2.9 – see Fig. 2.26.
Fig. 3.12. Charge images for a dielectric half-space. (Figure labels: a point at z > 0 “sees” charges q and
q’; a point at z < 0 “sees” charge q” alone.)
As for the case of a conducting half-space, the Laplace equation for the electrostatic potential in
the upper half-space z > 0 (besides the charge point ρ = 0, z = d) may be satisfied using a single image
charge q’ at the point with ρ = 0 and z = –d, but now q’ may differ from (–q). In addition, in contrast to
the case analyzed in Sec. 2.9, we should also calculate the field inside the dielectric (at z ≤ 0). This field
cannot be contributed by the image charge q’, because that would give a potential divergence at its
location. Thus, in the dielectric-filled half-space we should try to use the real point source only, but with
a re-normalized charge q” rather than the genuine charge q – see Fig. 12. As a result, we may look for
the potential distribution in the form
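$$ \phi(\rho, z) = \frac{1}{4\pi\varepsilon_0} \times \begin{cases} \dfrac{q}{\left[\rho^2 + (z-d)^2\right]^{1/2}} + \dfrac{q'}{\left[\rho^2 + (z+d)^2\right]^{1/2}}, & \text{for } z \ge 0, \\[2ex] \dfrac{q''}{\left[\rho^2 + (z-d)^2\right]^{1/2}}, & \text{for } z \le 0, \end{cases} \tag{3.66} $$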
at this stage of solution, with unknown q’ and q”. Plugging this equality into the boundary conditions
(37) and (56) at z = 0 (with ∂/∂n = ∂/∂z), we see that they are indeed satisfied (so Eq. (66) does express
the solution of the boundary problem), provided that the effective charges q’ and q’’ obey the following
relations:
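The relations themselves are not reproduced here, but they can be recovered from Eq. (66) as a hedged sketch: the continuity of the potential at z = 0 requires q + q’ = q”, while the continuity of the normal component of D requires q − q’ = κq”, so that q” = 2q/(κ + 1) and q’ = −q(κ − 1)/(κ + 1). The Python snippet below checks these two conditions numerically; the function names, the units (1/4πε0 = 1), and the parameter values are my assumptions, not part of the original derivation.

```python
import math


def image_charges(q, kappa):
    """Solve q + q' = q'' and q - q' = kappa * q'' for (q', q'')."""
    q2 = 2.0 * q / (kappa + 1.0)  # q''
    q1 = q2 - q                   # q' = -(kappa - 1) / (kappa + 1) * q
    return q1, q2


def phi_upper(rho, z, q, q1, d):
    """Upper line of Eq. (3.66), in units where 1/(4*pi*eps0) = 1."""
    return q / math.hypot(rho, z - d) + q1 / math.hypot(rho, z + d)


def phi_lower(rho, z, q2, d):
    """Lower line of Eq. (3.66), in the same units."""
    return q2 / math.hypot(rho, z - d)


def boundary_mismatch(rho, q=1.0, kappa=3.0, d=1.0, h=1.0e-5):
    """Return the jumps of phi and of D_z/eps0 across the plane z = 0.

    The z-derivatives are estimated by central differences of each
    analytical expression, continued smoothly across the interface.
    """
    q1, q2 = image_charges(q, kappa)
    phi_jump = phi_upper(rho, 0.0, q, q1, d) - phi_lower(rho, 0.0, q2, d)
    dphi_up = (phi_upper(rho, h, q, q1, d) - phi_upper(rho, -h, q, q1, d)) / (2 * h)
    dphi_lo = (phi_lower(rho, h, q2, d) - phi_lower(rho, -h, q2, d)) / (2 * h)
    return phi_jump, dphi_up - kappa * dphi_lo


for rho in (0.0, 0.5, 2.0):
    print(rho, boundary_mismatch(rho))
```

Both mismatches should vanish (within the finite-difference error) for any ρ; in the limit κ → ∞ the image charge q’ tends to −q, i.e. to the conducting half-space result.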
$$ (\nabla\cdot\mathbf{D})\,\phi = \nabla\cdot(\phi\mathbf{D}) - (\nabla\phi)\cdot\mathbf{D}, \tag{3.70} $$
we may rewrite Eq. (69) as
$$ U = \frac{1}{2}\int \nabla\cdot(\phi\mathbf{D})\,d^3r - \frac{1}{2}\int (\nabla\phi)\cdot\mathbf{D}\,d^3r. \tag{3.71} $$
The divergence theorem, applied to the first term on the right-hand side, reduces it to a surface integral
of φDn. (As a reminder, in Eq. (1.63) the integral was of φ(∇φ)n ∝ φEn.) If the surface of the volume we
are considering is sufficiently far, this surface integral vanishes. On the other hand, the gradient in the
second term of Eq. (71) is just (minus) field E, so it gives
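$$ U = \frac{1}{2}\int \mathbf{E}\cdot\mathbf{D}\,d^3r = \frac{1}{2}\int \mathbf{E}(\mathbf{r})\cdot\varepsilon(\mathbf{r})\mathbf{E}(\mathbf{r})\,d^3r = \frac{\varepsilon_0}{2}\int \kappa(\mathbf{r})E^2(\mathbf{r})\,d^3r. \tag{3.72} $$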
This expression is a natural generalization of Eq. (1.65), and shows that we can, as we did in free space,
represent the electrostatic energy in a local form:27
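$$ U = \int u(\mathbf{r})\,d^3r, \quad \text{with } u = \frac{1}{2}\mathbf{E}\cdot\mathbf{D} = \frac{\varepsilon}{2}E^2 = \frac{D^2}{2\varepsilon} \quad \text{(field energy in a linear dielectric)}. \tag{3.73} $$
As a sanity check, in the trivial case ε = ε0 (i.e. κ = 1), this result is reduced to Eq. (1.65).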
Of course, Eq. (73) is valid only for linear dielectrics, because our starting point, Eq. (1.60), is
only valid if φ is proportional to ρ. To make our calculation more general, we should intercept the
calculations of Sec. 1.3 at an earlier stage, at which this proportionality had not yet been used. For
example, the first of Eqs. (1.56) may be rewritten, in the continuous form, as
$$ \delta U = \int \phi(\mathbf{r})\,\delta\rho(\mathbf{r})\,d^3r, \tag{3.74} $$
where the symbol δ means a small variation of the function – e.g., its change in time, sufficiently slow to
ignore the relativistic and magnetic-field effects. Applying such variation to Eq. (32), and plugging the
resulting relation δρ = ∇·δD into Eq. (74), we get
$$ \delta U = \int (\nabla\cdot\delta\mathbf{D})\,\phi\,d^3r. \tag{3.75} $$
(Note that in contrast to Eq. (69), this expression does not have the front factor ½.) Now repeating the
same calculations as in the linear case, for the energy density’s variation we get a remarkably simple
(and general!) formula,
$$ \delta u = \mathbf{E}\cdot\delta\mathbf{D} = \sum_{j=1}^{3} E_j\,\delta D_j \quad \text{(energy density's variation)}, \tag{3.76} $$
where the last expression uses the Cartesian components of the vectors E and D. This is as far as we can
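go for the general dependence D(E). If the dependence is linear and isotropic, as in Eq. (46), then δD =
εδE and
$$ \delta u = \varepsilon\,\mathbf{E}\cdot\delta\mathbf{E} = \varepsilon\,\delta\!\left(\frac{E^2}{2}\right). \tag{3.77} $$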
The integration of this expression over the whole variation, from the field equal to zero to a certain final
distribution E(r), brings us back to Eq. (73).
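This integration may be illustrated by a short numerical sketch, which sums the products E·δD of Eq. (76) along the path from zero to the final field, in one dimension. For the linear law, the result should coincide with εE²/2. The saturating law D(E) used for comparison, and all numerical values, are purely hypothetical and serve only to show that Eq. (76) does not rely on linearity.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def energy_density(d_of_e, e_final, steps=20_000):
    """Sum E * dD along the path from zero field to e_final (1D form of Eq. (3.76))."""
    u = 0.0
    de = e_final / steps
    for i in range(steps):
        e_lo = i * de
        e_hi = e_lo + de
        # field at the middle of the step, times the increment of D over the step
        u += 0.5 * (e_lo + e_hi) * (d_of_e(e_hi) - d_of_e(e_lo))
    return u


KAPPA = 3.0      # illustrative dielectric constant
E_FINAL = 1.0e6  # illustrative final field, V/m
E_SAT = 5.0e5    # hypothetical saturation field of the toy nonlinear law, V/m


def d_linear(e):
    """Linear isotropic dielectric, D = eps * E."""
    return KAPPA * EPS0 * e


def d_saturating(e):
    """Hypothetical nonlinear law, not taken from the text."""
    return KAPPA * EPS0 * E_SAT * math.tanh(e / E_SAT)


u_lin = energy_density(d_linear, E_FINAL)
u_sat = energy_density(d_saturating, E_FINAL)
print(f"linear:     {u_lin:.6f} J/m^3, eps*E^2/2 = {0.5 * KAPPA * EPS0 * E_FINAL**2:.6f} J/m^3")
print(f"saturating: {u_sat:.6f} J/m^3")
```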
An important role of Eq. (76), in its last form, is to indicate that from the point of view of
analytical mechanics, the Cartesian coordinates of E may be interpreted as generalized forces, and those
27 In the Gaussian units, each of the last three expressions should be divided by 4π.
of D as generalized coordinates of the field’s effect on a unit volume of the dielectric. This allows one,
in particular, to form the proper Gibbs potential energy28 of a system with an electric field E(r) fixed, at
every point, by some external source:
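$$ U_G \equiv \int_V u_G(\mathbf{r})\,d^3r, \quad u_G(\mathbf{r}) \equiv u(\mathbf{r}) - \mathbf{E}(\mathbf{r})\cdot\mathbf{D}(\mathbf{r}) \quad \text{(Gibbs potential energy)}. \tag{3.78} $$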
The essence of this notion is that if the generalized external force (in our case, E) is fixed, the stable
equilibrium of the system corresponds to the minimum of UG, rather than of the potential energy U as
such – in our case, that of the field in our system.
As the simplest illustration of this important concept, let us consider a very long cylinder (with
an arbitrary cross-section shape), made of a uniform linear dielectric, placed into a uniform external
electric field parallel to the cylinder’s axis – see Fig. 13.
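[Fig. 3.13: a long dielectric cylinder in a uniform external field Eext parallel to its axis; the displacement D
inside it is to be found.]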
For this simple problem, the equilibrium value of D inside the cylinder may be, of course, readily
found without any appeal to energies. Indeed, the solution of the Laplace equation inside the cylinder,
with the boundary condition (37) is evident: E(r) = Eext, and so Eq. (46) immediately yields D(r) =
εEext. One may wonder why the minimum of the potential energy U, given by Eq. (73) in its last form,
$$ \frac{U}{V} = \frac{D^2}{2\varepsilon}, \tag{3.79} $$
corresponds to a different (zero) value of D, but let us recall that Eq. (73) was derived for the case when
the electric field is created by the stand-alone charges in the system under consideration. If it is created
by external sources, we have to use the Gibbs potential energy (78) instead. For our current uniform
case, this energy per unit volume of the cylinder is
$$ \frac{U_G}{V} = \frac{U}{V} - \mathbf{E}\cdot\mathbf{D} = \frac{D^2}{2\varepsilon} - \mathbf{E}\cdot\mathbf{D} \equiv \sum_{j=1}^{3}\left(\frac{D_j^2}{2\varepsilon} - E_j D_j\right), \tag{3.80} $$
and its minimum as a function of every Cartesian component of D corresponds to the correct value of
the displacement: Dj = εEj, i.e. to D = εE = εEext. So, the system’s equilibrium indeed corresponds to the
minimum of the Gibbs potential energy (78) rather than of the energy (73).
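This statement is simple enough to be verified by a brute-force scan. The sketch below evaluates Eq. (80) for one Cartesian component of D on a grid of trial values and picks the minimum; the arbitrary units and the values of ε and E are illustrative assumptions.

```python
def u_gibbs(d, e_ext, eps):
    """Gibbs energy per unit volume, Eq. (3.80), for one Cartesian component."""
    return d * d / (2.0 * eps) - e_ext * d


def u_field(d, eps):
    """Field energy per unit volume, Eq. (3.79)."""
    return d * d / (2.0 * eps)


eps, e_ext = 3.0, 2.0  # arbitrary units, illustrative values
grid = [i / 100.0 for i in range(0, 1201)]  # trial values of D, from 0 to 12

d_best = min(grid, key=lambda d: u_gibbs(d, e_ext, eps))
print("D at the minimum of U_G/V:", d_best, " eps*E =", eps * e_ext)
print("U_G/V there:", u_gibbs(d_best, e_ext, eps), " U/V there:", u_field(d_best, eps))
print("D at the minimum of U/V:  ", min(grid, key=lambda d: u_field(d, eps)))
```

The scan finds the minimum of UG at D = εE, with UG/V equal to minus U/V at that point, while the energy (79) alone would be minimized by D = 0.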
28 See, e.g., CM Sec. 1.4, in particular Eq. (1.41). Note that Eq. (78) clearly illustrates, once again, that the
difference between the potential energies UG and U, usually discussed in courses of thermodynamics and
statistical physics as the difference between the Gibbs and Helmholtz free energies (see, e.g., SM Sec. 1.4), is much
more general than the effects of random thermal motion addressed by these disciplines.
Now note that Eq. (80), at this equilibrium point (only!), may be rewritten as
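$$ \frac{U_G}{V} = \frac{U}{V} - \mathbf{E}\cdot\mathbf{D} = \frac{D^2}{2\varepsilon} - \frac{\mathbf{D}}{\varepsilon}\cdot\mathbf{D} = -\frac{D^2}{2\varepsilon}, \tag{3.81} $$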
i.e. formally coincides with Eq. (79), besides the (perhaps, somewhat counter-intuitive) opposite sign. A
similar but more general relation (not limited to linear dielectrics and uniform fields) may be obtained
by taking the variation of the uG expressed by Eq. (78), and then using Eq. (76):
$$ \delta u_G = \delta u - \delta(\mathbf{E}\cdot\mathbf{D}) = \mathbf{E}\cdot\delta\mathbf{D} - \mathbf{E}\cdot\delta\mathbf{D} - \mathbf{D}\cdot\delta\mathbf{E} = -\mathbf{D}\cdot\delta\mathbf{E}. \tag{3.82} $$
In order to see how this expression works, let us plug D from Eq. (33):
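$$ \delta u_G = -(\varepsilon_0\mathbf{E} + \mathbf{P})\cdot\delta\mathbf{E} = -\delta\!\left(\frac{\varepsilon_0 E^2}{2}\right) - \mathbf{P}\cdot\delta\mathbf{E}. \tag{3.83} $$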
So far, this relation is general. In the particular case when the polarization P is field-independent,
we may integrate Eq. (83) over the full electric field’s variation, say from 0 to some finite value E,
getting
$$ u_G = -\frac{\varepsilon_0 E^2}{2} - \mathbf{P}\cdot\mathbf{E}. \tag{3.84} $$
Again, the Gibbs energy is relevant only if E is dominated by an external field Eext independent of the
orientation of P. If, in addition, P(r) ≠ 0 only in some finite volume V, we may integrate Eq. (84) over
that volume, getting
$$ U_G = -\mathbf{p}\cdot\mathbf{E}_{ext} + \text{const}, \quad \text{with } \mathbf{p} \equiv \int_V \mathbf{P}(\mathbf{r})\,d^3r, \tag{3.85} $$
where the “const” means the terms independent of p. In this expression, we may readily recognize Eq.
(15a) for an electric dipole p of a fixed magnitude, which was obtained in Sec. 1 in a different way. This
comparison illustrates again that UG is nothing mysterious; it is just the relevant part of the potential
energy of the system in a fixed external field, including the energy of its interaction with the field.
Finally, in the other important case of a linear dielectric, when according to Eqs. (45) and (47), P
= (ε − ε0)E, the similar integration of the general Eq. (83) over the field yields the additional factor ½:
$$ U_G = -\frac{1}{2}\int_V \mathbf{P}\cdot\mathbf{E}_{ext}\,d^3r + \text{const}. \tag{3.86} $$
This expression may be very convenient for analyses of the forces exerted by electric fields on linear
dielectric media – see, for example, a few exercises on this topic, offered at the end of this chapter.
3.1. Prove Eqs. (3)-(4), starting from Eqs. (1.38) and (3.2).
3.2. A thin ring of radius R is charged with a constant linear density λ. Calculate the exact
electrostatic potential distribution along the symmetry axis of the ring, and prove that at large distances,
r >> R, the three leading terms of its multipole expansion are indeed correctly described by Eqs. (3)-(4).
3.3. In suitable reference frames, calculate the dipole and quadrupole moments of the following
systems (see the figures below):
(i) four point charges of the same magnitude but alternating signs, placed in the corners of a
square;
(ii) a similar system but with a pair charge sign alternation; and
(iii) a point charge in the center of a thin ring carrying a similar but opposite charge uniformly
distributed along its circumference.
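[Figures for cases (i)-(iii): in (i) and (ii), squares with side a and charges of magnitude q in the corners; in
(iii), a ring of radius R with charges of magnitude Q.]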
3.4. Calculate the dipole and quadrupole moments of a thin spherical shell of radius R, carrying
an electric charge with the areal density σ = σ0cosθ. Discuss the relation between the results and the
solution of Problem 2.28.
3.5. For a regular cubic lattice of similarly oriented identical dipoles, calculate the electric field it
creates at the location of each dipole.
3.6. Without carrying out an exact calculation, can you predict the spatial dependence of the
interaction between various electric multipoles, including point charges (in this context, frequently
called electric monopoles), dipoles, and quadrupoles? Based on these predictions, what is the functional
dependence of the interaction between homonuclear diatomic molecules such as H2, N2, O2, etc., on the
distance between them when the distance is much larger than the molecular size?
3.7. Two similar electric dipoles, of a fixed magnitude p, located at a fixed distance r from each
other, are free to change their directions. What stable equilibrium position(s) may they take as a result of
their electrostatic interaction?
3.9. Calculate the net charge Q induced in a grounded conducting sphere
of radius R by a dipole p located at point r outside the sphere – see the figure on
the right.
3.10. Use two different approaches to calculate the energy of interaction between a grounded
conductor and an electric dipole p placed in the center of a spherical cavity of radius R, carved in the
conductor.
3.11. A plane separating two halves of otherwise free space is densely and uniformly (with a
constant areal density n) filled with electric dipoles, with similar moments p oriented normally to the
plane.
(i) Use two different approaches to calculate the electrostatic potential at distances d >> 1/n^{1/2} on
both sides of the plane.
(ii) Give a physical interpretation of your result.
(iii) Use the result to calculate the potential distribution created in space by a spherical surface of
radius R, densely and uniformly filled with radially oriented dipoles.
3.13. A sphere of radius R is made of a material with a uniform spontaneous polarization P0.
Calculate the electric field everywhere in space – both inside and outside the sphere, and compare the
result for the internal field with Eq. (24).
3.14. Calculate the electric field at the center of a cube made of a material with the uniform
spontaneous polarization P0 of arbitrary orientation.
3.15. Derive the Clausius-Mossotti formula (52) by combining Eq. (24) with the result of the
solution of Problem 5.
3.16. Stand-alone charge Q is distributed, in some way, within the volume of a body made of a
uniform linear dielectric with a dielectric constant κ. Calculate the total polarization charge Qef residing
on the surface of the body, provided that it is surrounded by free space.
3.17. In two separate experiments, a thin plane sheet of a linear dielectric with κ = const is
placed into a uniform external electric field E0, in two different ways:
(i) with the sheet’s surfaces parallel to the electric field, and
(ii) with its surfaces normal to the field.
For each case, find the electric field E, the electric displacement D, and the polarization P inside
the dielectric, sufficiently far from the sheet’s edges.
3.18. A fixed dipole p is placed in the center of a spherical cavity of radius R, carved inside a
uniform linear dielectric. Calculate the electric field distribution everywhere in the system.
Hint: You may start with the assumption that the field at r > R has a distribution typical for a
dipole. However, be ready for surprises.
3.19. A spherical capacitor (see the figure on the right) is filled with a
linear dielectric whose permittivity ε depends on the spherical angles θ and φ,
but not on the distance r from the system’s center. Derive an explicit
expression for its capacitance C.
3.20. A spherical capacitor similar to that considered in the previous problem is now filled with a
linear dielectric whose permittivity depends only on the distance from the center. Obtain an explicit
expression for its capacitance, and spell it out for the particular case ε(r) = ε(a)(r/a)^n.
3.21. A uniform electric field E0 has been created (by distant external
sources) inside a uniform linear dielectric. Find the electric field’s change
created by carving out a cavity in the shape of a round cylinder of radius R,
with its axis normal to the external field – see the figure on the right.
3.22. Similar small spherical particles, made of a linear dielectric, are dispersed in free space
with a low concentration n << 1/R^3, where R is the particle's radius. Calculate the average dielectric
constant κef of such a medium. Compare the result with the apparent but wrong answer
$$ \kappa_{ef} - 1 = (\kappa - 1)\,nV, \qquad \text{(WRONG!)} $$
(where κ is the dielectric constant of the particle's material and V = (4π/3)R^3 is its volume), and explain
the origin of the difference.
3.23. A straight thin filament, uniformly charged with linear density λ, is positioned parallel to
the plane separating two uniform linear dielectrics, at a distance d from it. Calculate the electric
potential’s distribution everywhere in the system.
3.24. A point charge q is located at a distance d > R from the center of a sphere of radius R, made
of a uniform linear dielectric with permittivity ε.
(i) Calculate the electrostatic potential’s distribution in all the space, for an arbitrary ratio d/R.
(ii) For large d/R, use two different approaches to calculate the interaction force and the energy
of interaction between the sphere and the charge, in the first nonzero approximation in R/d << 1.
Hint: Task (i) cannot be carried out using the method of charge images, so you may like to use
the expansion of the function 1/|r – r’| in the series over the Legendre polynomials, whose proof was
the subject of Problem 2.40.
3.26. Discuss the physical nature of Eq. (76). Apply your conclusions to a material with a fixed
(field-independent) polarization P0(r), and calculate the electric field’s energy of a uniformly polarized
sphere (see Problem 13 above).
3.27. Use Eqs. (73) and (82) to calculate the force of attraction of a plane capacitor’s plates (per
unit area), for two cases:
(i) the capacitor is charged to voltage V, and then disconnected from the battery,29 and
(ii) the capacitor remains connected to the battery.
3.29. For each of the two capacitors shown in Fig. 10, calculate the electric force exerted on the
interface between two different dielectrics, in terms of the fields in the system.
3.30. One half of a conducting sphere of radius R, carrying electric
charge Q, is submerged into a half-space filled with a linear dielectric with
permittivity ε – see the figure on the right. Calculate the electric force
exerted on the sphere by the dielectric.
29 “Battery” is a common if misleading term for what is usually a single galvanic element. (The last term stems
from the name of Luigi Galvani, a pioneer of electric current studies. Another term derived from his name is the
galvanic connection, meaning a direct connection of two conductors, enabling a dc current flow – see the next
chapter.) The term “battery” should have been, in all fairness, reserved for the connection of several galvanic
elements in series – as was pioneered in 1800 by L. Galvani’s friend Alessandro Volta.
Chapter 4. DC Currents
The goal of this chapter is to discuss the distribution of stationary (“dc”) currents in conducting samples
and their “global” characteristics such as resistance. In the most important case of linear (“Ohmic”)
conductivity, the current distribution is governed by the same Laplace and Poisson equations whose
solution methods were discussed in detail in the previous chapters. Because of that, we can piggyback
on most approaches discussed earlier, enabling me to keep this chapter rather brief.
Fig. 4.1. Two oppositely charged conductors: (a) in the electrostatic situation, (b) at the charge
relaxation through an additional narrow conductor (“wire”), and (c) in a system sustaining a dc current I.
Now let us connect the two conductors with a wire – a thin, elongated conductor (Fig. 1b). Then
the electric field causes the motion of charge carriers in the wire, from the conductor with a higher
electrostatic potential toward that with lower potential, until the potentials equilibrate. Such a process is
called charge relaxation. The main equation governing this process may be obtained from the
fundamental experimental fact (already mentioned in Sec. 1.1) that electric charges cannot appear or
disappear – though opposite charges may recombine with the conservation of the net charge. As a result,
the charge Q in a conductor may change only due to the electric current I through the wire:
$$ \frac{dQ}{dt} = -I(t); \tag{4.1} $$
this relation may be understood as the definition of the current.1
1 Just as a (hopefully, unnecessary :-) reminder, in the SI units the current is measured in amperes (A). In legal
metrology, the ampere (rather than the coulomb, which is defined as 1 C = 1 A × 1 s) is a primary unit. (Its formal
definition will be discussed in the next chapter.) In the Gaussian units, Eq. (1) remains the same, so the current’s
unit is the statcoulomb per second – the so-called statampere.
Let us express Eq. (1) in a differential form, introducing the notion of the current density j(r).
This vector may be defined via the following relation for the elementary current dI crossing an
elementary area dA (Fig. 2):
$$ dI = j\,dA\cos\theta = (j\cos\theta)\,dA = j_n\,dA, \tag{4.2} $$
where θ is the angle between the direction normal to the surface and the charge carrier motion direction,
which is taken for the direction of the vector j.
Fig. 4.2. The current density vector j.
where V is an arbitrary but stationary volume limited by the closed surface S. Applying to this volume
the same divergence theorem as was repeatedly used in previous chapters, we get
$$ \int_V \left(\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j}\right) d^3r = 0. \tag{4.4} $$
This relation acquires an even simpler form in the particular but important case of dc electric
circuits (Fig. 3) – the systems that may be fairly represented as direct (“galvanic”) connections of
components of two types:
2 Similar differential relations are valid for the density of any conserved quantity, for example for mass in
classical dynamics (see, e.g., CM Sec. 8.3), and for the probability, as it is defined in statistical physics (SM Sec.
5.6) and in quantum mechanics (QM Sec. 1.4).
(i) relatively-small-size (lumped) circuit elements, meaning either a passive resistor, or a current
source, etc. – generally, any “black box” with two or more terminals, and
(ii) perfectly conducting wires, with a negligible drop of the electrostatic potential along them,
that are galvanically connected at certain points called nodes (or “junctions”).
Fig. 4.3. A typical system obeying Kirchhoff laws. (Figure labels: “circuit element”, “node”, “loop”, “wire”.)
In the standard circuit theory, the electric charges of the nodes are considered negligible,3 and we
may integrate Eq. (6) over the closed surface drawn around any node to get a simple equality
$$ \sum_j I_j = 0, \tag{4.7a} $$
where the summation is over all the wires (numbered with index j) connected in the node. On the other
hand, according to its definition (2.25), the voltage Vk across each circuit element may be represented as
the difference of the electrostatic potentials of the adjacent nodes, Vk = φk – φk-1. Summing such
differences around any closed loop of the circuit (Fig. 3), we get all terms canceled, so
$$ \sum_k V_k = 0. \tag{4.7b} $$
These relations are called, respectively, the 1st and 2nd Kirchhoff laws4 – or sometimes the node
rule (7a) and the loop rule (7b). They may seem elementary, but their genuine power is in the
mathematical fact that any set of Eqs. (7) covering every node and every circuit element of the system at
least once, gives a system of equations sufficient for the calculation of all currents and voltages in it –
provided that the relation between the current and voltage is known for each circuit element.
It is almost evident that in the absence of current sources, the system of equations (7) has only
the trivial solution: Ij = 0, Vk = 0 – with the exotic exception of superconductivity, to be discussed in
Sec. 6.3. The current sources that allow non-zero current flows may be described by their electromotive
forces (e.m.f.) Vk, having the dimensionality of voltage, which have to be taken into account in the
corresponding terms Vk of the sum (7b). Let me hope that the reader has some experience of using Eqs.
(7) for analyses of simple circuits – say, consisting of several resistors and batteries, so I can save our
time by skipping their discussion. Still, due to their practical importance, I would recommend the reader
to carry out a self-test by solving a couple of problems offered at the beginning of Sec. 6.
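As a minimal illustration of such a self-test, consider a hypothetical circuit: an ideal e.m.f. source V0 in series with a resistor R1, feeding two resistors R2 and R3 connected in parallel. One node equation (7a) and two loop equations (7b), with the Ohm law V = IR for each resistor, give three linear equations for the three currents. The circuit, its parameter values, and the small Gaussian-elimination helper below are all my assumptions, chosen only to keep the sketch self-contained.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


V0 = 12.0                   # e.m.f. of the source, V (illustrative)
R1, R2, R3 = 2.0, 6.0, 3.0  # resistances, ohm (illustrative)

# Unknowns: currents (I1, I2, I3) through R1, R2, and R3.
equations = [
    [1.0, -1.0, -1.0],  # node rule (7a):  I1 - I2 - I3 = 0
    [R1, R2, 0.0],      # loop rule (7b):  I1*R1 + I2*R2 = V0
    [0.0, R2, -R3],     # loop rule (7b):  I2*R2 - I3*R3 = 0
]
i1, i2, i3 = solve_linear(equations, [0.0, V0, 0.0])
print(f"I1 = {i1:.3f} A, I2 = {i2:.3f} A, I3 = {i3:.3f} A")
```

The result may be cross-checked by the elementary series/parallel reduction: R2 and R3 in parallel give 2 Ω, so the total resistance is 4 Ω and I1 = 3 A.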
3 In many cases, the charge accumulation/relaxation may be described without an explicit violation of Eq. (7a),
just by adding other circuit elements, lumped capacitors (see Fig. 2.5 and its discussion), to the circuit under
analysis. The resulting circuit may be used to describe not only the transient processes but also periodic ac
currents. However, it is convenient for me to postpone the discussion of such ac circuits until Chapter 6, where
one more circuit element type, lumped inductances, will be introduced.
4 Named after Gustav Kirchhoff (1824-1887) – who also suggested the differential form (8) of the Ohm law.
where σ is a constant called the Ohmic conductivity (or just the “conductivity” for short).5 Though the
Ohm law (discovered, in its simpler form, by Georg Simon Ohm in 1827) is one of constitutive rather
than fundamental relations, and is approximate for any conducting medium, we can argue that if:
(i) the medium carries no current at E = 0 (mind superconductors!),
(ii) the medium is isotropic or virtually isotropic (a notable exception: some organic conductors),
(iii) the mean free path l of the current carriers (the notion to be discussed in detail in SM Ch. 6)
in this medium is much smaller than the characteristic scale a of the spatial variations of j and E,
then the law may be viewed as the leading, linear term of the Taylor expansion of the local relation j(E),
and thus is general for relatively low fields.
Table 1 gives approximate experimental values of σ for some representative (and/or practically
important) materials. Note that the range of these values is very broad, even without going to such
extremes as very pure metallic crystals at very low temperatures, where σ may reach ~10^12 S/m.
Table 4.1. Ohmic dc conductivities for some materials at 20°C.
Material                                  σ (S/m)
Teflon (PTFE, [C2F4]n)                    10^-22 to 10^-24
Silicon dioxide                           10^-16 to 10^-19
Various glasses                           10^-10 to 10^-14
Deionized water                           ~10^-6
Seawater                                  5
Silicon n-doped to 10^16 cm^-3            2.5×10^2
Silicon n-doped to 10^19 cm^-3            1.6×10^4
Silicon p-doped to 10^19 cm^-3            1.1×10^4
Nichrome (alloy 80% Ni + 20% Cr)          0.9×10^6
Aluminum                                  3.8×10^7
Copper                                    6.0×10^7
Zinc crystal along a-axis                 1.65×10^7
Zinc crystal along c-axis                 1.72×10^7
5 In SI units, the conductivity is measured in S/m, where one siemens (S) is the reciprocal of the ohm: 1 S ≡ (1 Ω)^-1
≡ 1 A/1 V. The constant reciprocal to conductivity, 1/σ, is called resistivity and is commonly denoted by the letter
ρ. I will, however, try to avoid using this notion, because in these notes this letter is already overused.
In order to get a better feeling of what these values mean, let us consider a very simple system
(Fig. 4): a plane capacitor of area A >> d^2, filled with a material that has not only a dielectric constant κ,
but also some Ohmic conductivity σ, with much more conductive electrodes.
Fig. 4.4. A “leaky” plane capacitor.
Assuming that these properties are compatible with each other,6 we may assume that the
distribution of the electric potential (not too close to the capacitor’s edges) still obeys Eq. (2.39), so the
electric field is normal to the electrode surfaces and uniform, with E = V/d. Then, according to Eq. (8),
the current density is also uniform, j = σE = σV/d. From here, the total current between the plates is
$$ I = jA = \sigma E A = \sigma\,\frac{V}{d}\,A. \tag{4.9} $$
On the other hand, from Eqs. (2.26) and (3.45), the instantaneous value of the total charge of the top
electrode is Q = CV = (κε0A/d)V. Plugging these relations into Eq. (1), we see that the speed of charge
(and voltage) relaxation is independent of the geometric parameters A and d of the capacitor:
$$ \frac{dV}{dt} = -\frac{V}{\tau_r}, \quad \text{with } \tau_r \equiv \frac{\kappa\varepsilon_0}{\sigma}, \tag{4.10} $$
so the relaxation time constant τr may be used to characterize the gap-filling material as such.
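The geometry independence is easy to verify numerically: the capacitance scales as A/d, and so does the conductance I/V that follows from Eq. (9), so their ratio contains only material parameters. In the sketch below, the function name and the parameter values (roughly those of deionized water, with κ = 80 taken as an assumption) are purely illustrative.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def leaky_capacitor(kappa, sigma, area, d):
    """Return (C, G, tau_r) for the plane capacitor of Fig. 4.4."""
    capacitance = kappa * EPS0 * area / d  # Q = C V
    conductance = sigma * area / d         # I = G V, from Eq. (4.9)
    return capacitance, conductance, capacitance / conductance


kappa, sigma = 80.0, 1.0e-6  # hypothetical material parameters
for area, d in ((1.0e-4, 1.0e-3), (1.0, 1.0e-2)):
    c, g, tau = leaky_capacitor(kappa, sigma, area, d)
    print(f"A = {area:g} m^2, d = {d:g} m: C = {c:.3e} F, G = {g:.3e} S, tau_r = {tau:.3e} s")
```

Both geometries give the same τr = κε0/σ, though their capacitances and conductances differ by three orders of magnitude.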
As we already know (see Table 3.1), for most practical materials the dielectric constant κ is
within one order of magnitude from 10, so the numerator in the second of Eqs. (10) is of the order of
10^-10 (SI units). As a result, according to Table 1, the charge relaxation time ranges from ~10^14 s (more than
a million years!) for the best insulators like Teflon (polytetrafluoroethylene, PTFE),7 to ~10^-18 s for the
least resistive metals. What is the physics behind such a huge range of σ, and why, for some materials,
does Table 1 give them with such a large uncertainty? As in Chapters 2 and 3, in this course, I have time
only for a brief, admittedly superficial discussion of these issues.8
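These estimates may be reproduced in a few lines. The sketch below uses the order-of-magnitude numerator κε0 ~ 10^-10 F/m quoted above, together with a few conductivities from Table 1; the selection of materials and the helper name are my choices, and the results are meant as rough estimates only.

```python
KAPPA_EPS0 = 1.0e-10        # F/m, the order-of-magnitude estimate used in the text
SECONDS_PER_YEAR = 3.156e7  # approximate length of a year, s

SIGMA = {  # S/m, representative values from Table 4.1
    "Teflon (lowest value)": 1.0e-24,
    "Deionized water": 1.0e-6,
    "Seawater": 5.0,
    "Copper": 6.0e7,
}


def relaxation_time(sigma, kappa_eps0=KAPPA_EPS0):
    """Charge relaxation time from the second of Eqs. (4.10)."""
    return kappa_eps0 / sigma


for name, sigma in SIGMA.items():
    tau = relaxation_time(sigma)
    print(f"{name:24s} tau_r ~ {tau:.1e} s ({tau / SECONDS_PER_YEAR:.1e} years)")
```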
If the charge carriers move almost as classical particles (e.g., in plasmas or non-degenerate
semiconductors), a very reasonable description of the conductivity is given by the famous Drude
formula.9 In his picture, due to a weak electric field, the charge carriers are accelerated in its direction
(on top of their random motion in all directions, with the average velocity vector equal to zero):
$$ \frac{d\mathbf{v}}{dt} = \frac{q}{m}\mathbf{E}, \tag{4.11} $$
and as a result, their velocity acquires the average value
$$ \langle\mathbf{v}\rangle = \frac{d\mathbf{v}}{dt}\,\tau = \frac{q}{m}\mathbf{E}\,\tau, \tag{4.12} $$
where the phenomenological parameter τ = l/v (not to be confused with τr!) may be understood as the
average time since the last scattering event. From here, the current density:10
$$ \mathbf{j} = qn\langle\mathbf{v}\rangle = \frac{q^2 n\tau}{m}\,\mathbf{E}, \quad \text{i.e. } \sigma = \frac{q^2 n\tau}{m} \quad \text{(Drude formula: two versions)}. \tag{4.13a} $$
(Notice the independence of σ of the charge sign.) Another form of the same result, more popular in the
physics of semiconductors, is
$$ \sigma = qn\mu, \quad \text{with } \mu \equiv \frac{q\tau}{m}, \tag{4.13b} $$
where the parameter μ, defined by the relation ⟨v⟩ ≡ μE, is called the charge carrier mobility.
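To get a feeling for the numbers involved, Eqs. (13) may be inverted to estimate τ and μ from the conductivities listed in Table 1. The sketch below does this for copper and for lightly doped silicon. The carrier densities (n ~ 10^29 m^-3 for copper, and n equal to the doping level for silicon) and the use of the free-electron mass are rough assumptions of mine, so the results are order-of-magnitude estimates only.

```python
Q_E = 1.602176634e-19   # elementary charge, C
M_E = 9.1093837015e-31  # free-electron mass, kg (an assumption for carriers in solids)


def drude_sigma(n, tau, q=Q_E, m=M_E):
    """Eq. (4.13a): sigma = q^2 n tau / m."""
    return q * q * n * tau / m


def scattering_time(sigma, n, q=Q_E, m=M_E):
    """Eq. (4.13a) solved for tau."""
    return sigma * m / (q * q * n)


def mobility(sigma, n, q=Q_E):
    """Eq. (4.13b) solved for the magnitude of the mobility."""
    return sigma / (abs(q) * n)


# Copper: sigma from Table 4.1, assumed n ~ 1e29 m^-3 (i.e. ~1e23 cm^-3)
tau_cu = scattering_time(6.0e7, 1.0e29)
# Silicon n-doped to 1e16 cm^-3 = 1e22 m^-3, assuming all donors are ionized
mu_si = mobility(2.5e2, 1.0e22)

print(f"copper:  tau ~ {tau_cu:.1e} s")
print(f"silicon: mu  ~ {mu_si:.2f} m^2/(V s)")
```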
Most good conductors (e.g., metals) are essentially degenerate Fermi gases (or liquids), in which
the average thermal energy of a particle, kBT, is much lower than the Fermi energy εF. In this case, a
quantum theory is needed for the calculation of σ. Such a theory was developed by A. Sommerfeld in
1927 (and is sometimes called the Drude-Sommerfeld model). I have no time to discuss it in this
course,11 and here will only notice that for a nearly ideal, isotropic Fermi gas the result is reduced to Eq.
(13), with a certain effective value of τ, so it may be used for estimates of σ, with due respect to the
quantum theory of scattering. In a typical metal, n is very high (~10^23 cm^-3) and is fixed by the atomic
structure, so the sample quality may only affect σ via the scattering time τ.
At room temperature, the scattering of electrons by thermally-excited lattice vibrations
(phonons) dominates, so τ and σ are high but finite, and do not change much from one sample to
another. (Hence the relatively accurate values given for metals in Table 1.) On the other hand, at T → 0,
quantum mechanics says a perfect crystal should not exhibit scattering at all, and its conductivity should
be infinite. In practice, this is never true (for one, due to electron scattering from imperfect boundaries
of finite-size samples), and the effective conductivity is infinite (or practically infinite, at least above
the largest measurable values ~10²⁰ S/m) only in superconductors.12
On the other hand, the conductivity of quasi-insulators (including deionized water) and
semiconductors depends mostly on the carrier density n, which is much lower than in metals. From the
point of view of quantum mechanics, this happens because the ground-state wavefunctions of charge
carriers are localized within an atom (or molecule), and their energies are separated from those of
excited states, with space-extended wavefunctions, by a large energy gap – often called the bandgap.
For example, in SiO2 the bandgap approaches 9 eV, equivalent to ~100,000 K. This is why even at room
temperatures the density of thermally-excited free charge carriers in good insulators is negligible. In
these materials, n is determined by impurities and vacancies, and may depend on a particular chemical
synthesis or other fabrication technology, rather than on the fundamental properties of the material. (On
the contrary, the carrier mobility in these materials is almost technology-independent.)
10 Note that j in Eq. (8) is defined as an already macroscopic variable, averaged over inter-particle distances, so no
additional average sign is necessary in the first of Eqs. (13a).
11 For such a discussion see, e.g., SM Sec. 6.3.
12 The electrodynamic properties of superconductors are so interesting (and fundamentally important) that I will
discuss them in more detail in Chapter 6.
The practical importance of the fabrication technology may be illustrated by the following
example. In the cells of the so-called floating-gate memories, in particular, the flash memories, which
currently dominate the nonvolatile digital memory technology, data bits are stored as small electric
charges (Q ~ 10⁻¹⁶ C ~ 10³ e) of highly doped silicon islands (so-called floating gates) separated from
the rest of the integrated circuit with ~10-nm-thick layers of silicon dioxide, SiO2. Such layers are
fabricated by high-temperature oxidation of virtually perfect silicon crystals. The conductivity of the
resulting high-quality (though amorphous) material is so low, σ ~ 10⁻¹⁹ S/m, that the relaxation time τr,
defined by Eq. (10), is well above 10 years – the industrial standard for data retention in nonvolatile
memories. To appreciate how good this technology is, the cited value should be compared with the
typical conductivity σ ~ 10⁻¹⁶ S/m of the usual, bulk SiO2 ceramics.13
To conclude this section, let me note that the Ohm law, for all its importance, is not a universal
law of nature. As a reminder of this fact, in Sec. 5 below I describe two very simple systems (leaving
their analysis for the reader’s exercise) whose I-V relation is nonlinear even for very small currents.
For a uniform conductor (σ = const), Eq. (14) is reduced to the Laplace equation for the (macroscopic)
electrostatic potential φ. As we already know from Chapters 2 and 3, its solution depends on the
boundary conditions. These conditions, in turn, depend on the interface type.
(i) Conductor-conductor interface. Applying the continuity equation (6) to a Gauss-type pillbox
at the interface of two different conductors (Fig. 5), we get
$$(j_n)_1 = (j_n)_2, \tag{4.15}$$
so if the Ohm law (8) is valid inside each medium, then
$$\sigma_1\frac{\partial \phi_1}{\partial n} = \sigma_2\frac{\partial \phi_2}{\partial n}. \tag{4.16}$$
Fig. 4.5. DC current’s “refraction” at the interface between two different conductors.
13 This course is not an appropriate platform to discuss details of the floating-gate memory technology. However,
I think that every educated physicist should know its basics, because such memories are presently the driver of all
semiconductor integrated circuit technology development, and hence of the whole information technology
progress. Perhaps the best available general book on this topic is still the relatively old review collection by J.
Brewer and M. Gill (eds.), Nonvolatile Memory Technologies with Emphasis on Flash, IEEE Press, 2008.
Also, since the electric field should be finite, its potential φ has to be continuous across the
interface – the condition that may also be written as
$$\frac{\partial \phi_1}{\partial \tau} = \frac{\partial \phi_2}{\partial \tau}, \tag{4.17}$$
where ∂/∂τ denotes differentiation along any direction tangential to the interface.
Both these conditions (and hence the solutions of the boundary problems using them) are similar to
those for the interface between two dielectrics – cf. Eqs. (3.46)-(3.47). Note that using the Ohm law, Eq.
(17) may be rewritten as
$$\frac{1}{\sigma_1}(j_\tau)_1 = \frac{1}{\sigma_2}(j_\tau)_2. \tag{4.18}$$
Comparing it with Eq. (15) we see that, generally, the current density’s magnitude changes at the
interface: j1 ≠ j2. It is also curious that if σ1 ≠ σ2, the current line slope changes at the interface (Fig. 5),
qualitatively similar to the refraction of light rays in optics – see Chapter 7.
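The "refraction" of current lines may be made quantitative by combining Eqs. (15) and (18): if θ is the angle between the current line and the normal to the interface, then j1 cos θ1 = j2 cos θ2 and (j1 sin θ1)/σ1 = (j2 sin θ2)/σ2, so tan θ1/σ1 = tan θ2/σ2. The sketch below only illustrates this; the function name and the sample numbers are my assumptions.

```python
import math

def refract(j1, theta1, sigma1, sigma2):
    """Return (j2, theta2) behind the interface, given the current density magnitude j1
    and the angle theta1 (radians, counted from the interface normal) in front of it."""
    j_normal = j1 * math.cos(theta1)                        # continuous, Eq. (4.15)
    j_tangent = j1 * math.sin(theta1) * sigma2 / sigma1     # scales with sigma, Eq. (4.18)
    return math.hypot(j_normal, j_tangent), math.atan2(j_tangent, j_normal)

# Assumed example: current passing into a 10 times better conductor at 30 degrees.
j2, theta2 = refract(1.0, math.radians(30.0), 1.0, 10.0)
print(f"j2/j1 = {j2:.3f}, theta2 = {math.degrees(theta2):.1f} degrees")
```

In a better conductor, the current lines bend away from the normal, and for σ2 >> σ1 they run almost along the interface.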
where constants φj may be different for different electrodes (numbered with index j). Note that with
such boundary conditions, the Laplace boundary problem becomes exactly the same as in electrostatics
– see Eq. (2.35) – and hence we can use the methods (and some solutions :-) discussed in Chapter 2 for
finding the dc current distribution.
(iii) Conductor-insulator interface. For the description of a good insulator, we can use the
equality σ = 0, so Eq. (16) yields the following boundary condition,
$$\frac{\partial \phi}{\partial n} = 0, \tag{4.20}$$
for the potential derivative inside the conductor. From the Ohm law (8) in the form j = –σ∇φ, we see
that this is just the very natural requirement for the dc current not to flow into an insulator. Now note
that this condition makes the Laplace problem inside the conductor completely well-defined, and
independent of the potential distribution in the adjacent insulator. On the contrary, due to the continuity
of the electrostatic potential at the border, its distribution inside the surrounding insulator has to follow
that inside the conductor.
Let us discuss this conceptual issue on the following (apparently, trivial) example: dc current in
a uniform wire of length l and a cross-section of area A. The reader certainly knows the answer:
$$I = \frac{V}{R}, \quad \text{where } R \equiv \frac{V}{I} = \frac{l}{\sigma A}, \tag{4.21}$$
14 The first of Eqs. (21) is essentially the (historically, initial) integral form of the Ohm law, and is valid not only
for a uniform wire but also for Ohmic conductors of any geometry in which I and V may be clearly defined.
However, let us derive this result formally from our theoretical framework. For the simple
geometry shown in Fig. 6a, this is easy to do. Here the potential φ evidently has a linear 1D distribution
$$\phi = \text{const} - \frac{x}{l}V, \tag{4.22}$$
both in the conductor and the surrounding free space, with both boundary conditions (16) and (17)
satisfied at the conductor-insulator interfaces, and the condition (20) satisfied at the conductor-electrode
interfaces. As a result, the electric field is constant and has only one Cartesian component: Ex = V/l, so
inside the conductor
$$j_x = \sigma E_x, \quad I = j_x A, \tag{4.23}$$
giving us the well-known Eq. (21).
[Fig. 4.6, panels (a) and (b)]
However, what about the geometry shown in Fig. 6b? In this case, the field distribution in the
free space around the conductor is dramatically different, but according to the boundary problem
defined by Eqs. (14) and (20), inside the conductor, the solution is exactly the same as it was in the
former case. Now, the Laplace equation in the surrounding insulator has to be solved with the boundary
values of the electrostatic potential, “dictated” by the distribution of the current (and hence potential) in
the conductor. Note that as a result, the electric field lines are generally not normal to the conductor’s
surface, because the surface is not equipotential – see Eq. (22) again.
Let us solve a problem in which this conductivity hierarchy may be followed analytically to the very
end. Consider an empty spherical cavity carved in a conductor with an initially uniform current flow
with a constant density j0 = nz j0 (Fig. 7a). Following the hierarchy, we have to solve the boundary
problem in the conducting part of the system, i.e. outside the sphere (at r ≥ R), first. Since the problem is
evidently axially symmetric, we already know the general solution of the Laplace equation – see Eq.
(2.172). Moreover, we know that in order to match the uniform field distribution at r → ∞, all
coefficients al but one (a1 = –E0 = –j0/σ) have to be zero, and that the boundary conditions at r = R will
give zero solutions for all coefficients bl but one (b1), so
$$\phi = -\frac{j_0}{\sigma} r\cos\theta + \frac{b_1}{r^2}\cos\theta, \quad \text{for } r \geq R. \tag{4.24}$$
In order to find the remaining coefficient b1, we have to use the boundary condition (20) at r = R:
$$\left.\frac{\partial \phi}{\partial r}\right|_{r=R} = \left(-\frac{j_0}{\sigma} - \frac{2b_1}{R^3}\right)\cos\theta = 0. \tag{4.25}$$
This gives b1 = –j0R³/2σ, so, finally,
$$\phi(r, \theta) = -\frac{j_0}{\sigma}\left(r + \frac{R^3}{2r^2}\right)\cos\theta, \quad \text{for } r \geq R. \tag{4.26}$$
(Note that this potential distribution corresponds to the dipole moment p = –E0R³/2. It is straightforward
to check that if the spherical cavity was cut in a dielectric, the potential distribution outside it would be
similar, with p = –E0R³(κ – 1)/(2κ + 1). In the limit κ → ∞, these two results coincide, despite the rather
different type of the problem: in the dielectric case, there is no current at all.)
Fig. 4.7. A spherical cavity carved in a uniform conductor: (a) the problem’s geometry, and (b) the
equipotential surfaces as given by Eqs. (26) and (28).
Now, as the second step in the conductivity hierarchy, we may find the electrostatic potential
distribution φ(r, θ) in the insulator, in this particular case inside the empty cavity (at r ≤ R). It should also
satisfy the Laplace equation with the boundary values at r = R, “dictated” by the distribution (26):
$$\phi(R, \theta) = -\frac{3j_0}{2\sigma}R\cos\theta. \tag{4.27}$$
We could again solve this problem by the formal variable separation (keeping in the general solution
(2.172) only the term proportional to al, which does not diverge at r → 0), but if we notice that the
boundary condition (27) depends on just one Cartesian coordinate, z = R cos θ, the solution may be just
guessed:
$$\phi(r, \theta) = -\frac{3j_0}{2\sigma}z = -\frac{3j_0}{2\sigma}r\cos\theta, \quad \text{at } r \leq R. \tag{4.28}$$
Indeed, it evidently satisfies the Laplace equation and the boundary condition (27), and corresponds to a
constant electric field parallel to the vector j0 and equal to 3j0/2σ – see Fig. 7b. Again, the cavity surface
is not equipotential, and the electric field lines at r ≤ R are not normal to it at almost all points.
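The two-step solution may be sanity-checked numerically. The sketch below (with assumed unit values j0 = σ = R = 1 and function names of my choosing) verifies that the outer potential (26) satisfies the no-current condition (20) at r = R, that it matches the inner potential (28) on the cavity surface, and that the field inside the cavity is uniform.

```python
import math

def phi_out(r, theta, j0=1.0, sigma=1.0, R=1.0):
    """Potential in the conductor (r >= R), Eq. (4.26)."""
    return -(j0 / sigma) * (r + R**3 / (2 * r**2)) * math.cos(theta)

def phi_in(r, theta, j0=1.0, sigma=1.0):
    """Potential inside the cavity (r <= R), Eq. (4.28)."""
    return -(3 * j0 / (2 * sigma)) * r * math.cos(theta)

def radial_derivative(phi, r, theta, h=1e-6):
    """Central-difference estimate of d(phi)/dr at fixed theta."""
    return (phi(r + h, theta) - phi(r - h, theta)) / (2 * h)

R = 1.0
for theta in (0.0, 0.7, 2.0):
    # Boundary condition (4.20): no current flows into the cavity.
    assert abs(radial_derivative(phi_out, R, theta)) < 1e-8
    # Continuity of the potential at r = R, Eq. (4.27).
    assert abs(phi_out(R, theta) - phi_in(R, theta)) < 1e-12

# Field inside the cavity: E_z = -d(phi)/dz, evaluated on the z-axis (theta = 0).
E_inside = -radial_derivative(phi_in, 0.5, 0.0)
print(E_inside)   # close to 3*j0/(2*sigma) = 1.5 for the default parameters
```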
More generally, the conductivity hierarchy says that static electrical fields and charges outside
conductors (e.g., electric wires) do not affect currents flowing in the wires, and it is physically very clear
why. For example, if a charge in the free space is slowly moved close to a wire, it (in accordance with
the linear superposition principle) only induces an additional surface charge (see Sec. 2.1) that screens
the external charge’s field, without participating in the current flow inside the conductor.
Besides this conceptual issue, the two examples given above may be considered as applications
of the first two methods discussed in Chapter 2 – the orthogonal coordinates (Fig. 6) and the variable
separation (Fig. 7) – to dc current distribution problems. As the reader may recall, in that chapter we
also discussed the method of charge images. It turns out that its analog may be also used for the solution
of some dc conductivity problems. Indeed, let us consider a spherically-symmetric distribution
of the electrostatic potential, similar to that given by the basic Eq. (1.35):
$$\phi = \frac{c}{r}. \tag{4.29}$$
As we know from Chapter 1, this is a particular solution of the 3D Laplace equation at all points but r =
0. In free space, this distribution would correspond to a point charge q = 4πε0c; but what about a
uniform Ohmic conductor? Calculating the corresponding electric field and current density,
$$\mathbf{E} = -\nabla\phi = \frac{c}{r^3}\mathbf{r}, \quad \mathbf{j} = \sigma\mathbf{E} = \frac{\sigma c}{r^3}\mathbf{r}, \tag{4.30}$$
we see that the total current flowing from the origin through a sphere of an arbitrary radius r does not
depend on the radius:
$$I = Aj = 4\pi r^2 j = 4\pi\sigma c. \tag{4.31}$$
15 Note that in such layers, the current distribution near the injection point is different, j ∝ 1/r rather than 1/r².
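$$\phi(\mathbf{r}) = \frac{I}{4\pi\sigma}\left(\frac{1}{|\mathbf{r} - \mathbf{r}'|} + \frac{1}{|\mathbf{r} - \mathbf{r}''|}\right). \tag{4.34}$$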
(The image current’s sign would be opposite at the interface between a conductor with moderate
conductivity and a perfect conductor (“electrode”), whose potential should be virtually constant.)
Fig. 4.8. Applying the method of images for the current injection analysis.
This result may be readily used, for example, to calculate the current density at a plane surface of
a uniform conductor, as a function of distance ρ from point 0 – the surface’s point closest to the current
injection site – see Fig. 8. At such surface, Eq. (34) yields
$$\phi = \frac{I}{2\pi\sigma}\frac{1}{(\rho^2 + d^2)^{1/2}}, \tag{4.35}$$
Deviations from Eqs. (35) and (36) may be used to find and characterize conductance
inhomogeneities, say, those due to mineral deposits in the Earth’s crust.16
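The image solution is easy to check numerically. In the sketch below, I assume (as a choice of coordinates, not something fixed by the text) that the conductor occupies the half-space z > 0, the current is injected at the point (0, 0, d), and the image of the same sign sits at (0, 0, –d); the function names and unit parameter values are also assumptions.

```python
import math

def phi_images(x, y, z, I=1.0, sigma=1.0, d=1.0):
    """Eq. (4.34): source at (0, 0, d) in the conductor (z > 0), equal-sign image at (0, 0, -d)."""
    r_src = math.sqrt(x**2 + y**2 + (z - d)**2)
    r_img = math.sqrt(x**2 + y**2 + (z + d)**2)
    return I / (4 * math.pi * sigma) * (1 / r_src + 1 / r_img)

def phi_surface(rho, I=1.0, sigma=1.0, d=1.0):
    """Eq. (4.35): potential on the surface z = 0 at distance rho from point 0."""
    return I / (2 * math.pi * sigma) / math.sqrt(rho**2 + d**2)

for rho in (0.0, 0.5, 3.0):
    assert abs(phi_images(rho, 0.0, 0.0) - phi_surface(rho)) < 1e-12
    h = 1e-6
    dphi_dz = (phi_images(rho, 0.0, h) - phi_images(rho, 0.0, -h)) / (2 * h)
    assert abs(dphi_dz) < 1e-9   # no current crosses the surface
```

The vanishing normal derivative at z = 0 is exactly the boundary condition (20), which is why the image current at a conductor-insulator interface has the same sign as the real one.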
So, the methods used in electrostatics to calculate the potential distribution in linear dielectrics
may be also used to find such distributions in Ohmic conductors. Moreover, some of these methods are
more valuable in this field. For example, in electrostatics, the effective methods of solution of the 2D
Laplace equation, discussed in Secs. 2.3-2.6, could only be applied to cylindrical geometries. For Ohmic
conduction, this equation is also valid in some 3D cases. A practically important example is the current
flow in thin resistive layers where, due to the conductivity hierarchy principle, the 3D-distributed field
outside a layer, induced by the 2D-distributed current in it, does not affect the flow and in many cases is
not important. A few problems of this kind, formulated in Sec. 5, are left for the reader’s exercise.
16The current injection may be also produced, due to electrochemical reactions, by an ore mass itself, so one need
only measure (and correctly interpret :-) the resulting potential distribution – the so-called self-potential method –
see, e.g., Sec. 6.1 in W. Telford et al., Applied Geophysics, 2nd ed., Cambridge U. Press, 1990.
conduction, the electrostatic energy U is “dissipated” (i.e. transferred to heat) at a certain rate P ≡ –
dU/dt, with the dimensionality of power.17 The rate of this energy dissipation may be evaluated by
calculating the power of the electric field’s work on a single moving charge:
$$P_1 = \mathbf{F}\cdot\mathbf{v} = q\mathbf{E}\cdot\mathbf{v}. \tag{4.37}$$
After the summation over all charges, Eq. (37) gives us the average power of energy dissipation.
If the charge carrier density n is uniform, multiplying both sides of this relation by it, and taking into account
that qn⟨v⟩ = j, for the energy dissipation rate in a unit volume we get the following Joule law18
$$p \equiv \frac{P}{V} = \frac{\langle P_1 \rangle N}{V} = \langle P_1 \rangle n = q\mathbf{E}\cdot\langle\mathbf{v}\rangle n = \mathbf{E}\cdot\mathbf{j}. \tag{4.38}$$
In the case of the Ohmic conductivity (8), this expression may be also rewritten in two other forms:
$$p = \sigma E^2 = \frac{j^2}{\sigma}. \tag{4.39}$$
With our electrostatics background, it is also straightforward (and hence left for the reader’s exercise) to
prove that the dc current distribution in a uniform Ohmic conductor, at a fixed voltage distribution along
its surface, corresponds to the minimum of the total dissipation in the sample,
$$P \equiv \int_V p\,d^3r = \sigma\int_V E^2 d^3r. \tag{4.40}$$
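The minimum-dissipation statement may be illustrated (though certainly not proved) on the uniform wire of Eq. (21). The sketch below compares the linear potential profile (22) with a one-parameter family of trial profiles having the same end values; the trial family, the function name, and the unit parameter values are all my assumptions, made just for this illustration.

```python
import math

def dissipated_power(a, V=1.0, length=1.0, sigma=1.0, area=1.0, steps=2000):
    """Total Joule power, Eq. (4.40), for the trial potential
    phi(x) = V*(1 - x/length) + a*sin(pi*x/length), which keeps phi(0) = V, phi(length) = 0."""
    h = length / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h                      # midpoint rule
        dphi_dx = -V / length + a * math.pi / length * math.cos(math.pi * x / length)
        total += sigma * dphi_dx**2 * h        # p = sigma * E^2, Eq. (4.39)
    return area * total

R_wire = 1.0 / (1.0 * 1.0)                     # R = l / (sigma * A), Eq. (4.21)
for a in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(f"a = {a:+.1f}: P = {dissipated_power(a):.5f}")
```

Within this family, the power is the lowest at a = 0, i.e. for the actual (linear) distribution, where it equals the familiar V²/R.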
17 If this electric field and hence the electrostatic energy are time-independent, the energy is replenished at the
same rate from the current source(s).
18 Named after James Prescott Joule, who quantified this effect in 1841.
4.6. Calculate the effective (average) conductivity σef of a
medium with many empty spherical cavities of radius R, carved at
random positions in a uniform Ohmic conductor (see the figure on the
right), in the limit of a low density n << R⁻³ of the cavities.
Hint: You may like to use the analogy with an electric-dipole
medium – see, e.g., Sec. 3.2.
4.7. In two separate experiments, a narrow gap, possibly of irregular width, between two close,
perfectly conducting electrodes is filled with some material: in the first case, a uniform linear dielectric
with an electric permittivity ε, and in the second case, a uniform conducting material with an Ohmic
conductivity σ. Neglecting the fringe effects, calculate the relation between the mutual capacitance C
between the electrodes (in the first case) and the dc resistance R between them (in the second case).
4.9. Calculate the distribution of the dc current’s density in a thin, round,
uniform resistive disk, if the current is inserted into some point at the disk’s rim,
and picked up in its center – see the figure on the right.
4.10. DC current is passed between two point electrodes
connected to a wide, thin, uniform resistive sheet – see the figure on
the right. Use the model solution of the previous problem to prove,
without much new calculation, that cutting a round hole in the sheet
(outside of the current injection/extraction points) doubles the voltage
between any two points on its border.
4.13.* The simplest reasonable model of a vacuum diode consists of two parallel planar metallic
electrodes of area A, separated by a gap of thickness d << A^(1/2): a “cathode” that emits electrons into the
gap, and an “anode” that absorbs the electrons arriving from the gap at its surface. Calculate the dc I-V
curve of the diode, i.e. the relation between the average current I flowing between the electrodes and the
dc voltage V applied between them, using the following simplifying assumptions:
(i) due to the effect of the negative space charge of the emitted electrons, the current I is much
lower than the emission ability of the cathode,
(ii) the initial velocity of the emitted electrons is negligible, and
(iii) the direct Coulomb interaction of electrons (besides the space charge effect) is negligible.
4.14.* Calculate the space-charge-limited current in a system with the same geometry as in the
previous problem, and using the same assumptions besides that now the emitted charge carriers do not
fly ballistically, but rather drift in accordance with the Ohm law, with the conductivity given by Eq.
(13): σ = q²nμ, with a constant mobility μ.19
Hint: In order to get a realistic result, assume that the medium in which the charge carriers move
has a certain dielectric constant κ unrelated to the carriers.
19 As was mentioned in Sec. 2, the approximation of a constant (in particular, field- and charge-density-
independent) mobility is most suitable for semiconductors.
4.15. Prove that the distribution of dc currents in a uniform Ohmic conductor with a given
voltage distribution along its surface corresponds to the minimum of the total energy dissipation rate
(“Joule heat”).
Chapter 5. Magnetism
Even though this chapter addresses a completely new type of electric charge interaction, its discussion
(for the stationary case) will take not too much time/space, because it recycles many ideas and methods
of electrostatics, though with a twist or two.
Fig. 5.1. Magnetic interaction of two currents.
According to the Coulomb law, there is no electrostatic force between them. However, several
experiments carried out in 1820¹ proved that there is a different, magnetic interaction between the
currents. In the present-day notation, the results of all such experiments may be summarized with just
one formula, in SI units expressed as
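$$\mathbf{F} = -\frac{\mu_0}{4\pi}\int_V d^3r \int_{V'} d^3r'\, \left[\mathbf{j}(\mathbf{r})\cdot\mathbf{j}'(\mathbf{r}')\right]\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}. \tag{5.1}$$
Here the coefficient μ0/4π (where μ0 is called either the magnetic constant or the free space
permeability) equals almost exactly 10⁻⁷ SI units, with the product ε0μ0 equal to exactly 1/c².²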
Note a close similarity of this expression to the Coulomb law (1.1) rewritten for the interaction
of two continuously distributed charges, with the account of the linear superposition principle (1.4):
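$$\mathbf{F} = \frac{1}{4\pi\varepsilon_0}\int_V d^3r \int_{V'} d^3r'\, \rho(\mathbf{r})\rho'(\mathbf{r}')\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}. \tag{5.2}$$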
1 Most notably, by Hans Christian Ørsted who discovered the effect of electric currents on magnetic needles, and
André-Marie Ampère who extended this work by finding the magnetic interaction between two currents.
2 For details, see Appendix UCA: Selected Units and Constants. In the Gaussian units, the coefficient μ0/4π in
Eq. (1) and beyond is replaced with 1/c².
Besides the different coefficient and a different sign, the “only” difference of Eq. (1) from Eq. (2) is the
scalar product of the current densities, evidently necessary because of their vector character. We will see
soon that this difference brings certain complications in applying the approaches discussed in the
previous chapters, to magnetostatics.
Before going to their discussion, let us have one more glance at the coefficients in Eqs. (1) and
(2). To compare them, let us consider two objects with uncompensated charge distributions ρ(r) and
ρ’(r), moving parallel to each other with certain velocities v and v’, as measured in the same inertial
(“laboratory”) reference frame. In this case, j(r) = ρ(r)v, so j(r)·j’(r’) = ρ(r)ρ’(r’)vv’, and the integrals in
Eqs. (1) and (2) become functionally similar, differing only by the factor
$$\frac{F_\text{magnetic}}{F_\text{electric}} = -\frac{\mu_0}{4\pi}vv' \Big/ \frac{1}{4\pi\varepsilon_0} = -\frac{vv'}{c^2}. \tag{5.3}$$
(The last expression is valid in any consistent system of units.) We immediately see that magnetism is
an essentially relativistic phenomenon, very weak in comparison with the electrostatic interaction at the
human scale velocities, v << c, and may dominate only if the latter interaction vanishes – as it does in
electroneutral systems.3 The discovery and initial studies4 of such a subtle, relativistic phenomenon as
magnetism were much facilitated by the relative abundance of natural ferromagnets: materials with a
spontaneous magnetic polarization, whose strong magnetic field is due to relativistic effects (such as
spin) inside the constituent atoms – see Sec. 5 below.
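To see how small the ratio (3) is at everyday velocities, it may be evaluated for a couple of cases. The speeds in the sketch below are assumed, illustrative values (a drift velocity of the order of 0.1 mm/s for electrons in a metallic wire, and a beam at one-tenth of the speed of light), and the function name is mine.

```python
C = 299_792_458.0   # speed of light, m/s

def magnetic_to_electric_ratio(v, v_prime):
    """Eq. (5.3): F_magnetic / F_electric = -v v' / c^2."""
    return -v * v_prime / C**2

# Assumed illustrative speeds: a typical drift velocity of electrons in a metal wire
# (~0.1 mm/s) and an electron beam at one tenth of the speed of light.
for label, v in (("drift in a wire", 1e-4), ("fast electron beam", 0.1 * C)):
    print(f"{label}: ratio = {magnetic_to_electric_ratio(v, v):.2e}")
```

For the wire, the ratio is tiny, so the magnetic force between wires is observable only because the electrostatic interaction of electroneutral conductors vanishes.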
Also, Eq. (3) points to an interesting paradox. Consider two electron beams moving parallel to
each other, with the same velocity v with respect to a lab reference frame. Then, according to Eq. (3),
the net force of their total (electric plus magnetic) interaction is proportional to (1 – v²/c²), tending to
zero in the limit v → c. However, in the reference frame moving together with the electrons, they are not
moving at all, i.e. v = 0. Hence, from the point of view of such a moving observer, the electron beams
should interact only electrostatically, with a repulsive force independent of the velocity v. Historically,
this had been one of several paradoxes that led to the development of special relativity; its resolution
will be discussed in Chapter 9 devoted to this theory.
Returning to Eq. (1), in some simple cases the double integration in it may be carried out
analytically. First of all, let us simplify this expression for the case of two thin, long conductors
(“wires”) separated by a distance much larger than their thickness. In this case, we may integrate the
products jd³r and j’d³r’ over the wires’ cross-sections first, neglecting the corresponding change of the
factor (r – r’). Since the integrals of the current density over the cross-sections of the wires are just the
currents I and I’ flowing in the wires, and cannot change along their lengths (say, l and l’, respectively),
they may be taken out of the remaining integrals, reducing Eq. (1) to
$$\mathbf{F} = -\frac{\mu_0 II'}{4\pi}\oint_l \oint_{l'} (d\mathbf{r}\cdot d\mathbf{r}')\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}. \tag{5.4}$$
3 An important case when the electroneutrality may not hold is the motion of electrons in free space. (However, in
this case, the electron speed is often comparable with the speed of light, so the magnetic forces may be
comparable in strength with electrostatic forces, and hence important.) Minor local violations of electroneutrality
also play an important role in some semiconductor devices – see, e.g., SM Chapter 6.
4 The first detailed book on this subject, De Magnete by William Gilbert (a.k.a. Gilberd), was published as early
as 1600.
As the simplest example, consider two straight, parallel wires (Fig. 2) separated by distance d,
both with length l >> d.
Fig. 5.2. The magnetic force between two straight parallel currents.
Due to the symmetry of this system, the vector of the magnetic interaction force has to:
(i) lie in the same plane as the currents, and
(ii) be normal to the wires – see Fig. 2.
Hence we may limit our calculations to just one component of the force – normal to the wires. Using the
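fact that with the coordinate choice shown in Fig. 2, the scalar product dr·dr’ is just dxdx’, we get
$$F = -\frac{\mu_0 II'}{4\pi}\int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty}dx'\,\frac{\sin\theta}{d^2 + (x - x')^2} = -\frac{\mu_0 II'}{4\pi}\int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty}dx'\,\frac{d}{\left[d^2 + (x - x')^2\right]^{3/2}}. \tag{5.5}$$
Now introducing, instead of x’, a new, dimensionless variable ξ ≡ (x – x’)/d, we may reduce the internal
integral to a table one, which we have already encountered in this course:
$$F = -\frac{\mu_0 II'}{4\pi d}\int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty}\frac{d\xi}{(1 + \xi^2)^{3/2}} = -\frac{\mu_0 II'}{2\pi d}\int_{-\infty}^{+\infty}dx. \tag{5.6}$$
The integral over x formally diverges, but it gives a finite interaction force per unit length of the wires:
$$\frac{F}{l} = -\frac{\mu_0 II'}{2\pi d}. \tag{5.7}$$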
Note that the force drops rather slowly (only as 1/d) as the distance d between the wires is increased,
and is attractive (rather than repulsive as in the Coulomb law) if the currents are of the same sign.
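The table integral in Eq. (6) and the resulting force (7) are simple to verify numerically. In the sketch below, the substitution ξ = tan u (my choice, made to map the infinite interval onto a finite one) turns the integrand into cos u; the function names and the 1-A, 1-m example are assumptions, with μ0 taken as 4π×10⁻⁷ SI units, i.e. its "almost exact" value.

```python
import math

MU0 = 4e-7 * math.pi   # magnetic constant, SI units ("almost exactly", see the text)

def table_integral(steps=100_000):
    """Integral of d(xi) / (1 + xi^2)^(3/2) over the whole axis.
    With xi = tan(u), it becomes the integral of cos(u) over (-pi/2, +pi/2)."""
    h = math.pi / steps
    return h * sum(math.cos(-math.pi / 2 + (i + 0.5) * h) for i in range(steps))

def force_per_length(current_1, current_2, d):
    """Eqs. (5.6)-(5.7); a negative value means attraction."""
    return -MU0 * current_1 * current_2 * table_integral() / (4 * math.pi * d)

print(table_integral())                 # close to 2
print(force_per_length(1.0, 1.0, 1.0))  # close to -2e-7 N/m
```

The last number is just the force used in the former legal definition of the ampere, mentioned in the footnote to Eq. (7).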
This is an important result,5 but again, the problems so simply solvable are few and far between,
and it is intuitively clear that we would strongly benefit from the same approach as in electrostatics, i.e.,
from decomposing Eq. (1) into a product of two factors via the introduction of a suitable field. Such
decomposition may be done as follows:
$$\mathbf{F} = \int_V \mathbf{j}(\mathbf{r})\times\mathbf{B}(\mathbf{r})\,d^3r, \tag{5.8}$$
5 In particular, until very recently (2018), Eq. (7) was used for the legal definition of the SI unit of current, the
ampere (A), via the SI unit of force (the newton, N), with the coefficient μ0 considered exactly fixed. (A brief
description of the recent changes in legal metrology is given in Appendix UCA.)
where the vector B is called the magnetic field.6 In the case when it is induced by the current j’:
$$\mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int_{V'}\mathbf{j}'(\mathbf{r}')\times\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}\,d^3r'. \tag{5.9}$$
The last relation is called the Biot-Savart law,7 while the force F expressed by Eq. (8) is sometimes
called the Lorentz force.8 However, more frequently the latter term is reserved for the full force,
$$\mathbf{F} = q\left(\mathbf{E} + \mathbf{v}\times\mathbf{B}\right), \tag{5.10}$$
exerted by electric and magnetic fields on a point charge q, moving with velocity v.9
Now we have to prove that the new formulation, given by Eqs. (8)-(9), is equivalent to Eq. (1).
At first glance, this seems unlikely. Indeed, first of all, Eqs. (8) and (9) involve vector products, while
Eq. (1) is based on a scalar product. More profoundly, in contrast to Eq. (1), Eqs. (8) and (9) do not
satisfy the 3rd Newton law applied to elementary current components jd³r and j’d³r’, if these vectors
are not parallel to each other. Indeed, consider the situation shown in Fig. 3.
Fig. 5.3. The apparent violation of the 3rd Newton law in magnetism.
Here the vector j’ is perpendicular to the vector (r – r’), and hence, according to Eq. (9),
produces a non-zero contribution dB’ to the magnetic field directed (in Fig. 3) normally to the plane of
the drawing, i.e. perpendicular to the vector j. Hence, according to Eq. (8), this field provides a non-zero
contribution to F. On the other hand, if we calculate the reciprocal force F’ by swapping the prime
indices in Eqs. (8) and (9), the latter equation immediately shows that dB(r’) ∝ j×(r’ – r) = 0, because
the two operand vectors are parallel – see Fig. 3 again. Hence, the current component j’d³r’ does exert a
force on its counterpart, while jd³r does not.
6 The SI unit of the magnetic field is called tesla (T) – after Nikola Tesla, a pioneer of electrical engineering. In
the Gaussian units, the already discussed constant 1/c2 in Eq. (1) is equally divided between Eqs. (8) and (9), so in
them both, the constant before the integral is 1/c. The resulting Gaussian unit of the field B is called gauss (G);
taking into account the difference of units of electric charge and length, and hence of the current density, 1 G
equals exactly 10⁻⁴ T. Note also that in some textbooks, especially old ones, B is called either the magnetic
induction or the magnetic flux density, while the term “magnetic field” is reserved for the field H that will be
introduced in Sec. 5 below.
7 Named after Jean-Baptiste Biot and Félix Savart who made several key contributions to the theory of magnetic
interactions – in the same notorious 1820.
8 Named after Hendrik Antoon Lorentz, famous mostly for his numerous contributions to the development of
special relativity – see Chapter 9 below. To be fair, the magnetic part of the Lorentz force was implicitly
described in a much earlier (1865) paper by J. C. Maxwell and then spelled out by Oliver Heaviside (another
genius of electrical engineering – and mathematics!) in 1889, i.e. also before the 1895 work by H. Lorentz.
9 From the magnetic part of Eq. (10), Eq. (8) may be derived by the elementary summation of all forces acting on
n >> 1 particles in a unit volume, with j = qnv – see the footnote on Eq. (4.13a). On the other hand, the reciprocal
derivation of Eq. (10) from Eq. (8) with j = qvδ(r – r0), where r0 is the current particle’s position (so dr0/dt = v),
requires certain mathematical care and will be performed in Chapter 9.
Despite this apparent problem, let us still go ahead and plug Eq. (9) into Eq. (8):
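$$\mathbf{F} = \frac{\mu_0}{4\pi}\int_V d^3r\int_{V'}d^3r'\,\mathbf{j}(\mathbf{r})\times\left[\mathbf{j}'(\mathbf{r}')\times\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}\right]. \tag{5.11}$$
This double vector product may be transformed into two scalar products, using the vector algebraic
identity called the bac minus cab rule, a×(b×c) = b(a·c) – c(a·b).10 Applying this relation, with a = j, b
= j’, and c = R ≡ r – r’, to Eq. (11), we get
$$\mathbf{F} = \frac{\mu_0}{4\pi}\int_{V'}d^3r'\,\mathbf{j}'(\mathbf{r}')\int_V d^3r\,\frac{\mathbf{j}(\mathbf{r})\cdot\mathbf{R}}{R^3} - \frac{\mu_0}{4\pi}\int_V d^3r\int_{V'}d^3r'\,\left[\mathbf{j}(\mathbf{r})\cdot\mathbf{j}'(\mathbf{r}')\right]\frac{\mathbf{R}}{R^3}. \tag{5.12}$$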
The second term on the right-hand side of this equality coincides with the right-hand side of Eq. (1),
while the first term equals zero because its internal integral vanishes. Indeed, we may break the volumes
V and V’ into narrow current tubes – the stretched elementary volumes whose walls are not crossed by
current lines (so on their walls, jn = 0). As a result, the elementary current in each tube, dI = jdA = jd²r,
is the same along its length, and, just as in a thin wire, jd³r may be replaced with dIdr, with the vector
dr directed along j. Because of this, each tube’s contribution to the internal integral in the first term of
Eq. (12) may be represented as
$$dI\oint_l \frac{\mathbf{R}}{R^3}\cdot d\mathbf{r} = -dI\oint_l \nabla\frac{1}{R}\cdot d\mathbf{r} = -dI\oint_l \frac{\partial}{\partial\mathbf{r}}\frac{1}{R}\cdot d\mathbf{r}, \tag{5.13}$$
where the operator ∇ acts in the r-space, and the integral is taken along the tube’s length l. Due to the
current continuity expressed by Eq. (4.6), each tube should follow a closed contour, and an integral of a
full differential of some scalar function (in our case, of 1/R) along such contour equals zero.
So we have recovered Eq. (1). Returning for a minute to the paradox illustrated in Fig. 3, we may
conclude that the apparent violation of the 3rd Newton law was the artifact of our interpretation of Eqs.
(8) and (9) as the sums of independent elementary components. In reality, due to the dc current
continuity, these components are not independent. For the whole currents, Eqs. (8)-(9) do obey the 3rd
law – as follows from their already proved equivalence to Eq. (1).
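The resolution of the paradox can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names, the two example loops, and the midpoint discretization are all choices made here, not taken from the text. It evaluates the elementary force j × (j’ × R)/R³ for a pair of perpendicular current elements, where the two forces are not equal and opposite, and then sums the same expression over two closed circular loops, where the net forces are.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in SI units (assumed value)

def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def element_force(r_a, dr_a, r_b, dr_b, i_a=1.0, i_b=1.0):
    """Force on the element i_a*dr_a at r_a due to the element i_b*dr_b at r_b."""
    R = tuple(p - q for p, q in zip(r_a, r_b))
    R3 = sum(c * c for c in R) ** 1.5
    k = MU0 * i_a * i_b / (4.0 * math.pi * R3)
    return tuple(k * c for c in cross(dr_a, cross(dr_b, R)))

def circle(center, radius, n, tilt=0.0):
    """Midpoints and elements dr of a circular loop tilted by `tilt` about the x-axis."""
    pts, drs = [], []
    dphi = 2.0 * math.pi / n
    ct, st = math.cos(tilt), math.sin(tilt)
    for i in range(n):
        phi = (i + 0.5) * dphi
        y, dy = radius * math.sin(phi), radius * math.cos(phi) * dphi
        pts.append((center[0] + radius * math.cos(phi),
                    center[1] + ct * y, center[2] + st * y))
        drs.append((-radius * math.sin(phi) * dphi, ct * dy, st * dy))
    return pts, drs

def loop_force(loop_a, loop_b):
    """Net force on loop a from loop b (unit currents), summed over all element pairs."""
    total = [0.0, 0.0, 0.0]
    for r_a, dr_a in zip(*loop_a):
        for r_b, dr_b in zip(*loop_b):
            f = element_force(r_a, dr_a, r_b, dr_b)
            for m in range(3):
                total[m] += f[m]
    return tuple(total)

# Two perpendicular elements: the pair forces are not equal and opposite.
f_ab = element_force((0, 0, 0), (1e-3, 0, 0), (1, 0, 0), (0, 1e-3, 0))
f_ba = element_force((1, 0, 0), (0, 1e-3, 0), (0, 0, 0), (1e-3, 0, 0))

# Two closed loops (hypothetical geometry): the net forces are equal and opposite.
loop1 = circle((0.0, 0.0, 0.0), 1.0, 120)
loop2 = circle((0.3, 0.0, 1.5), 0.7, 120, tilt=0.4)
f12 = loop_force(loop1, loop2)
f21 = loop_force(loop2, loop1)
```

Comparing `f12` with `f21` component by component shows the cancellation, while `f_ab` and `f_ba` show that it fails for isolated elements.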
Thus it is possible to break the magnetic interaction into two effects: the induction of the
magnetic field B by one current (in our notation, j’), and the effect of this field on the other current (j).
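Now comes an additional experimental fact: other elementary components $\mathbf{j}\,d^3r'$ of the current j(r) also
contribute to the magnetic field (9) acting on the component $\mathbf{j}\,d^3r$.11 This fact allows us to drop the prime
sign after j in Eq. (9), and rewrite Eqs. (8) and (9) as

$$\mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int_{V'}\mathbf{j}(\mathbf{r}')\times\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\,d^3r', \tag{5.14}$$

$$\mathbf{F} = \int_V \mathbf{j}(\mathbf{r})\times\mathbf{B}(\mathbf{r})\,d^3r. \tag{5.15}$$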
Again, the field observation point r and the field source point r’ have to be clearly distinguished. We
immediately see that these expressions are close to, but still different from the corresponding relations of
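the electrostatics, namely Eq. (1.9) and the distributed-charge version of Eq. (1.6):

$$\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int_{V'}\rho(\mathbf{r}')\,\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\,d^3r', \tag{5.16}$$

$$\mathbf{F} = \int_V \rho(\mathbf{r})\,\mathbf{E}(\mathbf{r})\,d^3r. \tag{5.17}$$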
(Note that the sign difference has disappeared, at the cost of the replacement of scalar-by-vector
multiplications in electrostatics with cross-products of vectors in magnetostatics.)
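For the frequent particular case of a thin wire of length l’, Eq. (14) may be re-written as

$$\mathbf{B}(\mathbf{r}) = \frac{\mu_0 I}{4\pi}\int_{l'} d\mathbf{r}'\times\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}. \tag{5.18}$$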
Let us see how this formula works for the simplest case of a straight wire (Fig. 4a). The magnetic field
contributions dB due to all small fragments dr’ of the wire’s length are directed along the same line
(perpendicular to both the wire and the shortest distance d from the observation point to the wire’s line),
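and each of them has the magnitude

$$dB = \frac{\mu_0 I}{4\pi}\,\frac{dx'}{|\mathbf{r}-\mathbf{r}'|^2}\,\sin\theta = \frac{\mu_0 I}{4\pi}\,\frac{dx'}{d^2+x'^2}\,\frac{d}{(d^2+x'^2)^{1/2}}. \tag{5.19}$$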
Fig. 5.4. Calculating magnetic fields: (a) of a straight current, and (b) of a current loop.
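Summing up all such contributions along an infinitely long wire, we get

$$B = \frac{\mu_0 I d}{4\pi}\int_{-\infty}^{+\infty}\frac{dx'}{(x'^2+d^2)^{3/2}} = \frac{\mu_0 I}{2\pi d}. \tag{5.20}$$

This is a simple but important result. (Note that it is only valid for very long (l >> d), straight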
wires.) It is especially crucial to note the “vortex” character of the field: its lines go around the wire,
forming rings with the centers on the current line. This is in sharp contrast to the electrostatic field lines,
which can only begin and end on electric charges and never form closed loops (otherwise the Coulomb
force qE would not be conservative). In the magnetic case, the vortex structure of the field may be
reconciled with the potential character of the magnetic forces, which is evident from Eq. (1), due to the
vector products in Eqs. (14)-(15).
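To see how fast a finite wire approaches the long-wire limit, Eq. (19) can be summed numerically. The following is a minimal sketch, assuming a wire segment centered on the foot of the perpendicular from the observation point; the function names and the midpoint rule are illustrative choices, not part of the text.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in SI units (assumed value)

def b_straight_wire(current, d, half_length, steps=100_000):
    """Midpoint-rule sum of Eq. (5.19) over the segment -half_length < x' < half_length."""
    dx = 2.0 * half_length / steps
    total = 0.0
    for i in range(steps):
        x = -half_length + (i + 0.5) * dx
        total += d * dx / (d * d + x * x) ** 1.5
    return MU0 * current / (4.0 * math.pi) * total

def b_infinite_wire(current, d):
    """Long-wire limit B = mu0 I / (2 pi d)."""
    return MU0 * current / (2.0 * math.pi * d)
```

For example, `b_straight_wire(1.0, 0.01, 10.0) / b_infinite_wire(1.0, 0.01)` is very close to 1, while for a segment with `half_length` equal to `d` the ratio drops to about 0.707, which illustrates the condition l >> d.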
Now we may readily use Eq. (15), or rather its thin-wire version

$$\mathbf{F} = I\int_l d\mathbf{r}\times\mathbf{B}(\mathbf{r}), \tag{5.21}$$

to apply Eq. (20) to the two-wire problem (Fig. 2). Since for the second wire, the vectors dr and B are
perpendicular to each other, we immediately arrive at our previous result (7), which was obtained
directly from Eq. (1).
The next important example of the application of the Biot-Savart law (14) is the magnetic field at
the axis of a circular current loop (Fig. 4b). Due to the problem’s symmetry, the net field B has to be
directed along the axis, but each of its elementary components dB is tilted by the angle θ = tan⁻¹(z/R) to
this axis, so its axial component is

$$dB_z = dB\cos\theta = \frac{\mu_0 I}{4\pi}\,\frac{dr'}{R^2+z^2}\,\frac{R}{(R^2+z^2)^{1/2}}. \tag{5.22}$$

Since the denominator of this expression remains the same for all wire components dr’, the integration
over r’ is easy ($\oint dr' = 2\pi R$), giving finally

$$B = \frac{\mu_0 I}{2}\,\frac{R^2}{(R^2+z^2)^{3/2}}. \tag{5.23}$$
Note that the magnetic field in the loop’s center (i.e., for z = 0),

$$B = \frac{\mu_0 I}{2R}, \tag{5.24}$$

is π times higher than that due to a similar current in a straight wire, at the distance d = R from it. This
difference is readily understandable, since all elementary components of the loop are at the same
distance R from the observation point, while in the case of a straight wire, all its points but one are
separated from the observation point by distances larger than d.
Another notable fact is that at large distances (z² >> R²), the field (23) is proportional to $|z|^{-3}$:

$$B \approx \frac{\mu_0 I}{2}\,\frac{R^2}{|z|^3} = \frac{\mu_0}{4\pi}\,\frac{2m}{|z|^3}, \quad \text{with } m \equiv IA, \tag{5.25}$$

where A = πR² is the loop area. Comparing this expression with Eq. (3.13), for the particular case θ = 0,
we see that such field is similar to that of an electric dipole (at least along its direction), with the
replacement of the electric dipole moment magnitude p with the m so defined – besides the front factor.
Indeed, such a plane current loop is the simplest example of a system whose field, at distances much
larger than R, is that of a magnetic dipole, with a dipole moment m – the notions to be discussed in
much more detail in Sec. 4 below.
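The on-axis results can be cross-checked by summing the thin-wire Biot-Savart law (18) over short elements of the loop. This is a hedged sketch: the function names and the discretization are assumptions made for illustration only.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in SI units (assumed value)

def loop_axis_field_numeric(current, radius, z, segments=2000):
    """Sum Eq. (5.18) over short elements of a circular loop; returns B_z at (0, 0, z)."""
    dphi = 2.0 * math.pi / segments
    total = 0.0
    for i in range(segments):
        phi = (i + 0.5) * dphi
        # source point r' and element dr' on the loop (in the z = 0 plane)
        xs, ys = radius * math.cos(phi), radius * math.sin(phi)
        dx, dy = -radius * math.sin(phi) * dphi, radius * math.cos(phi) * dphi
        # vector r - r' from the source point to the observation point
        rx, ry, rz = -xs, -ys, z
        dist3 = (rx * rx + ry * ry + rz * rz) ** 1.5
        total += (dx * ry - dy * rx) / dist3  # z-component of dr' x (r - r')
    return MU0 * current / (4.0 * math.pi) * total

def loop_axis_field(current, radius, z):
    """Closed form, Eq. (5.23)."""
    return 0.5 * MU0 * current * radius**2 / (radius**2 + z**2) ** 1.5

def dipole_axis_field(current, radius, z):
    """Large-distance form, Eq. (5.25), with m = I * pi * R^2."""
    m = current * math.pi * radius**2
    return MU0 / (4.0 * math.pi) * 2.0 * m / abs(z) ** 3
```

At z = 100R the dipole form differs from Eq. (23) by roughly 1.5 parts in 10⁴, consistent with the condition z² >> R².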
substantially simplified using the Gauss law (1.16). A similar relation exists in magnetostatics as well,
but has a different form, due to the vortex character of the magnetic field.
To derive it, let us notice that in an analogy with the scalar case, the vector product under the
integral (14) may be transformed as

$$\mathbf{j}(\mathbf{r}')\times\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3} = \nabla\times\frac{\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}, \tag{5.26}$$

where the operator ∇ acts in the r-space. (This equality may be readily verified by Cartesian
components, noticing that the current density is a function of r’ and hence its components are
independent of r.) Plugging Eq. (26) into Eq. (14), and moving the operator ∇ out of the integral over r’,
we see that the magnetic field may be represented as the curl of another vector field – the so-called
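vector potential, defined as:12

$$\mathbf{B}(\mathbf{r}) = \nabla\times\mathbf{A}(\mathbf{r}), \tag{5.27}$$

and in our current case equal to

$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int_{V'}\frac{\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^3r'. \tag{5.28}$$
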
Please note a beautiful analogy between Eqs. (27)-(28) and, respectively, Eqs. (1.33) and (1.38).13 This
analogy implies that the vector potential A plays, for the magnetic field, essentially the same role as the
scalar potential φ plays for the electric field (hence the name “potential”), with due respect to the vortex
character of B. This notion will be discussed in more detail below.
Now let us see what equations we may get for the spatial derivatives of the magnetic field. First,
vector algebra says that the divergence of any curl is zero.14 In application to Eq. (27), this means that

$$\nabla\cdot\mathbf{B} = 0. \tag{5.29}$$

Comparing this equation with Eq. (1.27), we see that Eq. (29) may be interpreted as the absence of a
magnetic analog of an electric charge, on which magnetic field lines could originate or end. Numerous
searches for such hypothetical magnetic charges, called magnetic monopoles, using very sensitive and
sophisticated experimental setups,15 have not given any reliable evidence of their existence in Nature.
Proceeding to the alternative, vector derivative of the magnetic field, i.e. to its curl, and using
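Eq. (28), we obtain

$$\nabla\times\mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi}\,\nabla\times\left(\nabla\times\int_{V'}\frac{\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^3r'\right). \tag{5.30}$$

This expression may be simplified by using the following general vector identity:16

$$\nabla\times(\nabla\times\mathbf{c}) = \nabla(\nabla\cdot\mathbf{c}) - \nabla^2\mathbf{c}, \tag{5.31}$$

applied to vector $\mathbf{c}(\mathbf{r}) \equiv \mathbf{j}(\mathbf{r}')/|\mathbf{r}-\mathbf{r}'|$: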
12 In the Gaussian units, Eq. (27) remains the same, and hence in Eq. (28), μ0/4π is replaced with 1/c.
13 In Eq. (1.38), there was no real need for the additional clarification provided by the integration volume label V’.
14 See, e.g., MA Eq. (11.2).
15 For a recent example, see B. Acharya et al., Nature 602, 63 (2022).
16 See, e.g., MA Eq. (11.3).
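
$$\nabla\times\mathbf{B} = \frac{\mu_0}{4\pi}\,\nabla\int_{V'}\mathbf{j}(\mathbf{r}')\cdot\nabla\frac{1}{|\mathbf{r}-\mathbf{r}'|}\,d^3r' - \frac{\mu_0}{4\pi}\int_{V'}\mathbf{j}(\mathbf{r}')\,\nabla^2\frac{1}{|\mathbf{r}-\mathbf{r}'|}\,d^3r'. \tag{5.32}$$
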
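As was already discussed during our study of electrostatics in Sec. 3.1,

$$\nabla^2\frac{1}{|\mathbf{r}-\mathbf{r}'|} = -4\pi\,\delta(\mathbf{r}-\mathbf{r}'), \tag{5.33}$$

so the last term of Eq. (32) is just μ0j(r). On the other hand, inside the first integral, we can replace ∇
with (–∇’), where prime means the differentiation in the space of the radius vectors r’. Integrating that
term by parts, we get

$$\nabla\times\mathbf{B} = -\frac{\mu_0}{4\pi}\,\nabla\left[\oint_{S'}\frac{j_n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^2r' - \int_{V'}\frac{\nabla'\cdot\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^3r'\right] + \mu_0\,\mathbf{j}(\mathbf{r}). \tag{5.34}$$
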
Applying this equality to the volume V’ limited by a surface S’ either sufficiently distant from the field
concentration or with no current crossing it, we may neglect the first term on the right-hand side of Eq.
(34), while the second term always equals zero in statics, due to the dc charge continuity – see Eq. (4.6).
As a result, we arrive at a very simple differential equation17

$$\nabla\times\mathbf{B} = \mu_0\,\mathbf{j}. \tag{5.35}$$

This is (the dc form of) the inhomogeneous Maxwell equation – which in magnetostatics plays a
role similar to Eq. (1.27) in electrostatics. Let me display, for the first time in this course, this
fundamental system of equations (at this stage, for statics only), and give the reader a minute to stare, in
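silence, at their beautiful symmetry – which has inspired so much of the later development of physics:

$$\begin{aligned}\nabla\times\mathbf{E} &= 0, & \nabla\times\mathbf{B} &= \mu_0\,\mathbf{j},\\ \nabla\cdot\mathbf{E} &= \frac{\rho}{\varepsilon_0}, & \nabla\cdot\mathbf{B} &= 0.\end{aligned} \tag{5.36}$$
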
Their only asymmetry, two zeros on the right-hand sides (for the magnetic field’s divergence and
electric field’s curl), is due to the absence in the Nature of magnetic monopoles and their currents. I will
discuss these equations in more detail in Sec. 6.7, after the first two equations (for the fields’ curls) have
been generalized to their full, time-dependent versions.
Returning now to our current, more mundane but important task of calculating the magnetic field
induced by simple current configurations, we can benefit from an integral form of Eq. (35). For that, let
us integrate this equation over an arbitrary surface S limited by a closed contour C, and apply to the
result the Stokes theorem.18 The resulting expression,

$$\oint_C \mathbf{B}\cdot d\mathbf{r} = \mu_0\int_S j_n\,d^2r \equiv \mu_0 I, \tag{5.37}$$

where I is the net electric current crossing surface S, is called the Ampère law.
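A quick numerical check of Eq. (37): taking the known field of a long straight wire, the circulation of B around a square contour should equal μ0I when the contour encloses the wire and vanish when it does not, whatever the contour’s position. The sketch below assumes a wire along the z-axis through the origin; the function names and the midpoint rule are illustrative.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in SI units (assumed value)

def wire_field(current, x, y):
    """In-plane field (Bx, By) of a long straight wire along z through the origin."""
    k = MU0 * current / (2.0 * math.pi * (x * x + y * y))
    return (-k * y, k * x)

def circulation(current, cx, cy, half_side, steps=5000):
    """Counterclockwise line integral of B.dr around a square centered at (cx, cy)."""
    corners = [(cx - half_side, cy - half_side), (cx + half_side, cy - half_side),
               (cx + half_side, cy + half_side), (cx - half_side, cy + half_side)]
    total = 0.0
    for n in range(4):
        (x0, y0), (x1, y1) = corners[n], corners[(n + 1) % 4]
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for i in range(steps):
            bx, by = wire_field(current, x0 + (i + 0.5) * dx, y0 + (i + 0.5) * dy)
            total += bx * dx + by * dy
    return total
```

Here `circulation(3.0, 0.3, 0.2, 1.0)` reproduces μ0 times 3 A to high accuracy, while `circulation(3.0, 3.0, 0.0, 1.0)`, whose contour does not enclose the wire, is negligibly small.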
17 As in all earlier formulas for the magnetic field, in the Gaussian units, the coefficient μ0 in this relation is
replaced with 4π/c.
18 See, e.g., MA Eq. (12.1) with f = B.
As the first example of its application, let us return to the current in a straight wire (Fig. 4a).
With the Ampère law in our arsenal, we can readily pursue an even more ambitious goal than that
achieved in the previous section, namely to calculate the magnetic field both outside and inside of a wire
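of an arbitrary radius R, with an arbitrary (albeit axially-symmetric) current distribution j(ρ) – see Fig. 5.

[Fig. 5.5: a straight wire of radius R along the z-axis, carrying the current density j, with circular Ampère contours C of radii ρ < R and ρ > R.]
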
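Selecting the Ampère-law contour C in the form of a ring of some radius ρ in the plane normal to
the wire’s axis z, we have B·dr = Bρdφ, where φ is the azimuthal angle, so Eq. (37) yields:

$$2\pi\rho B = \mu_0\times\begin{cases}2\pi\displaystyle\int_0^{\rho} j(\rho')\,\rho'\,d\rho', & \text{for } \rho\le R,\\[2ex] 2\pi\displaystyle\int_0^{R} j(\rho')\,\rho'\,d\rho' \equiv I, & \text{for } \rho\ge R.\end{cases} \tag{5.38}$$
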
Thus we have not only recovered our previous result (20), with the notation replacement d → ρ, in a
much simpler way but also have been able to calculate the magnetic field’s distribution inside the wire.
In the most common particular case when the current is uniformly distributed along its cross-section,
j(ρ) = const, the first of Eqs. (38) immediately yields B ∝ ρ for ρ ≤ R.
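Eq. (38) translates directly into a short routine for any axially symmetric current profile. The sketch below is illustrative only; the names `enclosed_current` and `wire_field`, and the midpoint integration, are assumptions made here rather than anything prescribed by the text.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in SI units (assumed value)

def enclosed_current(j, rho, radius, steps=20000):
    """Current through a ring contour: 2*pi * integral of j(rho') rho' d(rho') up to min(rho, R)."""
    upper = min(rho, radius)
    h = upper / steps
    return 2.0 * math.pi * h * sum(j((i + 0.5) * h) * (i + 0.5) * h for i in range(steps))

def wire_field(j, rho, radius):
    """Azimuthal field from Eq. (5.38): B = mu0 * I_enclosed / (2 pi rho)."""
    return MU0 * enclosed_current(j, rho, radius) / (2.0 * math.pi * rho)
```

For a uniform profile, `j = lambda r: I / (math.pi * R**2)`, this reproduces B = μ0Iρ/2πR² inside the wire and B = μ0I/2πρ outside it.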
Another important system is a straight, long solenoid (Fig. 6a), with dense winding: n²A >> 1,
where n is the number of wire turns per unit length, and A is the area of the solenoid’s cross-section.
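
[Fig. 5.6: the magnetic field B of (a) a long straight solenoid and (b) a toroidal solenoid, with the Ampère contours used in the text.]
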
From the symmetry of this problem, the longitudinal (in Fig. 6a, vertical) component $B_z$ of the
magnetic field may only depend on the distance ρ of the observation point from the solenoid’s axis. First
taking a plane Ampère contour C1, with both long sides outside the solenoid, we get $B_z(\rho_2) - B_z(\rho_1) = 0$,
because the total current piercing the contour equals zero. This is only possible if $B_z = 0$ at any ρ outside
of the solenoid, provided that it is infinitely long.19 With this result on hand, from the Ampère law
applied to the contour C2, we get the following relation for the only (z-) component of the internal field:

$$Bl = \mu_0 N I, \tag{5.39}$$

where N is the number of wire turns passing through the contour of length l. This means that regardless
of the exact position of the internal side of the contour, the result is the same:

$$B = \mu_0\frac{N}{l}\,I \equiv \mu_0 n I. \tag{5.40}$$

Thus, the field inside an infinitely long solenoid (with an arbitrary shape of its cross-section) is uniform;
in this sense, a long solenoid is a magnetic analog of a wide plane capacitor, explaining why this system
is so widely used in physical experiment.
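How long must a solenoid be for Eq. (40) to hold? One hedged way to find out, for a round solenoid, is to add up the on-axis fields (23) of its individual turns. The function names and the equally spaced thin-turn model below are assumptions made for illustration.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in SI units (assumed value)

def solenoid_axis_field(current, radius, length, turns, z=0.0):
    """Sum of the on-axis loop fields, Eq. (5.23), over equally spaced turns centered at z = 0."""
    pitch = length / turns
    total = 0.0
    for k in range(turns):
        zk = -0.5 * length + (k + 0.5) * pitch
        total += radius**2 / (radius**2 + (z - zk) ** 2) ** 1.5
    return 0.5 * MU0 * current * total

def ideal_solenoid_field(current, length, turns):
    """Infinitely long solenoid, Eq. (5.40): B = mu0 * (N / l) * I."""
    return MU0 * turns / length * current
```

With length equal to 100 radii, the field at the center is within about 0.02% of Eq. (40), while at the very end of the winding (z = length/2) it drops to about one half of that value.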
As should be clear from its derivation, the obtained results, especially that the field outside of the
solenoid equals zero, are conditional on the solenoid length being very large in comparison with its
lateral size. (From Eq. (25), we may predict that for a solenoid of a finite length l, the close-range
external field is a factor of ~A/l² lower than the internal one.) A much better suppression of such
“fringe” fields may be obtained using toroidal solenoids (Fig. 6b). The application of the Ampère law to
this geometry shows that in the limit of dense winding (N >> 1), there is no fringe field at all (for any
relation between the two radii of the torus), while inside the solenoid, at distance ρ from the system’s
axis,

$$B = \frac{\mu_0 N I}{2\pi\rho}. \tag{5.41}$$

We see that a possible drawback of this system for practical applications is that the internal field does
depend on ρ, i.e. is not quite uniform; however, if the torus is relatively thin, this deficiency is minor.
Next let us discuss a very important question: how can we solve the problems of magnetostatics
for systems whose low symmetry does not allow getting easy results from the Ampère law? (The
examples are of course too numerous to list; for example, we cannot use this approach even to reproduce
Eq. (23) for a round current loop.) From the deep analogy with electrostatics, we may expect that in this
case, we could calculate the magnetic field by solving a certain boundary problem for the field’s
potential – in our current case, the vector potential A defined by Eq. (28). However, despite the
similarity of this formula and Eq. (1.38) for φ, which was noticed above, there is an additional issue we
should tackle in the magnetic case – besides the obvious fact that calculating the vector potential
distribution means determining three scalar functions (say, $A_x$, $A_y$, and $A_z$), rather than just one (φ).
To reveal the issue, let us plug Eq. (27) into Eq. (35):

$$\nabla\times(\nabla\times\mathbf{A}) = \mu_0\,\mathbf{j}, \tag{5.42}$$

and then apply to the left-hand side of this equation the same identity (31). The result is
19 Applying the Ampère law to a circular contour of radius ρ, coaxial with the solenoid, we see that the field
outside (but not inside!) it has an azimuthal component $B_\varphi$, similar to that of the straight wire (see Eq. (38) above)
and hence (at N >> 1) much weaker than the longitudinal field inside the solenoid – see Eq. (40).

$$\nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A} = \mu_0\,\mathbf{j}. \tag{5.43}$$

On the other hand, as we know from electrostatics (please compare Eqs. (1.38) and (1.41)), the vector
potential A(r) given by Eq. (28) has to satisfy a simpler (“vector-Poisson”) equation

$$\nabla^2\mathbf{A} = -\mu_0\,\mathbf{j}, \tag{5.44}$$

which is just a set of three usual Poisson equations for each Cartesian component of A.
To resolve the difference between these results, let us note that Eq. (43) is reduced to Eq. (44) if
∇·A = 0. In this context, let us discuss what discretion we have in the choice of the potential. In
electrostatics, we may add, to the scalar function φ’ that satisfies Eq. (1.33) for the given field E, not
only an arbitrary constant but even an arbitrary function of time:

$$-\nabla\left[\phi' + f(t)\right] = -\nabla\phi' = \mathbf{E}. \tag{5.45}$$

Similarly, using the fact that the curl of the gradient of any scalar function equals zero,20 we may add to
any vector function A’ that satisfies Eq. (27) for the given field B, not only any constant but even a
gradient of an arbitrary scalar function χ(r, t), because

$$\nabla\times(\mathbf{A}' + \nabla\chi) = \nabla\times\mathbf{A}' + \nabla\times(\nabla\chi) = \nabla\times\mathbf{A}' = \mathbf{B}. \tag{5.46}$$

For any choice of such a function A’, we can always choose the function χ in such a way that it satisfies
the Poisson equation ∇²χ = –∇·A’, and hence makes the divergence of the transformed vector potential,
A ≡ A’ + ∇χ, equal to zero everywhere,

$$\nabla\cdot\mathbf{A} = 0 \quad \text{(Coulomb gauge)}, \tag{5.48}$$

thus reducing Eq. (43) to Eq. (44).
To summarize, the set of distributions A’(r) that satisfy Eq. (27) for a given field B(r), is not
limited to the vector potential A(r) given by Eq. (44), but is reduced to it upon the additional Coulomb
gauge condition (48). However, as we will see in a minute, even this condition still leaves some degrees
of freedom in the choice of the vector potential. To illustrate this fact, and also to get a better gut feeling
of the vector potential’s distribution in space, let us calculate A(r) for two very basic cases.
First, let us revisit the straight wire problem shown in Fig. 5. As Eq. (28) shows, in this case the
vector potential A has just one component (along the axis z). Moreover, due to the problem’s axial
symmetry, its magnitude may only depend on the distance ρ from the axis: $\mathbf{A} = \mathbf{n}_z A(\rho)$. Hence, the
gradient of A is directed across the z-axis, so Eq. (48) is satisfied at all points. For our symmetry (∂/∂φ =
∂/∂z = 0), the Laplace operator, written in cylindrical coordinates, has just one term,22 reducing Eq. (44)
to

$$\frac{1}{\rho}\frac{d}{d\rho}\left(\rho\frac{dA}{d\rho}\right) = -\mu_0\,j(\rho). \tag{5.49}$$

Multiplying both sides of this equation by ρ and integrating them over the coordinate once, we get

$$\rho\frac{dA}{d\rho} = -\mu_0\int_0^{\rho} j(\rho')\,\rho'\,d\rho' + \text{const}. \tag{5.50}$$

Since in the cylindrical coordinates, for our symmetry, B = –dA/dρ,23 Eq. (50) is nothing else than our
old result (38) for the magnetic field.24 However, let us continue the integration, at least for the region
outside the wire, where the function A(ρ) depends only on the full current I rather than on the current
distribution. Dividing both parts of Eq. (50) by ρ, and integrating them over this argument again, we get

$$A(\rho) = -\frac{\mu_0 I}{2\pi}\ln\rho + \text{const}, \quad \text{where } I = 2\pi\int_0^{R} j(\rho)\,\rho\,d\rho, \quad \text{for } \rho\ge R. \tag{5.51}$$

As a reminder, we had similar logarithmic behavior for the electrostatic potential outside a uniformly
charged straight line. This is natural because the Poisson equations for both cases are similar.
Now let us find the vector potential for the long solenoid (Fig. 6a), with its uniform magnetic
field. Since Eq. (28) tells us that the vector A should follow the direction of the inducing current, we
may start by looking for it in the form $\mathbf{A} = \mathbf{n}_\varphi A(\rho)$. (This is especially natural if the solenoid’s
cross-section is circular.) With this orientation of A, the same general expression for the curl operator in
cylindrical coordinates yields $\nabla\times\mathbf{A} = \mathbf{n}_z (1/\rho)\,d(\rho A)/d\rho$. According to Eq. (27), this expression should be
equal to B – in our current case to $\mathbf{n}_z B$, with a constant B – see Eq. (40). Integrating this equality, and
selecting such integration constant that A(0) is finite, we get

$$A = \frac{B\rho}{2}, \quad \text{i.e. } \mathbf{A} = \frac{B\rho}{2}\,\mathbf{n}_\varphi. \tag{5.52}$$

Plugging this result into the general expression for the Laplace operator in the cylindrical coordinates,25
we see that the Poisson equation (44) with j = 0 (i.e. the Laplace equation) is satisfied again – which is
natural since, for this distribution, the Coulomb gauge condition (48) is satisfied: ∇·A = 0.
However, Eq. (52) is not the unique (or even the simplest) vector potential that gives the same
uniform field $\mathbf{B} = \mathbf{n}_z B$. Indeed, using the well-known expression for the curl operator in Cartesian
coordinates,26 it is straightforward to check that each of the vector functions $\mathbf{A}' = \mathbf{n}_y Bx$ and $\mathbf{A}'' = -\mathbf{n}_x By$
also has the same curl, and also satisfies the Coulomb gauge condition (48).27 If such solutions do not
look very natural because of their anisotropy in the [x, y] plane, please consider the fact that they
represent the uniform magnetic field regardless of its source – for example, regardless of the shape of
the long solenoid’s cross-section. Such choices of the vector potential may be very convenient for some
problems, for example for the quantum-mechanical analysis of the 2D motion of a charged particle in
the perpendicular magnetic field, giving the famous Landau energy levels.28
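The equivalence of these three vector potentials is easy to confirm numerically. The sketch below, with illustrative names and an arbitrary field value, uses central differences (exact for these linear functions, up to rounding) to evaluate the z-component of the curl and the in-plane divergence of each candidate.

```python
B = 0.5  # uniform field magnitude, an arbitrary illustrative value

def a_symmetric(x, y):
    """Eq. (5.52) in Cartesian form: (B rho / 2) n_phi = (B / 2) * (-y, x)."""
    return (-0.5 * B * y, 0.5 * B * x)

def a_prime(x, y):
    """A' = n_y B x."""
    return (0.0, B * x)

def a_double_prime(x, y):
    """A'' = -n_x B y."""
    return (-B * y, 0.0)

def curl_z(a, x, y, h=1e-4):
    """z-component of curl A by central differences; a(x, y) returns (Ax, Ay)."""
    d_ay_dx = (a(x + h, y)[1] - a(x - h, y)[1]) / (2.0 * h)
    d_ax_dy = (a(x, y + h)[0] - a(x, y - h)[0]) / (2.0 * h)
    return d_ay_dx - d_ax_dy

def divergence(a, x, y, h=1e-4):
    """In-plane divergence of A by central differences."""
    d_ax_dx = (a(x + h, y)[0] - a(x - h, y)[0]) / (2.0 * h)
    d_ay_dy = (a(x, y + h)[1] - a(x, y - h)[1]) / (2.0 * h)
    return d_ax_dx + d_ay_dy
```

At any point, all three functions give `curl_z` equal to B and zero `divergence`. They differ only by gradients of scalar functions; for instance, A’ minus the symmetric form equals the gradient of Bxy/2.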
But this is not the unique answer! Indeed, Eq. (53) describes the proper potential energy of the
system (in particular, giving the correct result for the current interaction forces) only in the case when
the interacting currents are fixed – just as Eq. (1.59) is adequate when the interacting charges are fixed.
Here comes a substantial difference between electrostatics and magnetostatics: due to the fundamental
fact of electric charge conservation (already discussed in Secs. 1.1 and 4.1), keeping electric charges
fixed does not require external work, while the maintenance of currents generally does. As a result, Eq.
(53) describes the energy of the magnetic interaction plus of the system keeping the currents constant –
or rather of its part depending on the system under our consideration. In this situation, using the
terminology already used in Sec. 3.5 (see also a general discussion in CM Sec. 1.4.), $U_j$ may be called
the Gibbs potential energy of our magnetic system.
Now to exclude from $U_j$ the contribution due to the interaction with the current-supporting
system(s), i.e. calculate the potential energy U of our system as such, we need to know this contribution.
The simplest way to do this is to use the Faraday induction law that describes this interaction and will
be discussed at the beginning of the next chapter. This is why let me postpone the derivation until that
point, and for now, ask the reader to believe me that the removal of the interaction leads to an
expression similar to Eq. (53), but with the opposite sign:

$$U = \frac{\mu_0}{4\pi}\,\frac{1}{2}\int d^3r\int d^3r'\,\frac{\mathbf{j}(\mathbf{r})\cdot\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}. \tag{5.54}$$

I will prove this result in Sec. 6.2, but actually, this sign dichotomy should not be quite surprising to the
attentive reader, in the context of a similar duality of Eqs. (3.73) and (3.81) for the electrostatic energies
including and excluding the interaction with the field source.
Due to the importance of Eq. (54), let us rewrite it in several other forms, convenient for various
applications. First of all, just as in electrostatics, it may be recast into a potential-based form. Indeed,
with the definition (28) of the vector potential A(r), Eq. (54) becomes

$$U = \frac{1}{2}\int \mathbf{j}(\mathbf{r})\cdot\mathbf{A}(\mathbf{r})\,d^3r. \tag{5.55}$$

This formula, which is a clear magnetic analog of Eq. (1.60) of electrostatics, is very popular among
field theorists, because it is very handy for their manipulations; it is also useful for some practical
applications. However, for many calculations, it is more convenient to have a direct expression of the
energy via the magnetic field. Again, this may be done very similarly to what had been done for
electrostatics in Sec. 1.3, i.e. by plugging into Eq. (55) the current density expressed from Eq. (35) and
then transforming it as30

$$U = \frac{1}{2}\int \mathbf{j}\cdot\mathbf{A}\,d^3r = \frac{1}{2\mu_0}\int \mathbf{A}\cdot(\nabla\times\mathbf{B})\,d^3r = \frac{1}{2\mu_0}\int \mathbf{B}\cdot(\nabla\times\mathbf{A})\,d^3r - \frac{1}{2\mu_0}\int \nabla\cdot(\mathbf{A}\times\mathbf{B})\,d^3r. \tag{5.56}$$

Now using the divergence theorem, the second integral may be transformed into a surface integral of
$(\mathbf{A}\times\mathbf{B})_n$. According to Eqs. (27)-(28), if the current distribution j(r) is localized, this vector product
drops, at large distances, faster than 1/r², so if the integration volume is large enough, the surface
integral is negligible. In the remaining first integral in Eq. (56), we may use Eq. (27) to rewrite ∇×A as
B. As a result, we get a very simple and fundamental formula:

$$U = \frac{1}{2\mu_0}\int B^2\,d^3r. \tag{5.57a}$$

Just as with the electric field, this expression may be interpreted as a volume integral of the magnetic
energy density u:

$$U = \int u(\mathbf{r})\,d^3r, \quad \text{with } u(\mathbf{r}) \equiv \frac{1}{2\mu_0}B^2(\mathbf{r}), \tag{5.57b}$$

clearly similar to Eq. (1.65).31 Again, the conceptual choice between the spatial localization of magnetic
energy – either at the location of electric currents only, as implied by Eqs. (54) and (55), or in all regions
where the magnetic field exists, as apparent from Eq. (57b), cannot be done within the framework of
magnetostatics, and only the electrodynamics gives a decisive preference for the latter choice.
For the practically important case of currents flowing in several thin wires, Eq. (54) may be first
integrated over the cross-section of each wire, just as was done at the derivation of Eq. (4). As before,
since the integral of the current density over the kth wire's cross-section is just the current Ik in the wire,
and cannot change along its length, it may be taken from the remaining integrals, giving
30 For that, we may use MA Eq. (11.7) with f = A and g = B, giving A·(∇×B) = B·(∇×A) – ∇·(A×B).
31 The transfer to the Gaussian units in Eqs. (57) may be accomplished by the usual replacement μ0 → 4π, thus
giving, in particular, u = B²/8π.
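
$$U = \frac{\mu_0}{4\pi}\,\frac{1}{2}\sum_{k,k'} I_k I_{k'}\oint_{l_k}\oint_{l_{k'}}\frac{d\mathbf{r}_k\cdot d\mathbf{r}_{k'}}{|\mathbf{r}_k-\mathbf{r}_{k'}|}, \tag{5.58}$$

where $l_k$ is the full length of the kth wire loop. Note that Eq. (58) is valid if all currents $I_k$ are independent
of each other, because the double sum counts each current pair twice, compensating the coefficient ½ in
front of the sum. It is useful to decompose this relation as

$$U = \frac{1}{2}\sum_{k,k'} I_k I_{k'} L_{kk'}, \tag{5.59}$$

where the coefficients $L_{kk'}$ are independent of the currents:

$$L_{kk'} \equiv \frac{\mu_0}{4\pi}\oint_{l_k}\oint_{l_{k'}}\frac{d\mathbf{r}_k\cdot d\mathbf{r}_{k'}}{|\mathbf{r}_k-\mathbf{r}_{k'}|}. \tag{5.60}$$
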
The coefficient $L_{kk'}$ with k ≠ k’ is called the mutual inductance between the kth and k’th current
loops, while the diagonal coefficient $L_k \equiv L_{kk}$ is called the self-inductance (or just inductance) of the kth
loop.32 From the symmetry of Eq. (60) with respect to the index swap, k ↔ k’, it is evident that the
matrix of coefficients $L_{kk'}$ is symmetric:33

$$L_{kk'} = L_{k'k}, \tag{5.61}$$

so for the practically most important case of two interacting currents $I_1$ and $I_2$, Eq. (59) reads

$$U = \frac{1}{2}L_1 I_1^2 + M I_1 I_2 + \frac{1}{2}L_2 I_2^2, \tag{5.62}$$

where $M \equiv L_{12} = L_{21}$ is the mutual inductance coefficient.
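The double contour integral (μ0/4π)∮∮ dr_k·dr_k’/|r_k – r_k’| that multiplies the current product in Eq. (58) can be evaluated numerically for simple geometries. The following sketch does this for two coaxial circular loops; the function name, the geometry, and the midpoint discretization are assumptions chosen for illustration.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in SI units (assumed value)

def mutual_inductance_coaxial(a, b, h, n=200):
    """Double contour integral for two coaxial circular loops (radii a, b; axial separation h)."""
    dphi = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        p1 = (i + 0.5) * dphi
        for k in range(n):
            p2 = (k + 0.5) * dphi
            c = math.cos(p1 - p2)
            # dr_1 . dr_2 = a b cos(p1 - p2) dphi^2 for two circles in parallel planes
            dot = a * b * c * dphi * dphi
            dist = math.sqrt(a * a + b * b - 2.0 * a * b * c + h * h)
            total += dot / dist
    return MU0 / (4.0 * math.pi) * total
```

Two limits offer a sanity check. For a small loop at the center of a large one, the result approaches μ0πb²/2a, i.e. the central field (24) per unit current times the small loop’s area. For loops far apart along the axis, it approaches μ0πa²b²/2h³, in agreement with the dipole field (25). The result is also symmetric with respect to swapping a and b.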
These formulas clearly show the importance of the self- and mutual inductances, so I will
demonstrate their calculation for at least a few basic geometries. Before doing that, however, let me
recast Eq. (58) into one more form that may facilitate such calculations. Namely, let us notice that for
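the magnetic field induced by current $I_k$ in a thin wire, Eq. (28) is reduced to

$$\mathbf{A}_k(\mathbf{r}) = \frac{\mu_0}{4\pi}\,I_k\oint_{l_k}\frac{d\mathbf{r}_k}{|\mathbf{r}-\mathbf{r}_k|}, \tag{5.63}$$

so Eq. (58) may be rewritten as

$$U = \frac{1}{2}\sum_{k,k'} I_k\oint_{l_k}\mathbf{A}_{k'}(\mathbf{r}_k)\cdot d\mathbf{r}_k. \tag{5.64}$$
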
But according to the same Stokes theorem that was used earlier in this chapter to derive the Ampère law,
and Eq. (27), the integral in Eq. (64) is nothing else than the magnetic field’s flux (more frequently
called just the magnetic flux) through a surface S limited by the contour l :
32 As evident from Eq. (60), these coefficients depend only on the geometry of the system. Moreover, in the
Gaussian units, in which Eq. (60) is valid without the factor μ0/4π, the inductance coefficients have the dimension
of length (centimeters). The SI unit of inductance is called the henry, abbreviated H – after Joseph Henry, who in
particular discovered the effect of electromagnetic induction (see Sec. 6.1) independently of Michael Faraday.
33 Note that the matrix of the mutual inductances $L_{jj'}$ is very similar to the matrix of reciprocal capacitance
coefficients $p_{kk'}$ – for example, compare Eq. (62) with Eq. (2.21).

$$\oint_l \mathbf{A}(\mathbf{r})\cdot d\mathbf{r} = \int_S (\nabla\times\mathbf{A})_n\,d^2r = \int_S B_n\,d^2r \equiv \Phi \tag{5.65}$$

– in the particular case of Eq. (64), the flux $\Phi_{kk'}$ of the field induced by the k’th current through the loop
of the kth current.34 As a result, Eq. (64) may be rewritten as

$$U = \frac{1}{2}\sum_{k,k'} I_k\,\Phi_{kk'}. \tag{5.66}$$

Comparing this expression with Eq. (59), we see that

$$\Phi_{kk'} = L_{kk'}\,I_{k'}. \tag{5.67}$$

This expression not only gives us one more means for calculating the coefficients $L_{kk'}$, but also
shows their physical sense: the mutual inductance characterizes what part of the magnetic flux
(colloquially, “what fraction of field lines”) induced by the current $I_{k'}$ pierces the kth loop’s area $S_k$ – see
Fig. 7.
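
[Fig. 5.7: the magnetic flux $\Phi_{kk'}$ of the field $\mathbf{B}_{k'}$, induced by the current $I_{k'}$, piercing the area $S_k$ of the kth loop.]
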
Due to the linear superposition principle, the total flux piercing the kth loop may be represented as

$$\Phi_k \equiv \sum_{k'}\Phi_{kk'} = \sum_{k'} L_{kk'}\,I_{k'}. \tag{5.68}$$

For example, for the system of two currents, this expression is reduced to a clear analog of Eqs. (2.19):

$$\begin{aligned}\Phi_1 &= L_1 I_1 + M I_2,\\ \Phi_2 &= M I_1 + L_2 I_2.\end{aligned} \tag{5.69}$$

For the even simpler case of a single current,

$$\Phi = LI, \tag{5.70}$$

so the magnetic energy of the current may be represented in several equivalent forms:
34 The SI unit of magnetic flux is called weber, abbreviated Wb – after Wilhelm Edward Weber (1804-1891), who
in particular co-invented (with Carl Gauss) the electromagnetic telegraph. More importantly for this course, in
1856 he was the first (together with Rudolf Kohlrausch) to notice that the value of (in modern terms) $1/(\varepsilon_0\mu_0)^{1/2}$,
derived from electrostatic and magnetostatic measurements, coincides with the independently measured speed of
light c. This observation gave an important motivation for Maxwell’s theory.

$$U = \frac{L}{2}I^2 = \frac{1}{2}I\Phi = \frac{1}{2L}\Phi^2. \tag{5.71}$$

These relations, similar to Eqs. (2.14)-(2.15) of electrostatics, show that the self-inductance L of a
current loop may be considered a measure of the system’s magnetic energy. However, as we will see in
Sec. 6.1, this measure is adequate only if the flux Φ, rather than the current I, is fixed.
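For several coupled loops, the flux and energy relations are just matrix operations on the inductance coefficients. A minimal sketch follows; the function names and the sample inductance values are hypothetical.

```python
def fluxes(L, I):
    """Total flux through each loop: Phi_k = sum over k' of L[k][k'] * I[k']."""
    n = len(I)
    return [sum(L[k][kp] * I[kp] for kp in range(n)) for k in range(n)]

def energy(L, I):
    """Magnetic energy U = (1/2) * sum over k of I_k * Phi_k."""
    return 0.5 * sum(i_k * phi_k for i_k, phi_k in zip(I, fluxes(L, I)))

# Hypothetical two-loop system: L1 = 2 mH, L2 = 5 mH, M = 1 mH.
L_matrix = [[2e-3, 1e-3],
            [1e-3, 5e-3]]
currents = [3.0, -2.0]  # amperes
```

With these numbers, `fluxes` returns 4 mWb and –7 mWb, and `energy` returns 13 mJ, the same value that the two-current form (1/2)L1·I1² + M·I1·I2 + (1/2)L2·I2² gives directly.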
Now we are well equipped for the calculation of inductance coefficients for particular systems,
having three options. The first one is to use Eq. (60) directly.35 The second one is to calculate the
magnetic field energy from Eq. (57) as the function of all currents Ik in the system, and then use Eq. (59)
to find all coefficients $L_{kk'}$. For example, for a system with just one current, Eq. (71) yields

$$L = \frac{U}{I^2/2}. \tag{5.72}$$

Finally, if the system consists of thin wires, so the loop areas $S_k$ and hence the fluxes $\Phi_{kk'}$ are well
defined, we may calculate them from Eq. (65), and then use Eq. (67) to find the inductances.
Usually, the third option is simpler, but the first two may be very useful even for thin-wire
systems, especially if the notion of magnetic flux in them is not quite apparent. As an important
example, let us find the self-inductance of a long solenoid – see Fig. 6a again. We have already
calculated the magnetic field inside it – see Eq. (40) – so, due to the field uniformity, the magnetic flux
piercing each turn of the wire is just

$$\Phi_1 = BA = \mu_0 n I A, \tag{5.73}$$

where A is the area of the solenoid’s cross-section – for example πR² for a round solenoid, though Eq.
(40), and hence Eq. (73) are valid for cross-sections of any shape. Comparing Eqs. (73) with Eq. (70),
one might wrongly conclude that L = Φ1/I = μ0nA (WRONG!), i.e. that the solenoid’s inductance is
independent of its length. Actually, the magnetic flux Φ1 pierces each wire turn, so the total flux through
the whole current loop, consisting of N turns, is

$$\Phi = N\Phi_1 = \mu_0 n^2 l A I, \tag{5.74}$$

so, according to Eq. (70), $L = \Phi/I = \mu_0 n^2 l A$, i.e. the inductance is proportional to the solenoid’s length.
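Dividing the total flux (74) by the current, as Eq. (70) prescribes, gives L = μ0n²lA. The sketch below compares this flux-based value with the one that follows from the field energy, Eqs. (57) and (72), assuming the uniform internal field (40) and a negligible external field; the function names and sample numbers are illustrative.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in SI units (assumed value)

def solenoid_inductance_from_flux(n_per_length, length, area):
    """L = Phi / I, with the total flux Phi = N * Phi_1 = mu0 n^2 l A I."""
    return MU0 * n_per_length**2 * length * area

def solenoid_inductance_from_energy(n_per_length, length, area, current=1.0):
    """L = U / (I^2 / 2), with U = (B^2 / 2 mu0) * A * l and B = mu0 n I."""
    b = MU0 * n_per_length * current
    u = b * b / (2.0 * MU0) * area * length
    return u / (current**2 / 2.0)
```

For n = 1000 turns per meter, l = 0.5 m and A = 1 cm², both routes give about 63 μH.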
35Numerous applications of that Neumann formula (derived in 1845 by F. Neumann) to electrical engineering
problems may be found, for example, in the classical text by F. Grover, Inductance Calculations, Dover, 1946.
This energy-based approach becomes virtually inevitable for continuously distributed currents.
As an example, let us calculate the self-inductance L of a long coaxial cable with the cross-section
shown in Fig. 8, 36 and the full current in the outer conductor equal and opposite to that (I) in the inner
conductor.
Fig. 5.8. The cross-section of a coaxial cable.
Let us assume that the current is uniformly distributed over the cross-sections of both
conductors. (As we know from the previous chapter, this is indeed the case if both the internal and
external conductors are made of a uniform resistive material.) First, we should calculate the radial
distribution of the magnetic field – which has only one, azimuthal component because of the axial
symmetry of the problem. This distribution may be immediately found by applying the Ampère law (37)
to circular contours of radii within four different ranges:
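$$2\pi\rho B = \mu_0 I\big|_{\text{piercing the contour}} = \mu_0 I \times \begin{cases} \rho^2/a^2, & \text{for } \rho < a,\\ 1, & \text{for } a < \rho < b,\\ (c^2-\rho^2)/(c^2-b^2), & \text{for } b < \rho < c,\\ 0, & \text{for } c < \rho. \end{cases} \tag{5.77}$$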
Now, an easy integration yields the magnetic energy per unit length of the cable:
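$$\frac{U}{l} = \frac{1}{2\mu_0}\int B^2\,d^2r = \frac{1}{2\mu_0}\,2\pi\int_0^\infty B^2\rho\,d\rho = \frac{\mu_0 I^2}{4\pi}\left[\int_0^a\left(\frac{\rho}{a^2}\right)^2\rho\,d\rho + \int_a^b\left(\frac{1}{\rho}\right)^2\rho\,d\rho + \int_b^c\left(\frac{c^2-\rho^2}{\rho\,(c^2-b^2)}\right)^2\rho\,d\rho\right]$$
$$= \frac{\mu_0}{2\pi}\left[\ln\frac{b}{a} + \frac{c^2}{c^2-b^2}\left(\frac{c^2}{c^2-b^2}\ln\frac{c}{b} - \frac{1}{2}\right)\right]\frac{I^2}{2}. \tag{5.78}$$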
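Comparing this result with Eq. (72), we get the inductance per unit length:
$$\frac{L}{l} = \frac{\mu_0}{2\pi}\left[\ln\frac{b}{a} + \frac{c^2}{c^2-b^2}\left(\frac{c^2}{c^2-b^2}\ln\frac{c}{b} - \frac{1}{2}\right)\right]. \tag{5.79}$$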
Note that for the particular case of a thin outer conductor, c – b << b, this expression reduces to
$$\frac{L}{l} = \frac{\mu_0}{2\pi}\left(\ln\frac{b}{a} + \frac{1}{4}\right), \tag{5.80}$$
where the first term in the parentheses is due to the contribution of the magnetic field energy in the free
space between the conductors. This distinction is important for some applications because in
36 As a reminder, the mutual capacitance C between the conductors of such a system was calculated in Sec. 2.3.
superconductor cables, as well as the normal-metal cables at high frequencies (to be discussed in the
next chapter), the field does not penetrate the conductor’s bulk, so Eq. (80) is valid without the last term
¼ in the parentheses – for any b < c.
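The integration leading to Eq. (79) is easy to check numerically. The sketch below is only an illustration under stated assumptions: the radii are arbitrary (in arbitrary length units), and the energy integral is evaluated with a plain midpoint rule. It integrates the squared field distribution (77) and compares the result with the square bracket of Eq. (79), and then with the thin-outer-conductor limit (80).

```python
import math

def enclosed_fraction(rho, a, b, c):
    """Fraction of the current I piercing a circle of radius rho, per Eq. (5.77)."""
    if rho < a:
        return rho**2 / a**2
    if rho < b:
        return 1.0
    if rho < c:
        return (c**2 - rho**2) / (c**2 - b**2)
    return 0.0

def bracket_numeric(a, b, c, steps=200_000):
    """Midpoint-rule integral of f(rho)^2 / rho over 0 < rho < c.

    Since B = mu0 I f / (2 pi rho), this integral is the square bracket in
    L/l = (mu0 / 2 pi) * [...].
    """
    h = c / steps
    total = 0.0
    for i in range(steps):
        rho = (i + 0.5) * h
        total += enclosed_fraction(rho, a, b, c) ** 2 / rho * h
    return total

def bracket_closed(a, b, c):
    """Square bracket of Eq. (5.79) in closed form."""
    k = c**2 / (c**2 - b**2)
    return math.log(b / a) + k * (k * math.log(c / b) - 0.5)

a, b, c = 1.0, 3.0, 3.5              # hypothetical radii, arbitrary units
numeric = bracket_numeric(a, b, c)
closed = bracket_closed(a, b, c)
thin = bracket_closed(a, b, 1.001 * b)   # thin outer conductor, c - b << b
thin_limit = math.log(b / a) + 0.25      # bracket of Eq. (5.80)

print(numeric, closed)
print(thin, thin_limit)
```

The first pair of numbers should agree to many digits, while the second pair differs only by a small correction of the order of (c – b)/b.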
As the last example, let us calculate the mutual inductance between a long straight wire and a
round wire loop adjacent to it (Fig. 9), neglecting the thickness of both wires.
Here there is no problem with using the last approach discussed above, based on the direct
calculation of the magnetic flux. Indeed, as was discussed in Sec. 1, the field B1 induced by the current
I1 at any point of the round loop is normal to its plane – e.g., to the plane of the drawing of Fig. 9. In the
Cartesian coordinates shown in that figure, Eq. (20) reads $B_1 = \mu_0 I_1/2\pi y$, giving the following magnetic
flux through the loop:
$$\Phi_{21} = \frac{\mu_0 I_1}{2\pi}\int_{-R}^{+R}dx\int_{R-(R^2-x^2)^{1/2}}^{R+(R^2-x^2)^{1/2}}\frac{dy}{y} = \frac{\mu_0 I_1}{\pi}\int_0^R \ln\frac{R+(R^2-x^2)^{1/2}}{R-(R^2-x^2)^{1/2}}\,dx = \frac{\mu_0 I_1 R}{\pi}\int_0^1 \ln\frac{1+(1-\xi^2)^{1/2}}{1-(1-\xi^2)^{1/2}}\,d\xi. \tag{5.81}$$
This is a table integral equal to $\pi$,37 so $\Phi_{21} = \mu_0 I_1 R$, and the final answer for the mutual inductance $M \equiv
L_{12} = L_{21} = \Phi_{21}/I_1$ is finite (and very simple):
$$M = \mu_0 R, \tag{5.82}$$
despite the magnetic field's divergence at the lowest point of the loop (y = 0).
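The value of the dimensionless integral in Eq. (81) may be confirmed numerically despite the logarithmic singularity of its integrand at ξ = 0. The following Python sketch is just such a check, using a midpoint rule; the loop radius is a hypothetical value, and the integrand is rewritten in an algebraically equivalent form to avoid the loss of precision at small ξ.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability, H/m

def flux_integral(steps=200_000):
    """Midpoint-rule estimate of the dimensionless integral in Eq. (5.81)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        xi = (i + 0.5) * h
        s = math.sqrt(1.0 - xi * xi)
        # ln[(1 + s)/(1 - s)] = 2 ln[(1 + s)/xi], because 1 - s^2 = xi^2;
        # the second form avoids the cancellation in 1 - s at small xi.
        total += 2.0 * math.log((1.0 + s) / xi) * h
    return total

integral = flux_integral()
R = 0.05                                   # hypothetical loop radius, m
M_numeric = MU0 * R * integral / math.pi   # mutual inductance from Eq. (5.81)
M_formula = MU0 * R                        # Eq. (5.82)

print(integral, math.pi)
print(M_numeric, M_formula)
```

The midpoint rule never samples the singular point ξ = 0 itself, so the (integrable) divergence of the field at the lowest point of the loop causes no trouble here.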
Note that in contrast with the finite mutual inductance of this system, the self-inductances of both
its wires are formally infinite in the thin-wire limit – see, e.g., Eq. (80), which, in the limit b/a >> 1,
describes a thin straight wire. However, since this divergence is very weak (logarithmic), it is quenched
by any deviation from this perfectly axial geometry. For example, a fair estimate of the inductance of a
wire of a large but finite length l >> a may be obtained from Eq. (80) by the replacement of b with l:
$$L \approx \frac{\mu_0 l}{2\pi}\ln\frac{l}{a}. \tag{5.83}$$
(Note, however, that the exact result depends on where from/to the current flows beyond that segment.)
It turns out that a similar approximate result, with l replaced with $2\pi R$ in the front factor, and with R
under the logarithm, is valid for the self-inductance of a round loop with a << R. (A proof of this fact is
a very useful exercise, highly recommended to the reader.)
Fig. 5.10. Calculating the magnetic field of localized currents at a distant point (r >> a).
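Applying the truncated Taylor expansion (3.5) of the fraction $1/|\mathbf{r} - \mathbf{r}'|$ to the vector potential
given by Eq. (28), we get
$$\mathbf{A}(\mathbf{r}) \approx \frac{\mu_0}{4\pi}\left[\frac{1}{r}\int_V \mathbf{j}(\mathbf{r}')\,d^3r' + \frac{1}{r^3}\int_V (\mathbf{r}\cdot\mathbf{r}')\,\mathbf{j}(\mathbf{r}')\,d^3r'\right]. \tag{5.84}$$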
Now, due to the vector character of this potential, we have to depart somewhat from the approach of
Sec. 3.1 and use the following vector algebra identity:38
$$\int_V \left[f\,(\mathbf{j}\cdot\nabla g) + g\,(\mathbf{j}\cdot\nabla f)\right]d^3r = 0, \tag{5.85}$$
that is valid for any pair of smooth (differentiable) scalar functions f(r) and g(r), and any vector function
j(r) that, as the dc current density, satisfies the continuity condition $\nabla\cdot\mathbf{j} = 0$ and whose normal
component vanishes on the surface of the volume V. First, let us use Eq. (85) with f equal to 1, and g
equal to any Cartesian component of the radius-vector r: $g = r_l$ (l = 1, 2, 3). Then it yields
$$\int_V (\mathbf{j}\cdot\mathbf{n}_l)\,d^3r \equiv \int_V j_l\,d^3r = 0, \tag{5.86}$$
showing that the first term on the right-hand side of Eq. (84) equals zero. Next, let us use Eq. (85) again,
but now with $f = r_l$, $g = r_{l'}$ (where l, l’ = 1, 2, 3); then it yields
$$\int_V (r_l j_{l'} + r_{l'} j_l)\,d^3r = 0, \tag{5.88}$$
so the lth Cartesian component of the second integral in Eq. (84) may be transformed as
38 See, e.g., MA Eq. (12.3) with the additional condition $j_n|_S = 0$, pertinent to space-restricted currents.
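$$\int_V (\mathbf{r}\cdot\mathbf{r}')\,j_l\,d^3r' = \sum_{l'=1}^{3} r_{l'}\int_V r'_{l'}\,j_l\,d^3r' = \frac{1}{2}\sum_{l'=1}^{3} r_{l'}\int_V (r'_{l'}\,j_l + r'_{l'}\,j_l)\,d^3r'$$
$$= \frac{1}{2}\sum_{l'=1}^{3} r_{l'}\int_V (r'_{l'}\,j_l - r'_l\,j_{l'})\,d^3r' = -\frac{1}{2}\left[\mathbf{r}\times\int_V (\mathbf{r}'\times\mathbf{j})\,d^3r'\right]_l. \tag{5.89}$$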
As a result, Eq. (84) may be rewritten as
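$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi}\frac{\mathbf{m}\times\mathbf{r}}{r^3}, \tag{5.90}$$
where the vector m, defined as39
$$\mathbf{m} \equiv \frac{1}{2}\int_V \mathbf{r}\times\mathbf{j}(\mathbf{r})\,d^3r, \tag{5.91}$$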
is called the magnetic dipole moment of a field source – which itself, within the long-range
approximation (90), is called the magnetic dipole.
Note a close analogy between the m defined by Eq. (91), and the orbital40 angular momentum of
a non-relativistic particle with mass mk:
$$\mathbf{L}_k \equiv \mathbf{r}_k\times\mathbf{p}_k = \mathbf{r}_k\times m_k\mathbf{v}_k, \tag{5.92}$$
where pk = mkvk is its linear momentum. Indeed, for a continuum of such particles with equal electric
charges q, distributed with spatial density n, we have j = qnv, and Eq. (91) yields
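$$\mathbf{m} = \frac{1}{2}\int_V \mathbf{r}\times\mathbf{j}\,d^3r = \frac{nq}{2}\int_V \mathbf{r}\times\mathbf{v}\,d^3r, \tag{5.93}$$
while the total angular momentum of such a system of particles of equal masses $m_0$, is
$$\mathbf{L} = \int_V n m_0\,\mathbf{r}\times\mathbf{v}\,d^3r, \tag{5.94}$$
so we get a very straightforward relation
$$\mathbf{m} = \frac{q}{2m_0}\mathbf{L}. \tag{5.95}$$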
For the orbital motion, this classical relation survives in quantum mechanics for linear operators,
and hence for eigenvalues of the observables. Since the orbital angular momentum is quantized in the
units of the Planck constant $\hbar$, the orbital magnetic moment of an electron is always a multiple of the
so-called Bohr magneton
$$\mu_\mathrm{B} \equiv \frac{e\hbar}{2m_e}, \tag{5.96}$$
where $m_e$ is the free electron mass.41 However, for particles with spin, such a universal relation between
the vectors m and L is no longer valid. For example, the electron’s spin s = ½ gives a contribution of $\hbar/2$
to its mechanical angular momentum, but a contribution very close to $\mu_\mathrm{B}$ to its magnetic moment.
39 In the Gaussian units, the definition (91) is kept valid “as is”, so Eq. (90) is stripped of the factor $\mu_0/4\pi$.
40 This adjective is used, especially in quantum mechanics, to distinguish the motion of a particle as a whole (not
necessarily along a closed orbit!) from its intrinsic angular momentum, the spin; see, e.g., QM Chapters 3-6.
41 In the SI units, $m_e \approx 0.91\times10^{-30}$ kg, so $\mu_\mathrm{B} \approx 0.93\times10^{-23}$ J/T.
The next important example of a magnetic dipole is a planar thin-wire loop, limiting area A (of
arbitrary shape), and carrying current I, for which m has a surprisingly simple form,
$$\mathbf{m} = I\mathbf{A}, \tag{5.97}$$
where the modulus of the vector A equals the loop’s area A, and its direction is normal to the loop’s
plane. This formula may be readily proved by noticing that if we select the coordinate frame origin on
the plane of the loop (Fig. 11), then the elementary component of the magnitude of the integral (91),
$$m = \left|\frac{1}{2}\oint_C \mathbf{r}\times I\,d\mathbf{r}\right| = I\oint_C \frac{1}{2}\left|\mathbf{r}\times d\mathbf{r}\right| = I\oint_C \frac{1}{2}r^2\,d\varphi, \tag{5.98}$$
is just the elementary area $dA = (1/2)\,r\,(r\,d\varphi) = r^2 d\varphi/2$, the equality already used in CM Eq. (3.40).
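Eq. (97) may be illustrated numerically for a loop of a non-circular shape. The sketch below is only an example under stated assumptions: the loop is an ellipse approximated by a polygon, and its semi-axes, the current, and the origin offset are hypothetical values. It evaluates the contour integral (98) segment by segment and compares it with the product IA, for two different positions of the origin in the loop's plane.

```python
import math

def loop_moment_z(points, current):
    """z-component of m = (I/2) * closed integral of r x dr for a planar polygonal loop."""
    total = 0.0
    count = len(points)
    for k in range(count):
        x0, y0 = points[k]
        x1, y1 = points[(k + 1) % count]
        # For a straight segment from r0 to r1, the integral of (r x dr)_z is (r0 x r1)_z.
        total += 0.5 * current * (x0 * y1 - x1 * y0)
    return total

def ellipse(a, b, x_c, y_c, count=2000):
    """Counter-clockwise polygonal approximation of an ellipse centered at (x_c, y_c)."""
    return [(x_c + a * math.cos(2 * math.pi * k / count),
             y_c + b * math.sin(2 * math.pi * k / count)) for k in range(count)]

current = 2.0                 # hypothetical current, A
a, b = 0.03, 0.01             # hypothetical semi-axes, m
m_centered = loop_moment_z(ellipse(a, b, 0.0, 0.0), current)
m_shifted = loop_moment_z(ellipse(a, b, 0.5, -0.2), current)  # origin far outside the loop
m_expected = current * math.pi * a * b                        # I times the ellipse area

print(m_centered, m_shifted, m_expected)
```

The two computed values coincide, reflecting the independence of the result of the origin's position, and they approach IA as the number of polygon vertices grows.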
The comparison of Eqs. (96) and (97) allows a useful estimate of atomic currents, by finding
what current I should flow in a circular loop of the atomic size scale (the Bohr radius) $r_\mathrm{B} \sim 0.5\times10^{-10}$ m,
i.e. of area $A \sim 10^{-20}$ m$^2$, to produce a magnetic moment of the order of $\mu_\mathrm{B}$.42 The result is surprisingly
macroscopic: I ~ 1 mA – quite comparable to the current driving the sound in your phone’s earbuds.
Though due to the quantum-mechanical spread of electron wavefunctions, this estimate should not be
taken too literally, it is very useful for getting a gut feeling of how significant the atomic magnetism is,
and hence why ferromagnets may provide such strong magnetic fields.
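The arithmetic behind this estimate takes just a few lines. The following Python sketch assumes rounded values of the fundamental constants, and takes the loop area as that of a circle of radius 0.5×10^-10 m (the scale quoted above), so it is an order-of-magnitude illustration rather than a precise calculation.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
HBAR = 1.054571817e-34       # reduced Planck constant, J*s
M_E = 9.1093837e-31          # electron mass, kg

mu_B = E_CHARGE * HBAR / (2 * M_E)   # Bohr magneton, Eq. (5.96), J/T
r_B = 0.5e-10                        # atomic size scale used in the text, m
area = math.pi * r_B**2              # area of a circular loop of that radius, m^2
current = mu_B / area                # from m = I*A, Eq. (5.97), A

print(f"mu_B = {mu_B:.3e} J/T, area = {area:.2e} m^2, I = {current * 1e3:.2f} mA")
```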
After these illustrations, let us return to the discussion of the general Eq. (90). Plugging it into
(also general) Eq. (27), we may calculate the magnetic field of a magnetic dipole: 43
42 Another way to arrive at the same estimate is to take $I \sim ef = e\omega/2\pi$ with $\omega \sim 10^{16}\ \mathrm{s}^{-1}$ being the typical
frequency of radiation due to atomic interlevel quantum transitions.
43 Similarly to the situation with the electric dipoles (see Eq. (3.24) and its discussion), it may be shown that the
magnetic field of any closed current loop (or any system of such loops) satisfies the following equality:
$$\int_V \mathbf{B}(\mathbf{r})\,d^3r = \frac{2}{3}\mu_0\mathbf{m},$$
where the integral is over any sphere confining all the currents. On the other hand, as we know from Sec. 3.1, for
a field with the structure (99), derived from the long-range approximation (90), such an integral vanishes. As a
result, to get a coarse-grain description of the magnetic field of a small system located at r = 0, that would give the
correct average value of the magnetic field, Eq. (99) should be modified as follows:
$$\mathbf{B}_\text{cg}(\mathbf{r}) = \frac{\mu_0}{4\pi}\left[\frac{3\mathbf{r}(\mathbf{r}\cdot\mathbf{m}) - \mathbf{m}r^2}{r^5} + \frac{8\pi}{3}\mathbf{m}\,\delta(\mathbf{r})\right],$$
in a conceptual (though not quantitative) similarity to Eq. (3.25).
$$\mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi}\frac{3\mathbf{r}(\mathbf{r}\cdot\mathbf{m}) - \mathbf{m}r^2}{r^5}. \tag{5.99}$$
The structure of this formula exactly replicates that of Eq. (3.13) for the electric dipole field (including
the sign). Because of this similarity, the energy of a dipole of a fixed magnitude m in an external field,
and hence the torque and the force exerted on it by a fixed external field, are given by expressions fully
similar to those for an electric dipole (see Eqs. (3.15)-(3.19)):44
$$U = -\mathbf{m}\cdot\mathbf{B}_\text{ext}, \tag{5.100}$$
and as a result,
$$\boldsymbol{\tau} = \mathbf{m}\times\mathbf{B}_\text{ext}, \tag{5.101}$$
$$\mathbf{F} = \nabla(\mathbf{m}\cdot\mathbf{B}_\text{ext}). \tag{5.102}$$
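The step from the vector potential (90) to the dipole field (99) may be verified numerically without any analytical differentiation. The Python sketch below is an illustration only: the dipole moment and the observation point are hypothetical, and the curl is approximated by central finite differences with an assumed step of 10^-5 m.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability, H/m

def cross(u, v):
    """Cartesian cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def vector_potential(m, r):
    """A(r) = (mu0 / 4 pi) m x r / r^3, Eq. (5.90)."""
    dist = math.sqrt(sum(x * x for x in r))
    return tuple(MU0 / (4 * math.pi) * comp / dist**3 for comp in cross(m, r))

def dipole_field(m, r):
    """B(r) = (mu0 / 4 pi) [3 r (r.m) - m r^2] / r^5, Eq. (5.99)."""
    r2 = sum(x * x for x in r)
    rm = sum(x * y for x, y in zip(r, m))
    return tuple(MU0 / (4 * math.pi) * (3 * x * rm - y * r2) / r2**2.5
                 for x, y in zip(r, m))

def curl_numeric(field, r, h=1e-5):
    """Central-difference curl of a vector field at point r."""
    def partial(i, j):
        # derivative of field component i with respect to coordinate j
        plus = list(r)
        minus = list(r)
        plus[j] += h
        minus[j] -= h
        return (field(plus)[i] - field(minus)[i]) / (2 * h)
    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))

m = (0.0, 0.3, 1.0)         # hypothetical dipole moment, A*m^2
point = (0.4, -0.2, 0.7)    # hypothetical observation point, m
B_formula = dipole_field(m, point)
B_curl = curl_numeric(lambda r: vector_potential(m, r), point)

print(B_formula)
print(B_curl)
```

The two printed vectors should coincide within the finite-difference error, for any observation point away from the origin.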
Now let us consider a system of many magnetic dipoles (e.g., atoms or molecules), distributed in
space with an atomic-scale-averaged density n. Then we can use Eq. (90) generalized in an evident way
for an arbitrary position r’ of the dipole, and the linear superposition principle, to calculate the
macroscopic vector potential A:
$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int\frac{\mathbf{M}(\mathbf{r}')\times(\mathbf{r}-\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|^3}\,d^3r', \tag{5.103}$$
where $\mathbf{M} \equiv n\mathbf{m}$ is the magnetization: the average magnetic moment per unit volume. Transforming this
integral absolutely similarly to how Eq. (3.27) had been transformed into Eq. (3.29), we get:
$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int\frac{\nabla'\times\mathbf{M}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^3r'. \tag{5.104}$$
Comparing this result with Eq. (28), we see that $\nabla\times\mathbf{M}$ is equivalent, in its magnetic effect, to the
density $\mathbf{j}_\text{ef}$ of a certain effective “magnetization current”. Just as the electric-polarization charge $\rho_\text{ef}$
discussed in Sec. 3.2 (see Fig. 3.4), the vector $\mathbf{j}_\text{ef} = \nabla\times\mathbf{M}$ may be interpreted as the uncompensated part
of the loop currents representing single magnetic dipoles m (see Fig. 12). Note, however, that since the
atomic magnetic dipoles may be due to particles’ spins, rather than the actual electric currents due to the
orbital motion, the magnetization current’s nature is not as direct as that of the polarization charge.
44 Note that the fixation of m and Bext effectively means that the currents producing them are fixed – please have
one more look at Eqs. (35) and (97). As a result, Eq. (100) is a particular case of Eq. (53) rather than (54) – hence
the minus sign.
Now, using Eq. (28) to add the possible contribution from the stand-alone currents j not
included in the currents of microscopic magnetic dipoles, we get the general expression for the vector
potential of the macroscopic field:
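$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int\frac{\mathbf{j}(\mathbf{r}') + \nabla'\times\mathbf{M}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^3r'. \tag{5.105}$$
Repeating the calculations that have led us from Eq. (28) to the Maxwell equation (35), with the account
of the magnetization current term, for the macroscopic magnetic field B we get
$$\nabla\times\mathbf{B} = \mu_0\left(\mathbf{j} + \nabla\times\mathbf{M}\right). \tag{5.106}$$
Following the same reasoning as in Sec. 3.2, we may recast this equation as
$$\nabla\times\mathbf{H} = \mathbf{j}, \tag{5.107}$$
where the field defined as
$$\mathbf{H} \equiv \frac{\mathbf{B}}{\mu_0} - \mathbf{M}, \tag{5.108}$$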
for historic reasons (and very unfortunately) is also called the magnetic field.45 This is why it is crucial
to remember that the physical sense of field H is very much different from field B. To understand this
difference better, let us bring Eqs. (3.32), (3.36), (29), and (107) together, writing them
as the following system of macroscopic Maxwell equations (again, so far for the stationary case $\partial/\partial t = 0$):46
$$\begin{aligned} \nabla\times\mathbf{E} &= 0, & \nabla\times\mathbf{H} &= \mathbf{j},\\ \nabla\cdot\mathbf{D} &= \rho, & \nabla\cdot\mathbf{B} &= 0. \end{aligned} \tag{5.109}$$
These equations clearly show that the roles of the vector fields D and H are very similar: they both may
be called “would-be fields”, meaning the fields that would be induced by the stand-alone charges $\rho$ and
currents j, if the medium had not modified them by its dielectric and magnetic polarization.
Despite this similarity, let me note an important difference of signs in the relation (3.33) between
E, D, and P, on one hand, and the relation (108) between B, H, and M, on the other hand. This is not
just a matter of definition. Indeed, due to the similarity of Eqs. (3.15) and (100), including similar signs,
the electric and magnetic fields both try to orient the corresponding dipole moments along the field.
Hence, in the media that allow such an orientation (and as we will see momentarily, for magnetic media
it is not always the case), the induced polarizations P and M are directed along, respectively, the vectors
E and B of the genuine (though macroscopic, i.e. atomic-scale-averaged) fields. According to Eq. (3.33),
if the would-be field D is fixed (say, by a fixed stand-alone charge distribution $\rho(\mathbf{r})$), such a
polarization reduces the electric field $\mathbf{E} = (\mathbf{D} - \mathbf{P})/\varepsilon_0$. On the other hand, Eq. (108) shows that in a
magnetic medium with a fixed would-be field H, the magnetic polarization making M parallel to B,
45 This confusion is exacerbated by the fact that in Gaussian units, Eq. (108) has the form $\mathbf{H} = \mathbf{B} - 4\pi\mathbf{M}$, and
hence the fields B and H have the same dimensionality (and are formally equal in free space!) – though the unit of
H has a different name (oersted, abbreviated as Oe). Mercifully, in the SI units, the dimensionality of B and H is
different, with the unit of H called the ampere per meter.
46 Let me remind the reader once again that in contrast with the system (36) of the Maxwell equations for the
genuine (microscopic) fields, the right-hand sides of Eqs. (109) represent only the stand-alone charges and
currents, not included in the microscopic electric and magnetic dipoles.
enhances the magnetic field $\mathbf{B} = \mu_0(\mathbf{H} + \mathbf{M})$. This difference may be traced back to the sign difference in
the basic relations (1) and (2), i.e. to the fundamental fact that the electric charges of the same sign
repel, while the currents of the same direction attract each other.
Plugging these relations into Eq. (108), we see that these two parameters are not independent, but are
related as
$$\mu = (1 + \chi_m)\,\mu_0. \tag{5.112}$$
Note that despite the superficial similarity between Eqs. (110)-(112) and the corresponding
relations (3.43)-(3.47) for linear dielectrics:
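$$\mathbf{D} = \varepsilon\mathbf{E}, \qquad \mathbf{P} = \chi_e\varepsilon_0\mathbf{E}, \qquad \varepsilon = (1 + \chi_e)\,\varepsilon_0, \tag{5.113}$$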
there is an important conceptual difference between them. Namely, while the vector E on the right-hand
sides of Eqs. (113) is the actual (though macroscopic) electric field, the vector H on the right-hand side
of Eqs. (110)-(111) represents a “would-be” magnetic field, in most aspects similar to D rather than E –
see, for example, Eqs. (109). This historic difference in the traditional form of the constitutive relations
for the electric and magnetic fields is not without its physical reasons. Most experiments with electric
and magnetic materials are performed by placing their samples into nearly uniform electric and
magnetic fields, and the simplest systems for their implementation are, respectively, plane capacitors
(Fig. 2.3) and long solenoids (Fig. 6). The field in the former system may be most conveniently
47 According to Eqs. (110) and (112), i.e. in the SI units, $\chi_m$ is dimensionless, while $\mu$ has the same dimensionality
as $\mu_0$. In the Gaussian units, $\mu$ is dimensionless: $(\mu)_\text{Gaussian} = (\mu)_\text{SI}/\mu_0$, and $\chi_m$ is also introduced differently, as $\mu = 1
+ 4\pi\chi_m$. Hence, just as for the electric susceptibilities, these dimensionless coefficients are different in the two
systems: $(\chi_m)_\text{SI} = 4\pi(\chi_m)_\text{Gaussian}$. Note also that $\chi_m$ is formally called the volumic magnetic susceptibility, to
distinguish it from the atomic (or “molecular”) susceptibility, defined by a similar linear relation between H and
the induced magnetic moment of a single dipole, e.g., an atom. (The atomic susceptibility is an analog of the electric atomic
polarizability; see Eq. (3.48) and its discussion.) In a dilute medium, i.e. in the absence of substantial dipole-dipole
interactions, $\chi_m$ equals the atomic susceptibility multiplied by the dipole density n.
controlled by fixing the voltage V between its plates, which is proportional to the electric field E. On the
other hand, the field provided by the solenoid may be fixed by the current I in it, and according to Eq.
(107), the field proportional to this stand-alone current is H, rather than B.48
Table 1 lists the approximate magnetic susceptibility values for several materials. It shows that
in contrast to linear dielectrics whose susceptibility $\chi_e$ is always positive, i.e. the dielectric constant $\kappa \equiv \varepsilon/\varepsilon_0 =
\chi_e + 1$ is always larger than 1 (see Table 3.1), linear magnetic materials may be either paramagnets
(with $\chi_m > 0$, i.e. $\mu > \mu_0$) or diamagnets (with $\chi_m < 0$, i.e. $\mu < \mu_0$).
Table 5.1. Susceptibility $(\chi_m)_\text{SI}$ of a few representative and/or important magnetic materials(a)
The reason for this difference is that in dielectrics, two different polarization mechanisms
(schematically illustrated by Fig. 3.7) lead to the same sign of the average polarization – see the
discussion in Sec. 3.3. One of these mechanisms, illustrated by Fig. 3.7b, i.e. the ordering of
spontaneous dipoles by the applied field, is also possible for magnetization – for the atoms and
molecules with spontaneous internal magnetic dipoles of magnitude $m_0 \sim \mu_\mathrm{B}$, due to their net spins.
Again, in the absence of an external magnetic field the spins, and hence the dipole moments m0 may be
disordered, but according to Eq. (100), the external magnetic field tends to align the dipoles along its
direction. As a result, the average direction of the spontaneous elementary moments m0, and hence the
direction of the arising magnetization M, is the same as that of the microscopic field B at the points of
the dipole location (i.e., for a dilute medium, of $\mathbf{H} \approx \mathbf{B}/\mu_0$), resulting in a positive susceptibility $\chi_m$, i.e. in
the paramagnetism, such as that of oxygen and aluminum – see Table 1.
48 This fact also explains the misleading term “magnetic field” for H.
Fig. 5.13. The torque-induced precession of a classical charged particle in a magnetic field.
A simple estimate (also left for the reader’s exercise) shows that in atoms with spontaneous non-
zero net spins, the magnetic dipole orientation mechanism prevails over the orbital diamagnetism, so the
materials incorporating such atoms usually exhibit net paramagnetism – see Table 1. Due to possible
strong quantum interaction between the spin dipole moments, the magnetism of such materials is rather
complex, with numerous interesting phenomena and elaborate theories. Unfortunately, all this physics is
well outside the framework of this course, and I have to refer the interested reader to special literature,52
but still will mention some key facts.
49 Named after Sir Joseph Larmor who was the first (in 1897) to describe this effect mathematically.
50 For a detailed discussion of this effect see, e.g., CM Sec. 4.5.
51 See, e.g., QM Sec. 6.4. Quantum mechanics also explains why in most common (s-) ground states, the average
contribution (95) of the orbital angular momentum L to the net vector m vanishes.
52 See, e.g., D. J. Jiles, Introduction to Magnetism and Magnetic Materials, 2nd ed., CRC Press, 1998, or R. C.
O’Handley, Modern Magnetic Materials, Wiley, 1999.
Most importantly, a sufficiently strong magnetic dipole-dipole interaction may lead to their
spontaneous ordering, even in the absence of the applied field. This ordering may correspond to either
parallel alignment of the dipoles (ferromagnetism) or anti-parallel alignment of the adjacent dipoles
(antiferromagnetism). Evidently, the external effects of ferromagnetism are stronger, because this phase
corresponds to a substantial spontaneous magnetization M even in the absence of an external magnetic
field. (The corresponding magnitude of $B = \mu_0 M$ is called the remanence field, $B_R$.) The direction of the
vector BR may be switched by the application of an external magnetic field, with a magnitude above a
certain value HC called coercivity, leading to the well-known hysteretic loops on the [H, B] plane (see
Fig. 14 for a typical example) – similar to those in ferroelectrics, already discussed in Sec. 3.3.
Just as the ferroelectrics, the ferromagnets may also be hard or soft – in the magnetic rather than
mechanical sense. In hard ferromagnets (also called permanent magnets), the dipole interaction is so
strong that B stays close to BR in all applied fields below HC, so the hysteretic loops are virtually
rectangular. Hence, in lower fields, the magnetization M of a permanent magnet may be considered
constant, with the magnitude $B_R/\mu_0$. Such hard ferromagnetic materials (notably, rare-earth compounds
such as SmCo$_5$, Sm$_2$Co$_{17}$, and especially Nd$_2$Fe$_{14}$B), with high remanence fields (~1 T) and high
coercivity (~$10^6$ A/m), have numerous practical applications.53 Let me give just two, most important
examples.
First, permanent magnets are the core components of most electric motors. By the way, this
venerable (~150-year-old) technology is currently experiencing a quiet revolution, driven mostly by the
electric car development. In the most advanced type of motors, called permanent-magnet synchronous
machines (PMSM), the remanence magnetic field BR of a permanent-magnet rotating part of the
machine (called the rotor) interacts with the magnetic field of ac currents passed through wire windings
in the external, static part of the motor (called the stator). The resulting torque may drive the rotor to
extremely high speeds, exceeding 10,000 rotations per minute, enabling the motor to deliver several
kilowatts of mechanical power from each kilogram of its mass.
As the second important example, despite the decades of the exponential (Moore’s-law) progress
of semiconductor electronics, most computer data storage systems (e.g., in data centers) are still based
53 Currently, the neodymium-iron-boron compound holds nearly 95% of the world permanent-magnet
application market, due to its combination of high $B_R$ and $H_C$ with lower fabrication costs.
on hard disk drives whose active media are submicron-thin layers of hard ferromagnets, with the data
bits stored in the form of the direction of the remanent magnetization of small film spots. This
technology has reached fantastic sophistication, with the recorded data density of the order of $10^{12}$ bits
per square inch.54 (Only recently it started to be seriously challenged by solid-state drives based on the
floating-gate semiconductor memories already mentioned in Chapter 3.) 55
In contrast, in soft ferromagnets, with their lower magnetic dipole interactions, the magnetization
is constant only inside each of the spontaneously formed magnetic domains, while the volume and shape
of the domains are affected by the applied magnetic field. As a result, the hysteresis loop’s shape of soft
ferromagnets is dependent on the cycled field’s amplitude and cycling history – see Fig. 14. At high
fields, their B (and hence M) is driven into saturation, with $B \approx B_R$, but at low fields, they behave
essentially as linear magnetics with very high values of $\chi_m$ and hence $\mu$; see the top rows of Table 1.
(The magnetic domain interaction, and hence the low-field susceptibility of such soft ferromagnets are
highly dependent on the material’s fabrication technology and its post-fabrication thermal and
mechanical treatments.) Due to these high values of $\mu$, soft ferromagnets, especially iron and its alloys
(e.g., various special steels), are extensively used in electrical engineering – for example in the cores of
transformers – see the next section.
Due to the relative weakness of the magnetic dipole interaction in some materials, their
ferromagnetic ordering may be destroyed by thermal fluctuations, if the temperature is increased above
some value called the Curie temperature TC, specific for each material. The transition between the
ferromagnetic and paramagnetic phases at T = TC is a classical example of a continuous phase
transition, with the average polarization M playing the role of the so-called order parameter that (in the
absence of external fields) becomes different from zero only at T < TC, increasing gradually at the
further temperature reduction.56
54 “A magnetic head slider [the read/write head – KKL] flying over a disk surface with a flying height of 25 nm
with a relative speed of 20 meters/second [all realistic parameters – KKL] is equivalent to an aircraft flying at a
physical spacing of 0.2 µm at 900 kilometers/hour.” B. Bhushan, as quoted in the (generally good) book by G.
Hadjipanayis, Magnetic Storage Systems Beyond 2000, Springer, 2001.
55 The high-frequency properties of hard ferromagnets are also very non-trivial. For example, according to Eq.
(101), an external magnetic field $\mathbf{B}_\text{ext}$ exerts torque $\boldsymbol{\tau} = \mathbf{M}\times\mathbf{B}_\text{ext}$ on the spontaneous magnetic moment M of a unit
volume of a ferromagnet. In some nearly-isotropic, mechanically fixed ferromagnetic samples, this torque causes
the precession, around the direction of $\mathbf{B}_\text{ext}$ (very similar to that illustrated in Fig. 13), of not the sample as such,
but of the magnetization M inside it, with a certain frequency $\omega_r$. If the frequency $\omega$ of an additional ac field
becomes very close to $\omega_r$, its absorption sharply increases (the so-called ferromagnetic resonance). Moreover, if
$\omega$ is somewhat higher than $\omega_r$, the effective magnetic permeability $\mu(\omega)$ of the material for the ac field may become
negative, enabling a series of interesting effects and practical applications. Very unfortunately, I could not find
time for their discussion in this series and have to refer the interested reader to literature, for example the
monograph by A. Gurevich and G. Melkov, Magnetization Oscillations and Waves, CRC Press, 1996.
56 In this series, a quantitative discussion of such transitions is given in SM Chapter 4.
the genuine (“microscopic”) Maxwell equations (36) in free space, i.e. when the genuine current density
j coincides with that of stand-alone currents. Then the macroscopic Maxwell equations (109) and the
linear constitutive equation (110) are satisfied with the pair of functions
$$\mathbf{H}(\mathbf{r}) = \frac{\mathbf{B}_0(\mathbf{r})}{\mu_0}, \qquad \mathbf{B}(\mathbf{r}) = \mu\mathbf{H}(\mathbf{r}) = \frac{\mu}{\mu_0}\mathbf{B}_0(\mathbf{r}). \tag{5.115}$$
Hence the only effect of the complete filling of a fixed-current system with a uniform, linear
magnetic medium is the change of the magnetic field B at all points by the same constant factor $\mu/\mu_0 \equiv 1
+ \chi_m$, which may be either larger or smaller than 1. (As a reminder, a similar filling of a system of fixed
stand-alone charges with a uniform, linear dielectric always leads to a reduction of the electric field E by
a factor of $\varepsilon/\varepsilon_0 \equiv 1 + \chi_e$; the physics of this difference was already discussed at the end of Sec. 4.)
However, this simple result is generally invalid in the case of nonuniform (or piecewise-uniform)
magnetic samples. To analyze this case, let us first integrate the macroscopic Maxwell equation (107)
along a closed contour C limiting a smooth surface S. Now using the Stokes theorem just as at the
derivation of Eq. (37), we get the macroscopic version of the Ampère law (37):
$$\oint_C \mathbf{H}\cdot d\mathbf{r} = I. \tag{5.116}$$
Let us apply this relation to a sharp boundary between two regions with different magnetic
materials, with no stand-alone currents on the interface, similarly to how this was done for the field E in
Sec. 3.4 – see Fig. 3.5. The result is similar as well:
$$H_\tau = \text{const}. \tag{5.117}$$
On the other hand, the integration of the Maxwell equation (29) over a Gaussian pillbox enclosing a
border fragment (again just as shown in Fig. 3.5 for the field D) yields a result similar to Eq. (3.35):
$$B_n = \text{const}. \tag{5.118}$$
For linear magnetic media, with $\mathbf{B} = \mu\mathbf{H}$, the latter boundary condition is reduced to
$$\mu H_n = \text{const}. \tag{5.119}$$
Let us use these boundary conditions, first of all, to see what happens with a long cylindrical
sample of a uniform magnetic material, placed parallel to a uniform external magnetic field B0 – see Fig.
15. Such a sample cannot noticeably disturb the field in the free space outside it, at most of its length:
$B_\text{ext} = B_0$, $H_\text{ext} = B_\text{ext}/\mu_0 = B_0/\mu_0 \equiv H_0$. Now applying Eq. (117) to the dominating surfaces of the sample, we get
$H_\text{int} = H_0$.57 For a linear magnetic material, these relations yield $B_\text{int} = \mu H_\text{int} = (\mu/\mu_0) B_0$.58 For the
high-$\mu$ media, this means that $B_\text{int} \gg B_0$. This effect may be vividly represented as the concentration of the
magnetic field lines in high-$\mu$ samples; see Fig. 15 again. (The concentration affects the external field
57 The independence of H on magnetic properties of the sample in this geometry explains why this field’s
magnitude is commonly used as the argument in the plots like Fig. 14: such measurements are typically carried
out by placing an elongated sample of the material under study into a long solenoid with a controllable current I,
so according to Eq. (116), H0 = nI, regardless of the sample.
58 The reader is highly encouraged to carry out a similar analysis of the fields inside narrow gaps cut in a linear
magnetic material, similar to that carried out in Sec. 3.3 for linear dielectrics; see Fig. 3.6 and its discussion.
distribution only at distances of the order of $(\mu/\mu_0)t \ll l$ near the sample’s ends.) Such concentration is
widely used in such practically important devices as transformers, in which two multi-turn coils are
wound on a ring-shaped (e.g., toroidal, see Fig. 6b) core made of a soft ferromagnetic material (such as
the transformer steel, see Table 1) with $\mu \gg \mu_0$. This minimizes the number of “stray” field lines, and
makes the magnetic flux piercing each wire turn of either coil virtually the same – the equality
important for the secondary voltage induction – see the next chapter.
Fig. 5.15. Magnetic field concentration in long, high-$\mu$ magnetic samples (schematically).
Samples of other geometries may create strong perturbations of the external field, extended to
distances of the order of the sample’s dimensions. To analyze such problems, we may benefit from a
simple, partial differential equation for a scalar function, e.g., the Laplace equation, because in Chapter
2 we have learned how to solve it for many simple geometries. In magnetostatics, the introduction of a
scalar potential is generally impossible due to the vortex-like magnetic field lines. However, if there are
no stand-alone currents within the region we are interested in, then the macroscopic Maxwell equation
(107) for the field H is reduced to $\nabla\times\mathbf{H} = 0$, similar to Eq. (1.28) for the electric field, showing that we
may introduce the scalar potential of the magnetic field, $\phi_m$, using a relation similar to Eq. (1.33):
$$\mathbf{H} = -\nabla\phi_m. \tag{5.120}$$
Combining it with the homogeneous Maxwell equation (29) for the magnetic field, $\nabla\cdot\mathbf{B} = 0$, and Eq.
(110) for a linear magnetic material, we arrive at a single differential equation, $\nabla\cdot(\mu\nabla\phi_m) = 0$. For a
uniform medium ($\mu(\mathbf{r})$ = const), it is reduced to our beloved Laplace equation:
$$\nabla^2\phi_m = 0. \tag{5.121}$$
Moreover, Eqs. (117) and (119) give us very familiar boundary conditions: the first of them
$$\frac{\partial\phi_m}{\partial\tau} = \text{const}, \tag{5.122a}$$
being equivalent to
$$\phi_m = \text{const}, \tag{5.122b}$$
while the second one giving
$$\mu\frac{\partial\phi_m}{\partial n} = \text{const}. \tag{5.123}$$
Indeed, these boundary conditions are absolutely similar to Eqs. (3.37) and (3.56) of electrostatics, with the
replacement $\varepsilon \to \mu$.59
59 This similarity may seem strange because earlier we have seen that the parameter $\mu$ is physically more similar
to $1/\varepsilon$. The reason for this paradox is that in magnetostatics, the magnetic potential $\phi_m$ is traditionally used to
Let us analyze the geometric effects on magnetization, first using the (too?) familiar structure: a
sphere, made of a linear magnetic material, placed into a uniform external field $\mathbf{H}_0 \equiv \mathbf{B}_0/\mu_0$. Since the
differential equation and the boundary conditions are similar to those of the corresponding electrostatics
problem (see Fig. 3.11 and its discussion), we can use the above analogy to reuse the solution we
already have – see Eqs. (3.63). Just as in the electric case, the field outside the sphere, with
$$\phi_m\big|_{r>R} = H_0\left(-r + \frac{\mu-\mu_0}{\mu+2\mu_0}\,\frac{R^3}{r^2}\right)\cos\theta, \tag{5.124}$$
is a sum of the uniform external field H0, with the potential $-H_0 r\cos\theta \equiv -H_0 z$, and the dipole field (99)
with the following induced magnetic dipole moment of the sphere:60
$$\mathbf{m} = 4\pi\,\frac{\mu-\mu_0}{\mu+2\mu_0}\,R^3\,\mathbf{H}_0. \tag{5.125}$$
On the contrary, the internal field is perfectly uniform, and directed along the external one:
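$$\phi_m\big|_{r<R} = -H_0\,\frac{3\mu_0}{\mu+2\mu_0}\,r\cos\theta, \quad\text{so that}\quad \frac{H_{int}}{H_0} = \frac{3\mu_0}{\mu+2\mu_0}, \quad \frac{B_{int}}{B_0} \equiv \frac{\mu H_{int}}{\mu_0 H_0} = \frac{3\mu}{\mu+2\mu_0}. \tag{5.126}$$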
Note that the field Hint inside the sphere is not equal to the applied external field H0. This
example shows that the interpretation of H as the “would-be” magnetic field generated by external
stand-alone currents j should not be exaggerated by saying that its distribution is independent of the
magnetic bodies in the system. In the limit $\mu \gg \mu_0$, Eqs. (126) yield $H_{int}/H_0 \ll 1$, $B_{int}/H_0 \approx 3\mu_0$, the
factor 3 being specific for the particular geometry of the sphere. If a sample is strongly stretched along
the applied field, with its length l much larger than the scale t of its cross-section, this geometric effect is
gradually decreased, and $B_{int}$ tends to its value $\mu H_0 \gg B_0$, as was discussed above – see Fig. 15.
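The saturation of $B_{int}/B_0$ at the geometric factor 3 is easy to see numerically. The following is a minimal sketch that just tabulates the two ratios of Eq. (126); the function name `sphere_ratios` and the sample values of $\mu/\mu_0$ are chosen here for illustration only.

```python
def sphere_ratios(mu_rel):
    """Return (H_int/H_0, B_int/B_0) for a linear magnetic sphere, Eq. (5.126).

    mu_rel is the relative permeability mu/mu_0 (an input assumed here).
    """
    h_ratio = 3.0 / (mu_rel + 2.0)           # 3*mu_0 / (mu + 2*mu_0)
    b_ratio = 3.0 * mu_rel / (mu_rel + 2.0)  # 3*mu / (mu + 2*mu_0)
    return h_ratio, b_ratio

# Illustrative values: free space, a modest magnetic, and two high-mu cases.
for mu_rel in (1.0, 10.0, 1e3, 1e5):
    h, b = sphere_ratios(mu_rel)
    print(f"mu/mu0 = {mu_rel:8.0f}:  H_int/H0 = {h:.5f},  B_int/B0 = {b:.5f}")
```

At $\mu = \mu_0$ both ratios equal 1, as they should for an absent sample, while at $\mu \gg \mu_0$ the internal field H is suppressed and $B_{int}/B_0$ approaches 3.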
Now let us calculate the field distribution in a similar, but slightly more complex (and practically
important) system: a round cylindrical shell, made of a linear magnetic material, placed into a uniform
external field H0 normal to its axis – see Fig. 16.
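[Fig. 5.16 (schematic): a round cylindrical shell with inner radius a and outer radius b in the external field H0, with the field Hint inside it; polar coordinates x = ρ cos φ, y = ρ sin φ.]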
describe the “would-be field” H, while in electrostatics, the potential $\phi$ describes the actual electric field E. (This
tradition persists from the days when H was perceived as a genuine magnetic field.)
60 To derive Eq. (125), we may either calculate the gradient of the $\phi_m$ given by Eq. (124), or use the similarity of
Eqs. (3.13) and (99), to derive from Eq. (3.17) a similar expression for the magnetic dipole’s potential:
$$\phi_m = \frac{1}{4\pi}\,\frac{m\cos\theta}{r^2}.$$
Now comparing this formula with the second term of Eq. (124), we immediately get Eq. (125).
Since there are no stand-alone currents in the region of our interest, we can again represent the
field H(r) by the gradient of the magnetic potential $\phi_m$ – see Eq. (120). Inside each of three constant-$\mu$
regions, i.e. at $\rho < a$, $a < \rho < b$, and $b < \rho$ (where $\rho$ is the 2D distance from the cylinder's axis), the
potential obeys the Laplace equation (121). In the convenient, polar coordinates (see Fig. 16), we may,
guided by the general solution (2.112) of the Laplace equation and our experience in its application to
axially-symmetric geometries, look for $\phi_m$ in the following form:
$$\phi_m = \begin{cases} \left(-H_0\rho + b_1'/\rho\right)\cos\varphi, & \text{for } b \le \rho, \\ \left(-a_1\rho + b_1/\rho\right)\cos\varphi, & \text{for } a \le \rho \le b, \\ -H_{int}\,\rho\cos\varphi, & \text{for } \rho \le a. \end{cases} \tag{5.127}$$
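Plugging this solution into the boundary conditions (122)-(123) at both interfaces ($\rho = b$ and
$\rho = a$), we get the following system of four equations:
$$\begin{aligned} -H_0 b + b_1'/b &= -a_1 b + b_1/b, & -a_1 a + b_1/a &= -H_{int}\,a, \\ \mu_0\left(-H_0 - b_1'/b^2\right) &= \mu\left(-a_1 - b_1/b^2\right), & \mu\left(-a_1 - b_1/a^2\right) &= -\mu_0 H_{int}, \end{aligned} \tag{5.128}$$
for four unknown coefficients $a_1$, $b_1$, $b_1'$, and $H_{int}$. Solving the system, we get, in particular:
$$\frac{H_{int}}{H_0} = \frac{c^2 - 1}{c^2 - (a/b)^2}, \quad\text{with } c \equiv \frac{\mu+\mu_0}{\mu-\mu_0}. \tag{5.129}$$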
According to these formulas, at $\mu > \mu_0$, the field in the free space inside the cylinder is lower
than the external field. This fact allows using such structures, made of high-$\mu$ materials such as
permalloy (see Table 1), for passive shielding61 from unintentional magnetic fields (e.g., the Earth's
field) – the task very important for the design of many physical experiments. As Eq. (129) shows, the
larger is $\mu$, the closer is c to 1, and the smaller is the ratio $H_{int}/H_0$, i.e. the better is the shielding, for the
same a/b ratio. On the other hand, for a given magnetic material, i.e. for a fixed parameter c, the
shielding is improved by making the ratio a/b < 1 smaller, i.e. the shield thicker. However, as
Fig. 16 shows, smaller a leaves less space for the shielded samples, calling for a compromise.
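The shielding result may be cross-checked numerically. The sketch below writes the four boundary conditions at $\rho = b$ and $\rho = a$ (continuity of $\phi_m$ and of $\mu\,\partial\phi_m/\partial\rho$) as a linear system for $b_1'$, $a_1$, $b_1$, and $H_{int}$, solves it by plain Gauss-Jordan elimination, and compares the outcome with Eq. (129). The helper names and the sample parameters are hypothetical, and the units are chosen so that $\mu_0 = 1$.

```python
def solve_linear(a, rhs):
    """Solve a small dense linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(n):
            if i != col:
                f = m[i][col] / m[col][col]
                for j in range(col, n + 1):
                    m[i][j] -= f * m[col][j]
    return [m[i][n] / m[i][i] for i in range(n)]

def shielding_numeric(mu_rel, a, b, h0=1.0):
    """H_int/H_0 from the four boundary conditions; unknowns are [b1', a1, b1, H_int]."""
    mu0, mu = 1.0, mu_rel
    matrix = [
        [1 / b, b, -1 / b, 0.0],               # phi_m continuous at rho = b
        [0.0, -a, 1 / a, a],                   # phi_m continuous at rho = a
        [-mu0 / b**2, mu, mu / b**2, 0.0],     # mu * d(phi_m)/d(rho) continuous at rho = b
        [0.0, -mu, -mu / a**2, mu0],           # mu * d(phi_m)/d(rho) continuous at rho = a
    ]
    rhs = [h0 * b, 0.0, mu0 * h0, 0.0]
    _, _, _, h_int = solve_linear(matrix, rhs)
    return h_int / h0

def shielding_formula(mu_rel, a, b):
    """Closed-form ratio of Eq. (5.129); requires mu_rel != 1."""
    c = (mu_rel + 1.0) / (mu_rel - 1.0)
    return (c**2 - 1.0) / (c**2 - (a / b) ** 2)

# Illustrative shield: mu/mu0 = 1000, with the wall taking 10% of the outer radius.
print(shielding_numeric(1000.0, 0.9, 1.0), shielding_formula(1000.0, 0.9, 1.0))
```

For these sample parameters, both routes give $H_{int}/H_0$ of about 0.02, i.e. a roughly 50-fold suppression of the external field.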
Note that in the limit $\mu/\mu_0 \to \infty$, both Eq. (126) and Eq. (129), describing different geometries,
yield $H_{int}/H_0 \to 0$. Indeed, as it follows from Eq. (119), in this limit, the field H tends to zero inside
magnetic samples of virtually any geometry. (The formal exception is the longitudinal cylindrical
geometry shown in Fig. 15, with $t/l \to 0$, where $H_{int} = H_0$ for any finite $\mu$, but even in it, the last equality
holds only if $t/l \ll \mu_0/\mu$.)
Now let us discuss a curious (and practically important) approach to systems with relatively thin,
closed magnetic cores made of several sections of high-$\mu$ magnetic materials, with the cross-section
areas $A_k$ much smaller than the squared lengths $l_k$ of the sections – see Fig. 17. If all $\mu_k \gg \mu_0$, virtually
all field lines are confined to the interior of the core. Then, applying the macroscopic Ampère law (116)
to a contour C that follows a magnetic field line inside the core (see, for example, the dashed line in Fig.
17), we get the following approximate expression (exactly valid only in the limit $\mu_k/\mu_0,\ l_k^2/A_k \to \infty$):
61 Another approach to the undesirable magnetic fields' reduction is the "active shielding" – the external field’s
compensation with the counter-field induced by controlled currents in specially designed wire coils.
$$\oint_C \mathbf{H}\cdot d\mathbf{l} \approx \sum_k l_k H_k \equiv \sum_k l_k\,\frac{B_k}{\mu_k} = NI. \tag{5.130}$$
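However, since the magnetic field lines stay in the core, the magnetic flux $\Phi_k \approx B_k A_k$ should be the same
($\equiv \Phi$) for each section, so $B_k = \Phi/A_k$. Plugging this condition into Eq. (130), we get the magnetic Ohm law and the expression for the reluctance:
$$\Phi = \frac{NI}{\sum_k \mathcal{R}_k}, \quad\text{where } \mathcal{R}_k \equiv \frac{l_k}{\mu_k A_k}. \tag{5.131}$$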
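[Fig. 5.17 (schematic): a closed magnetic core made of sections of length $l_k$, with N-turn windings carrying current I, and the integration contour C (dashed line).]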
Note a close analogy of the first of these equations with the usual Ohm law for several resistors
connected in series, with the magnetic flux playing the role of electric current, while the product NI, the
role of the voltage applied to the chain of resistors. This analogy is fortified by the fact that the second
of Eqs. (131) is similar to the expression for the resistance $R = l/\sigma A$ of a long, uniform conductor, with
the magnetic permeability $\mu$ playing the role of the electric conductivity $\sigma$. (To sound similar, but still
different from the resistance R, the parameter $\mathcal{R}$ is called reluctance.) This is why Eq. (131) is called
the magnetic Ohm law; it is very useful for approximate analyses of systems like ac transformers,
magnetic energy storage systems, etc.
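As a simple illustration of how Eq. (131) is used in practice, here is a minimal sketch that adds up the reluctances of a two-section core and finds the flux. All numbers (section lengths, relative permeabilities, cross-section areas, the number of turns, and the current) are made-up sample values, and the function names are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in H/m (classical SI value)

def reluctance(length, mu_rel, area):
    """Reluctance l / (mu * A) of one core section, the second of Eqs. (5.131)."""
    return length / (mu_rel * MU0 * area)

def core_flux(n_turns, current, sections):
    """Magnetic Ohm law: flux = N*I / (sum of the section reluctances).

    sections is a list of (length in m, mu/mu_0, area in m^2) tuples.
    """
    total = sum(reluctance(length, mu_rel, area) for length, mu_rel, area in sections)
    return n_turns * current / total

# Assumed example: two high-mu sections connected "in series" in a closed core.
sections = [(0.20, 4000.0, 1e-4), (0.10, 1000.0, 2e-4)]
phi = core_flux(n_turns=200, current=0.5, sections=sections)
for (length, mu_rel, area) in sections:
    print(f"section l = {length} m: B = {phi / area:.3f} T")
print(f"flux = {phi:.3e} Wb")
```

Just as with series resistors, the section with the largest $l_k/\mu_k A_k$ dominates the sum and hence limits the flux.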
Now let me proceed to a brief discussion of systems with permanent magnets. First of all, using
the definition (108) of the field H, we may rewrite the Maxwell equation (29) for the field B as
$$\nabla\cdot\mathbf{B} \equiv \mu_0\nabla\cdot(\mathbf{H} + \mathbf{M}) = 0, \quad\text{i.e. as}\quad \nabla\cdot\mathbf{H} = -\nabla\cdot\mathbf{M}. \tag{5.132}$$
While this relation is general, it is especially convenient in permanent magnets, where the magnetization
vector M may be approximately considered field-independent.62 In this case, Eq. (132) for H is an exact
analog of Eq. (1.27) for E, with the fixed term $-\nabla\cdot\mathbf{M}$ playing the role of the fixed charge density (more
exactly, of $\rho/\varepsilon_0$). For the scalar potential $\phi_m$, defined by Eq. (120), this gives the Poisson equation
$$\nabla^2\phi_m = \nabla\cdot\mathbf{M}, \tag{5.133}$$
similar to those solved, for quite a few geometries, in the previous chapters.
In the case when M is not only field-independent but also uniform inside a permanent magnet’s
volume, then the right-hand sides of Eqs. (132) and (133) vanish both inside the volume and in the
62 Note that in this approximation, there is no difference between the remanence magnetization $M_R \equiv B_R/\mu_0$, of the
surrounding free space, and give a non-zero effective charge only on the magnet’s surface. Integrating
Eq. (132) along a short path normal to the surface and crossing it, we get the following boundary
conditions:
$$\Delta H_n \equiv (H_n)_{\text{in free space}} - (H_n)_{\text{in magnet}} = M_n \equiv M\cos\theta, \tag{5.134}$$
where $\theta$ is the angle between the magnetization vector and the outer normal to the magnet’s surface.
This relation is an exact analog of Eq. (1.24) for the normal component of the field E, with the effective
surface charge density $\sigma$ (or rather $\sigma/\varepsilon_0$) equal to $M\cos\theta$.
This analogy between the magnetic field induced by a fixed, constant magnetization and the
electric field induced by surface electric charges enables one to reuse the solutions of quite a few
problems considered in Chapters 1-3. Leaving a few such problems for the reader's exercise (see Sec. 7),
let me demonstrate the power of this analogy on just two examples specific to magnetic systems. First,
let us calculate the force necessary to detach the flat ends of two long, uniform rod magnets, of length l
and cross-section area $A \ll l^2$, with the saturated remanent magnetization M0 directed along their
length – see Fig. 18.
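[Fig. 5.18 (schematic): two long rod magnets of cross-section area A, each with magnetization M0 directed along its length; what is the minimal detachment force Fmin?]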
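$$\Delta U \approx \frac{B^2}{2\mu_0}\,V \equiv \frac{(\mu_0 M_0)^2}{2\mu_0}\,At, \tag{5.135}$$
where $t$ is the width of the thin gap between the magnets' ends, so that $V = At$ is the gap's volume.
The gradient of this potential energy is equal to the attraction force $\mathbf{F} = -\nabla(\Delta U)$, trying to reduce ΔU by
decreasing the gap, with the following magnitude:
$$F = \left|\frac{\partial(\Delta U)}{\partial t}\right| = \frac{\mu_0 M_0^2 A}{2}. \tag{5.136}$$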
The magnet detachment requires an equal and opposite external force. For a typical permanent magnet,
with $\mu_0 M_0 \equiv B_R \sim 1$ T, the force corresponds to a ratio F/A close to $4\times10^5$ Pa, a few times the normal
atmospheric pressure.
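This estimate is a one-line calculation: with $\mu_0 M_0 \equiv B_R$, Eq. (136) gives $F/A = B_R^2/2\mu_0$. A minimal sketch follows; the function name and the standard-atmosphere value of 101325 Pa used for comparison are introduced here, not taken from the text above.

```python
import math

MU0 = 4e-7 * math.pi       # vacuum permeability in H/m (classical SI value)
ATMOSPHERE = 101325.0      # standard atmospheric pressure in Pa, for comparison only

def detachment_pressure(b_r):
    """F/A = mu_0 * M_0^2 / 2 = B_R^2 / (2 * mu_0), from Eq. (5.136), in Pa."""
    return b_r**2 / (2.0 * MU0)

p = detachment_pressure(1.0)  # remanence field B_R = 1 T
print(f"F/A = {p:.3e} Pa = {p / ATMOSPHERE:.2f} atm")
```

For $B_R = 1$ T this gives about $3.98\times10^5$ Pa, i.e. nearly 4 atmospheres.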
Now let us consider the situation when similar long permanent magnets (such as the magnetic
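needles used in magnetic compasses) are separated, in otherwise free space, by a larger distance $d \gg A^{1/2}$
– see Fig. 19. For each needle (Fig. 19a), of a length $l \gg A^{1/2}$, the right-hand side of Eq. (133) is
substantially different from zero only in two relatively small areas at the needle’s ends. Integrating the
equation over each area, we see that at distances $r \gg A^{1/2}$ from each end, we may reduce Eq. (132) to
$$\nabla\cdot\mathbf{H} = \frac{q_m}{\mu_0}\left[\delta(\mathbf{r}-\mathbf{r}_+) - \delta(\mathbf{r}-\mathbf{r}_-)\right], \quad\text{i.e.}\quad \nabla\cdot\mathbf{B} = q_m\left[\delta(\mathbf{r}-\mathbf{r}_+) - \delta(\mathbf{r}-\mathbf{r}_-)\right], \tag{5.137}$$
where $\mathbf{r}_\pm$ are the ends’ positions, and $q_m \equiv \mu_0 M_0 A$, with A being the needle’s cross-section area.63 This
expression for B is completely similar to Eq. (3.32) for the electric displacement D, for the particular
case of two equal and opposite point charges, i.e. with $\rho = q[\delta(\mathbf{r}-\mathbf{r}_+) - \delta(\mathbf{r}-\mathbf{r}_-)]$, with the only
replacement $q \to q_m$. Since we know the resulting electric field all too well (see, e.g., Eq. (1.7) for $\mathbf{E} \equiv \mathbf{D}/\varepsilon_0$),
we may immediately write a similar expression for the field H:
$$\mathbf{H}(\mathbf{r}) = \frac{q_m}{4\pi\mu_0}\left[\frac{\mathbf{r}-\mathbf{r}_+}{|\mathbf{r}-\mathbf{r}_+|^3} - \frac{\mathbf{r}-\mathbf{r}_-}{|\mathbf{r}-\mathbf{r}_-|^3}\right]. \tag{5.138}$$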
Fig. 5.19. (a) “Magnetic charges” at the ends of a thin permanent-magnet needle and (b) the result of its
breaking into two parts (schematically).
The resulting magnetic field H(r) exerts on another “magnetic charge” $q'_m$, located at some point
r’, the force $\mathbf{F} = q'_m\mathbf{H}(\mathbf{r}')$.64 Hence if two ends of different needles are separated by an intermediate
distance R ($A^{1/2} \ll R \ll l$, see Fig. 19b), we may neglect one term in Eq. (138), and get the following
“magnetic Coulomb law” for the interaction of the nearest ends:
$$\mathbf{F} = \frac{q_m q'_m}{4\pi\mu_0}\,\frac{\mathbf{R}}{R^3}. \tag{5.139}$$
The “only” (but conceptually, crucial!) difference between this interaction and that of the electric point
charges is that the two “magnetic charges” (quasi-monopoles) of a magnetic needle cannot be fully
separated. For example, if we break a needle in the middle in an attempt to bring its two ends further
apart, two new “point charges” appear – see Fig. 19b.
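The fact that the two “magnetic charges” always come in pairs means that far from the needle, its field must reduce to that of a point magnetic dipole. The sketch below checks this on the needle's axis, comparing Eq. (138) with the standard on-axis dipole field $H = 2m/4\pi z^3$ for the moment $m = M_0 A l$. The function names and the sample needle parameters are assumptions made for this illustration.

```python
import math

def h_axis_needle(z, length, m0, area):
    """On-axis H from Eq. (5.138), for magnetic charges +q_m and -q_m at z = +length/2 and -length/2.

    Valid for z > length/2. Since q_m / mu_0 = M0 * A, mu_0 drops out of the result.
    """
    strength = m0 * area / (4.0 * math.pi)
    return strength * (1.0 / (z - length / 2) ** 2 - 1.0 / (z + length / 2) ** 2)

def h_axis_dipole(z, length, m0, area):
    """On-axis H of a point dipole with the moment m = M0 * A * length."""
    moment = m0 * area * length
    return 2.0 * moment / (4.0 * math.pi * z**3)

# Illustrative (assumed) needle: 5 cm long, 1 mm^2 cross-section, mu_0 * M0 = 1 T.
MU0 = 4e-7 * math.pi
length, area, m0 = 0.05, 1e-6, 1.0 / MU0
for z in (0.05, 0.1, 0.5, 5.0):
    ratio = h_axis_needle(z, length, m0, area) / h_axis_dipole(z, length, m0, area)
    print(f"z = {z:5.2f} m: H_needle / H_dipole = {ratio:.5f}")
```

The ratio exceeds 1 close to the needle, where the nearer “charge” dominates, and tends to 1 at $z \gg l$.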
63 Note that the constant coefficient in the definition of $q_m$, and hence in Eqs. (138)-(139), is the matter of
convention. The above choice makes the free-space Maxwell equations $\nabla\cdot\mathbf{D} = \rho$ and $\nabla\cdot\mathbf{B} = \rho_m$ (where $\rho$ and $\rho_m$
are the volumic densities of the electric and magnetic charges) pleasantly symmetric.
64 This expression is the magnetic analog of the basic equation $\mathbf{F} = q'_e\mathbf{E}(\mathbf{r}')$ for the electric charges.
There are several solid-state systems where more flexible structures, similar in their
magnetostatics to the needles, may be implemented. First of all, certain (“type-II”) superconductors may
carry so-called Abrikosov vortices – flexible tubes with field-suppressed superconductivity inside, each
carrying one quantum $\Phi_0 = \pi\hbar/e \approx 2\times10^{-15}$ Wb of the magnetic flux. Ending on superconductor’s
surfaces, these tubes let their magnetic field lines spread into the surrounding free space, essentially
forming magnetic monopole analogs – of course, with equal and opposite “magnetic charges” qm on
each end of the tube – just as Fig. 19a shows. Such flux tubes are not only flexible but also stretchable,
resulting in several peculiar effects – see Sec. 6.4 for more detail. Another recently found example of
such paired quasi-monopoles is spin chains in the so-called spin ices – crystals with paramagnetic ions
arranged into a specific (pyrochlore) lattice – such as dysprosium titanate Dy2Ti2O7.65 Let me emphasize
again that any reference to magnetic monopoles in such systems should not be taken literally.
In order to complete this section (and this chapter), let me briefly discuss the magnetic field
energy U, for the simplest case of systems with linear magnetic materials. In this case, we still may use
Eq. (55), but if we want to operate only with macroscopic fields, and hence only stand-alone currents,
we should repeat the manipulations that have led us to Eq. (57), using j not from Eq. (35), but from Eq.
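(107). As a result, instead of Eq. (57) we get the magnetic field energy in a linear medium:
$$U = \int_V u(\mathbf{r})\,d^3r, \quad\text{with } u = \frac{\mathbf{B}\cdot\mathbf{H}}{2} = \frac{B^2}{2\mu} = \frac{\mu H^2}{2}, \tag{5.140}$$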
65 See, e.g., L. Jaubert and P. Holdsworth, J. Phys. – Cond. Matt. 23, 164222 (2011), and references therein.
66 Admittedly, we could get the same result more simply, just by arguing that since the magnetic material fills the
whole volume of a substantial magnetic field in this system, the filling simply increases the vector B at all points,
and hence its flux $\Phi$, and hence $L \equiv \Phi/I$ by the factor $\mu/\mu_0$ in comparison with the free-space value (75).
However, we still need to explore the issue of magnetic energy beyond Eq. (140), not only to get
a general expression for it in materials with an arbitrary dependence B(H), but also to finally prove Eq.
(54) and explore its relation with Eq. (53). I will do this at the beginning of the next chapter.
5.1. DC current I flows around a thin wire loop bent into the form of a plane equilateral triangle
with side a. Calculate the magnetic field in the center of the loop.
5.4. For the system studied in the previous problem, but now only in the limit d << w, calculate:
(i) the distribution of the magnetic field in space,
(ii) the vector potential of the field,
(iii) the magnetic force (per unit length) exerted on each strip, and
(iv) the magnetic energy and self-inductance of the loop formed by the strips (per unit length).
5.5. Calculate the magnetic field distribution near the center of the system of
two similar, plane, round, coaxial wire coils, carrying equal but oppositely directed
currents – see the figure on the right.
[Figure: two coaxial round coils of radius R, centered at z = ±d/2, carrying currents I.]
5.6. The two-coil system considered in the previous problem now carries
equal and similarly directed currents – see the figure on the right.67 Calculate what
should be the ratio d/R for the second derivative $\partial^2 B_z/\partial z^2$ to equal zero at z = 0.
[Figure: the same two coaxial coils of radius R, centered at z = ±d/2, carrying currents I.]
67 This Helmholtz coils system, producing a highly uniform field near its center, is broadly used in physical
experiments.
5.7. DC current of a constant density j flows along a round cylindrical wire of
radius R, with a round cylindrical cavity of radius r cut in it. The cavity’s axis is
parallel to that of the wire but offset from it by a distance d < R – r (see the figure on
the right). Calculate the magnetic field inside the cavity.
5.8. Calculate the magnetic field’s distribution along the axis of a straight
solenoid (see Fig. 6a, partly reproduced on the right) of a finite length l, and a
round cross-section of radius R. Assume that the solenoid has many (N >> 1, l/R)
wire turns, uniformly distributed along its length.
5.9. A thin round disk of radius R, carrying an electric charge of a constant areal density $\sigma$,
rotates about its axis with a constant angular velocity $\omega$. Calculate:
(i) the magnetic field on the disk’s axis,
(ii) the magnetic moment of the disk,
and relate these results.
5.10. A thin spherical shell of radius R, with charge Q uniformly distributed over its surface,
rotates about its diameter with a constant angular velocity $\omega$. Calculate the distribution of the magnetic
field everywhere in space.
5.11. A sphere of radius R, made of an insulating material with a uniform electric charge density
$\rho$, rotates about its diameter with a constant angular velocity $\omega$. Calculate the magnetic field distribution
inside the sphere and outside it.
5.12. A conducting sphere with no total electric charge is rotated about its diameter with a
constant angular velocity $\omega$, in a uniform constant external magnetic field B directed along the rotation
axis. Assuming that the sphere’s contribution to the magnetic field is negligibly small, calculate the
stationary distribution of the electric charge density inside the sphere and on its surface, and the
electrostatic potential both inside and outside the sphere. Quantify the above assumption.
5.14. The reader is hopefully familiar with the classical Hall effect in the usual rectangular Hall
bar geometry – see the left panel of the figure below. However, the effect takes a different form in the
so-called Corbino disk – see the right panel of the figure below. (Dark shading shows electrodes, with
no appreciable resistance.) Analyze the effect in both geometries, assuming that in both cases, the
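conductors are thin and planar, have a constant Ohmic conductivity $\sigma$ and charge carrier density n, and
that the applied magnetic field B is uniform and normal to conductors’ planes.
[Figure: left panel, a rectangular Hall bar of length l and width w, carrying current I in the field B; right panel, a Corbino disk with electrode radii R1 and R2, carrying current I in the field B.]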
5.15. A wire with a round cross-section of radius a has been bent into a round loop of radius R
>> a. Prove the formula for its self-inductance, which was mentioned at the end of Sec. 5.3 of the
lecture notes: $L = \mu_0 R\ln(cR/a)$, with c ~ 1.
5.19.* Use the classical picture of the orbital (“Larmor”) diamagnetism, discussed in Sec. 5, to
calculate its (small) contribution $\Delta B(0)$ to the magnetic field B felt by an atomic nucleus, treating the
electrons of the atom as a spherically symmetric cloud with an electric charge density $\rho(r)$. Express the
result via the value $\phi(0)$ of the electrostatic potential of the electron cloud, and use this expression for a
crude numerical estimate of the ratio $\Delta B(0)/B$ for the hydrogen atom.
5.22. Solve the magnetic shielding problem similar to that discussed in Sec. 5.6 of the lecture
notes, but for a spherical rather than cylindrical shell, with the same central cross-section as shown in
Fig. 16. Compare the efficiency of those two shields, for the same shell’s permeability $\mu$, and the same
b/a ratio.
5.23. Calculate the magnetic field’s distribution around a spherical permanent magnet with
uniform magnetization M0 = const.
5.24. A limited volume V is filled with a magnetic material with field-independent magnetization
M(r). Write explicit expressions for the magnetic field induced by the magnetization and its potential,
and recast these expressions into the forms that are more convenient when M(r) = M0 = const
throughout the volume.
5.26. A flat end of a long straight permanent magnet, similar to that considered in the previous
problem but with an arbitrary cross-section of area A, is stuck to a flat surface of a large sample of a
linear magnetic material with a very high permeability $\mu \gg \mu_0$. Calculate the normally directed force
needed to detach them.
5.27. A permanent magnet with a uniform magnetization M0 has the form of a spherical shell
with an internal radius R1 and an external radius R2 > R1. Calculate the magnetic field inside the shell.
5.28. A very broad film of thickness 2t is permanently magnetized normally to its plane, with a
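periodic checkerboard pattern, with the square of area $a\times a$:
$$\mathbf{M}\big|_{|z|<t} = \mathbf{n}_z M(x,y), \quad\text{with } M(x,y) = M_0\,\mathrm{sgn}\left(\cos\frac{\pi x}{a}\cos\frac{\pi y}{a}\right).$$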
Calculate the magnetic field’s distribution in space.
5.29.* Based on the discussion of the quadrupole electrostatic lens in Sec. 2.4, suggest the
permanent-magnet systems that may similarly focus particles moving close to the system’s axis, for the
cases when each particle carries:
(i) an electric charge,
(ii) no net electric charge, but a spontaneous magnetic dipole moment m of a certain orientation.
Chapter 6. Electromagnetism
This chapter discusses two major effects that arise when electric and magnetic fields change over time:
the “electromagnetic induction” of an additional electric field by changing the magnetic field, and the
reciprocal effect of the “displacement currents”– actually, the induction of an additional magnetic field
by changing electric field. These two phenomena, which make time-dependent electric and magnetic
fields inseparable (hence the term “electromagnetism”1), are reflected in the full system of Maxwell
equations, valid for an arbitrary electromagnetic process. On the way toward this system, I will make a
brief detour to review the electrodynamics of superconductivity, which (besides its own significance),
provides a perfect platform for discussion of the important general issue of gauge invariance.
$$\Phi \equiv \int_S B_n\,d^2r, \tag{6.1}$$
through a surface S limited by a closed contour C, changes in time for whatever reason (e.g., either due
to a change of the magnetic field B (as in Fig. 1), or the contour’s motion, or its deformation, or any
combination of the above), it induces an additional, vortex-like electric field Eind directed along the
contour – see Fig. 1.
Fig. 6.1. Two simplest ways to observe the Faraday electromagnetic induction.
The exact distribution of Eind in space depends on the system’s details, but its integral along the
contour C, called the inductive electromotive force (e.m.f.), obeys a very simple Faraday induction law:
1 It was coined by H. Ørsted in 1820 in the context of his experiments – see the previous chapter.
$$V_{ind} \equiv \oint_C \mathbf{E}_{ind}\cdot d\mathbf{r} = -\frac{d\Phi}{dt}. \tag{6.2}$$
(In the Gaussian units, the right-hand side of this formula has an additional coefficient of 1/c.)
It is straightforward (and hence left for the reader’s exercise) to show that this e.m.f. may be
measured, for example, either by inserting a voltmeter into a conducting loop following the contour C or
by measuring the small current I = Vind/R it induces in a thin wire with a sufficiently large Ohmic
resistance R,2 whose shape follows that contour – see Fig. 1. (Actually, these methods are not entirely
different, because a typical voltmeter measures voltage by the small Ohmic current it drives through the
pre-calibrated high internal resistance of the device.) In the context of the latter approach, the minus
sign in Eq. (2) may be described by the following Lenz rule: the magnetic field of the induced current I
provides a partial compensation of the change of the original flux $\Phi(t)$ with time.3
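To make Eq. (2) more tangible, here is a minimal numerical sketch for the simplest case of a fixed loop in a uniform, sinusoidally changing field, with the e.m.f. estimated as a finite-difference derivative of the flux. The field amplitude, loop area, frequency, and resistance are sample values assumed for this illustration, and the function names are hypothetical.

```python
import math

def flux(t, b0=0.1, area=1e-2, omega=2 * math.pi * 50):
    """Flux (in Wb) of a uniform field B(t) = b0*cos(omega*t) through a fixed loop of the given area."""
    return b0 * area * math.cos(omega * t)

def emf(t, dt=1e-7):
    """Eq. (6.2): V_ind = -dPhi/dt, estimated by a central finite difference."""
    return -(flux(t + dt) - flux(t - dt)) / (2 * dt)

t = 0.005                  # a quarter of the 50 Hz period, where |dPhi/dt| is the largest
v_ind = emf(t)
current = v_ind / 1e3      # I = V_ind / R for an assumed resistance R = 1 kOhm
print(f"V_ind = {v_ind:.6f} V, I = {current:.3e} A")
```

At this moment the flux is decreasing, so the e.m.f. is positive and equals the amplitude $B_0 A\omega$, about 0.314 V for these sample values.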
In order to recast Eq. (2) in a differential form, more convenient in many cases, let us apply to
the contour integral in it the Stokes theorem, which was repeatedly used in Chapter 5. The result is
Now combining Eqs. (1)-(3), for a contour C whose shape does not change in time (so that the
integration along it is interchangeable with the time derivative), we get
$$\int_S\left(\nabla\times\mathbf{E}_{ind} + \frac{\partial\mathbf{B}}{\partial t}\right)_n d^2r = 0. \tag{6.4}$$
Since the induced electric field is an addition to the gradient field (1.33) created by electric
charges, for the net field we may write $\mathbf{E} = \mathbf{E}_{ind} - \nabla\phi$. However, since the curl of any gradient field is
zero,4 $\nabla\times(\nabla\phi) = 0$, Eq. (4) remains valid even for the net field E. Since this equation should be correct
for any closed area S, we may conclude that the Faraday law holds in the differential form
$$\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0 \tag{6.5}$$
at any point. This is the final (time-dependent) form of this Maxwell equation. Superficially, it may look
that Eq. (5) is less general than Eq. (2); for example, it does not describe any electric field, and hence
any e.m.f. in a moving loop, if the field B is constant in time, even if the magnetic flux (1) through the
loop does change in time. However, this is not true; in Chapter 9 we will see that in the reference frame
moving with the loop, the e.m.f. does appear.5
2 Such induced current is sometimes called the eddy current, though most often this term is reserved for the
distributed currents induced by changing magnetic fields in bulk conductors – see Sec. 3 below.
3 Let me also hope that the reader is familiar with the paradox arising at attempts to measure $V_{ind}$ with a voltmeter
without its insertion into the wire loop; if not, I would highly recommend them to solve the offered Problem 2.
4 See, e.g., MA Eq. (11.1).
5 I have to admit that from the beginning of the course, I was carefully sweeping under the rug a very important
question: in what exactly reference frame(s) all the equations of electrodynamics are valid? I promise to discuss
this issue in detail later in the course (in Chapter 9), and for now would like to get away with a very short answer:
all the formulas discussed so far are valid in any inertial reference frame, as defined in classical mechanics – see,
e.g., CM Sec. 1.3; however, the fields E and B have to be measured in the same frame.
Now let us reformulate Eq. (5) in terms of the vector potential A. Since the induction effect does
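not alter the fundamental relation $\nabla\cdot\mathbf{B} = 0$, we still may represent the magnetic field as prescribed by
Eq. (5.27), i.e. as $\mathbf{B} = \nabla\times\mathbf{A}$. Plugging this expression into Eq. (5), and changing the order of the
temporal and spatial differentiation, we get
$$\nabla\times\left(\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t}\right) = 0. \tag{6.6}$$
Hence we can use the same argumentation as in Sec. 1.3 (there applied to the vector E alone) to
represent the expression in the parentheses as $-\nabla\phi$, so we get the fields expressed via the potentials:
$$\mathbf{E} = -\frac{\partial\mathbf{A}}{\partial t} - \nabla\phi, \quad \mathbf{B} = \nabla\times\mathbf{A}. \tag{6.7}$$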
It is very tempting to interpret the first term of the right-hand side of the expression for E as the
one describing the electromagnetic induction alone, and the second term as representing a purely
electrostatic field induced by electric charges. However, the separation of these two terms is, to a certain
extent, conditional. Indeed, let us consider the gauge transformation already mentioned in Sec. 5.2,
$$\mathbf{A} \to \mathbf{A} + \nabla\chi, \tag{6.8}$$
that, as we already know, does not change the magnetic field. According to Eq. (7), to keep the full
electric field intact (gauge-invariant) as well, the scalar electric potential has to be transformed
simultaneously, as
$$\phi \to \phi - \frac{\partial\chi}{\partial t}, \tag{6.9}$$
leaving the choice of an addition to $\phi$ restricted only by the Laplace equation – since the full $\phi$ should
discussion of the gauge invariance in Sec. 4.
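The invariance of both fields under the simultaneous transformation of the two potentials may be verified in two lines. In the sketch below, the gauge function is denoted $\chi$, the primes mark the transformed quantities (a notation introduced here), and the only facts used are that the spatial and temporal derivatives commute and that the curl of a gradient vanishes:

```latex
\begin{aligned}
\mathbf{E}' &= -\frac{\partial \mathbf{A}'}{\partial t} - \nabla\phi'
  = -\frac{\partial}{\partial t}\left(\mathbf{A} + \nabla\chi\right)
    - \nabla\left(\phi - \frac{\partial\chi}{\partial t}\right) \\
 &= -\frac{\partial \mathbf{A}}{\partial t} - \nabla\phi
    - \frac{\partial(\nabla\chi)}{\partial t} + \nabla\frac{\partial\chi}{\partial t}
  = \mathbf{E}, \\
\mathbf{B}' &= \nabla\times\left(\mathbf{A} + \nabla\chi\right)
  = \nabla\times\mathbf{A} + \nabla\times(\nabla\chi) = \mathbf{B}.
\end{aligned}
```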
6 As a reminder, the magnetic component of the Lorentz force (5.10), $\mathbf{v}\times\mathbf{B}$, is always perpendicular to the particle
velocity v, so the magnetic field B itself cannot perform any work on moving charges, i.e. on currents.
where the integral is over the volume of the system. Now expressing the current density j from the
macroscopic Maxwell equation (5.107), $\mathbf{j} = \nabla\times\mathbf{H}$, and then applying the vector algebra identity7
According to the divergence theorem, the second integral on the right-hand side of this equality is
equal to the flux of the so-called Poynting vector $\mathbf{S} \equiv \mathbf{E}\times\mathbf{H}$ through the surface limiting the considered
volume V. Later in the course we will see that this flux represents, in particular, the power of
electromagnetic radiation through the surface. If such radiation is negligible (as it always is if the field
variation is sufficiently slow), the surface may be selected sufficiently far, so that the flux of S vanishes.
In this case, we may express E from the Faraday induction law (5) to get
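$$\delta U = \delta t\int_V \frac{\partial\mathbf{B}}{\partial t}\cdot\mathbf{H}\,d^3r = \int_V \mathbf{H}\cdot\delta\mathbf{B}\,d^3r. \tag{6.13}$$
Just as in the electrostatics (see Eqs. (1.65) and (3.73), and their discussion), this relation may be
interpreted as the variation of the magnetic field energy U of the system, and represented in the form
$$\delta U = \int_V \delta u(\mathbf{r})\,d^3r, \quad\text{with } \delta u \equiv \mathbf{H}\cdot\delta\mathbf{B}. \tag{6.14}$$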
where V = Al is the cylinder’s volume. Now if we try to calculate the static (equilibrium) value of the
field from the minimum of this potential energy, we get evident nonsense: B = 0 (WRONG!).8
The situation may be readily rectified by using the notion of the Gibbs potential energy, just as it
was done for the electric field in Sec. 3.5 (and implicitly in the end of Sec. 1.3). According to Eq. (14),
in magnetostatics, the Cartesian components of the field H(r) play the role of the generalized forces,
while those of the field B(r), of the generalized coordinates (per unit volume).9 As the result, the Gibbs
potential energy, whose minimum corresponds to the stable equilibrium of the system under the effect of
a fixed generalized force (in our current case, of the fixed external field Hext), is
$$U_G = \int_V u_G(\mathbf{r})\,d^3r, \quad\text{with } u_G(\mathbf{r}) \equiv u(\mathbf{r}) - \mathbf{H}_{ext}(\mathbf{r})\cdot\mathbf{B}(\mathbf{r}), \tag{6.17}$$
– the expression parallel to Eq. (3.78). For a system with linear magnetics, we may use, for the energy
density u(r), our result (15), getting the following Gibbs energy’s density:
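$$u_G(\mathbf{r}) = \frac{1}{2\mu}\,\mathbf{B}\cdot\mathbf{B} - \mathbf{H}_{ext}\cdot\mathbf{B} = \frac{1}{2\mu}\left(\mathbf{B} - \mu\mathbf{H}_{ext}\right)^2 + \text{const}, \tag{6.18}$$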
where “const” means a term independent of the field B inside the sample. For our simple cylindrical
system, with its uniform fields, Eqs. (17)-(18) give the following full Gibbs energy of the sample:
$$U_G = \frac{\left(B_{int} - \mu H_{ext}\right)^2}{2\mu}\,V + \text{const}, \tag{6.19}$$
whose minimum immediately gives the correct stationary value $B_{int} = \mu H_{ext}$, i.e. $H_{int} \equiv B_{int}/\mu = H_{ext}$,
which was already obtained in Sec. 5.6 in a different way, from the boundary condition (5.117).
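The contrast between minimizing the energy density u and the Gibbs energy density $u_G$ may be illustrated by a brute-force numerical minimization. The sketch below uses a generic golden-section search (not a method used in these notes) and arbitrary dimensionless sample values $\mu = 5$ and $H_{ext} = 2$; all function names are hypothetical.

```python
import math

def energy_density(b, mu):
    """u = B^2 / (2*mu) for a linear magnetic, with scalar B along the cylinder's axis."""
    return b * b / (2.0 * mu)

def gibbs_density(b, h_ext, mu):
    """u_G = u - H_ext * B, as in Eqs. (6.17)-(6.18)."""
    return energy_density(b, mu) - h_ext * b

def golden_min(f, lo, hi, tol=1e-9):
    """Locate the minimum of a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi = d
        else:
            lo = c
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2.0

mu, h_ext = 5.0, 2.0  # arbitrary dimensionless sample values
b_naive = golden_min(lambda b: energy_density(b, mu), -100.0, 100.0)
b_gibbs = golden_min(lambda b: gibbs_density(b, h_ext, mu), -100.0, 100.0)
print(f"minimum of u:   B = {b_naive:.6f}  (the wrong result B = 0)")
print(f"minimum of u_G: B = {b_gibbs:.6f}  (expected mu * H_ext = {mu * h_ext})")
```

Only the Gibbs energy minimum reproduces the correct equilibrium value $B_{int} = \mu H_{ext}$.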
Now notice that with this result on hand, Eq. (18) may be rewritten in a different form:
$$u_G(\mathbf{r}) = \frac{1}{2\mu}\,\mathbf{B}\cdot\mathbf{B} - \frac{\mathbf{B}}{\mu}\cdot\mathbf{B} \equiv -\frac{B^2}{2\mu}, \tag{6.20}$$
similar to Eq. (15) for u(r), but with an opposite sign. This sign dichotomy explains that of Eqs. (5.53)
and (5.54); indeed, as was already noted in Sec. 5.3, the former of these expressions gives the
potential energy whose minimum corresponds to the equilibrium of a system with fixed currents. (In our
current example, these are the external stand-alone currents inducing the field Hext.) So, the energy Uj
given by Eq. (5.53) is essentially the Gibbs energy UG defined by Eqs. (17) and (for the equilibrium
state of linear magnetic media) by Eq. (20), while Eq. (5.54) is just another form of Eq. (15) – as was
explicitly shown in Sec. 5.3.10
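The difference between minimizing U and minimizing U_G for the uniformly magnetized cylinder may be checked numerically. The following minimal Python sketch compares the two minima; the values of μ, H_ext and V are arbitrary illustrative assumptions (not taken from the text), and the ternary search is just one convenient way to minimize a convex function.

```python
import math

MU0 = 4e-7 * math.pi      # vacuum permeability, H/m
mu = 100 * MU0            # assumed permeability of the linear magnetic
H_ext = 50.0              # assumed external field, A/m
V = 1e-6                  # assumed sample volume, m^3


def field_energy(B):
    """U = V * B^2 / (2 mu): the field energy of a uniform field in a linear medium."""
    return V * B * B / (2 * mu)


def gibbs_energy(B):
    """U_G = U - V * H_ext * B: uniform-field version of Eq. (17)."""
    return field_energy(B) - V * H_ext * B


def minimize_convex(f, lo, hi, iterations=200):
    """Ternary search for the minimum of a convex function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


B_from_U = minimize_convex(field_energy, -1.0, 1.0)
B_from_UG = minimize_convex(gibbs_energy, -1.0, 1.0)

print(f"minimum of U   at B = {B_from_U:.3e} T (the wrong answer B = 0)")
print(f"minimum of U_G at B = {B_from_UG:.6e} T")
print(f"mu * H_ext          = {mu * H_ext:.6e} T")
```

With these assumptions, the minimum of U sits at B = 0, while that of U_G lands at B = μH_ext, in agreement with Eq. (19).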
8 This erroneous result cannot be corrected by just adding the energy of the field outside the cylinder because in
the limit A → 0, this field is not affected by the internal field B.
9 Note one respect in which the analogy with electrostatics is not quite complete. Indeed, according to Eq. (3.76), in
electrostatics, the role of a generalized coordinate is played by the “would-be” field D, and that of the generalized
force, by the actual (if macroscopic) electric field E. This difference may be traced back to the fact that the
electric field E may perform work on a moving charged particle, while the magnetic field cannot. However, this
difference does not affect the full analogy of the expressions (3.73) and (15) for the field energy density in linear
media.
10 As was already noted in Sec. 5.4, one more example of the energy U_j (i.e. U_G) is given by Eq. (5.100).
Let me complete this section by stating that the difference between the energies U and UG is not
properly emphasized (or even left obscure) in some textbooks, so the reader is advised to get additional
clarity by solving a few additional simple problems – for example, by spelling out these energies for a
long straight solenoid (Fig. 5.6a), and then using the results to calculate the pressure exerted by the
magnetic field on the solenoid’s walls (windings) and the longitudinal forces exerted on its ends.
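If the conductor is uniform, i.e. the coefficients σ and μ are constant inside it, the whole system of Eqs.
(21)-(22) may be reduced to just one simple equation. Indeed, a sequential substitution of these
equations into each other, using a well-known vector-algebra identity12 in the middle, yields:
$$\frac{\partial\mathbf{B}}{\partial t} = -\nabla\times\mathbf{E} = -\frac{1}{\sigma}\nabla\times\mathbf{j} = -\frac{1}{\sigma}\nabla\times(\nabla\times\mathbf{H}) = -\frac{1}{\sigma\mu}\nabla\times(\nabla\times\mathbf{B}) = -\frac{1}{\sigma\mu}\left[\nabla(\nabla\cdot\mathbf{B}) - \nabla^2\mathbf{B}\right] = \frac{1}{\sigma\mu}\nabla^2\mathbf{B}. \tag{6.23}$$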
Thus we have arrived, without any further assumptions, at a rather simple partial differential
equation. Let us use it for an analysis of the so-called skin effect, the phenomenon of an Ohmic
conductor’s self-shielding from the alternating (ac) magnetic field. In its simplest geometry (Fig. 2a), an
11 Obviously, in free space, the last replacement is unnecessary, because all charges and currents may be treated as
“stand-alone” ones.
12 See, e.g., MA Eq. (11.3).
external source (which, at this point, does not need to be specified) produces, near a plane surface of a
bulk conductor, a spatially-uniform ac magnetic field H(0)(t) parallel to the surface.13
Fig. 6.2. (a) The skin effect in the simplest, planar geometry, and (b) two Ampère contours, C1 and C2, for deriving the “macroscopic” (C1) and the “coarse-grain” (C2) boundary conditions for H.
Selecting the coordinate system as shown in Fig. 2a, we may express this condition as
$$\mathbf{H}\big|_{x=-0} = H^{(0)}(t)\,\mathbf{n}_y. \tag{6.24}$$
The translational symmetry of our simple problem within the surface plane [y, z] implies that inside the
conductor, ∂/∂y = ∂/∂z = 0 as well, and H = H(x, t)n_y even at x ≥ 0, so Eq. (23) for the conductor’s
interior is reduced to a differential equation for just one scalar function H(x, t) = B(x, t)/μ:
$$\frac{\partial H}{\partial t} = \frac{1}{\sigma\mu}\frac{\partial^2 H}{\partial x^2}, \quad \text{for } x \ge 0. \tag{6.25}$$
This equation may be further simplified by noticing that due to its linearity, we may use the linear
superposition principle for the time dependence of the field,14 via expanding it, as well as the external
field (24), into the Fourier series:
$$H(x,t) = \sum_\omega H_\omega(x)\, e^{-i\omega t}, \ \text{for } x \ge 0; \qquad H^{(0)}(t) = \sum_\omega H^{(0)}_\omega e^{-i\omega t}, \ \text{for } x \le 0, \tag{6.26}$$
and arguing that if we know the solution for each frequency component of the series, the whole field
may be found through the straightforward summation (26) of these solutions.
For each single-frequency component, Eq. (25) is immediately reduced to an ordinary
differential equation for the complex amplitude H_ω(x):15
13 Due to the simple linear relation B = μH between the fields B and H, it does not matter too much which of
them is used for the solution of this problem, with a slight preference for H, due to the simplicity of Eq. (5.117) –
the only boundary condition relevant for this simple geometry.
14 Another way to exploit the linearity of Eq. (6.25) is to use the spatial-temporal Green’s function approach to
explore the dependence of its solutions on various initial conditions. Unfortunately, because of a lack of time, I
have to leave an analysis of this opportunity for the reader’s exercise.
15 Let me hope that the reader is not intimidated by the (very convenient) use of such complex variables for
describing real fields; their imaginary parts always disappear at the final summation (26). For example, if the
$$-i\omega H_\omega = \frac{1}{\sigma\mu}\frac{d^2 H_\omega}{dx^2}. \tag{6.27}$$
From the theory of linear ordinary differential equations, we know that Eq. (27) has the following
general solution:
$$H_\omega(x) = H_+ e^{\kappa_+ x} + H_- e^{\kappa_- x}, \tag{6.28}$$
where the constants κ_± are the roots of the characteristic equation that may be obtained by the
substitution of any of these two exponents into the initial differential equation. For our particular case,
the characteristic equation following from Eq. (27) is simply
$$\kappa^2 = -i\omega\mu\sigma, \tag{6.29}$$
and its roots are, obviously,
$$\kappa_\pm = \pm(-i)^{1/2}(\omega\mu\sigma)^{1/2} \equiv \pm\frac{1-i}{\sqrt{2}}(\omega\mu\sigma)^{1/2}. \tag{6.30}$$
For our problem, the field cannot grow exponentially at x → +∞, so only one of the coefficients,
namely the H_– corresponding to the decaying exponent, with Re κ_– < 0, may be different from zero, i.e.
H_ω(x) = H_ω(0)exp{κ_–x}. To find the constant factor H_ω(0), we can integrate the macroscopic Maxwell
equation ∇×H = j along a pre-surface contour – say, the contour C1 shown in Fig. 2b. The right-hand
side’s integral is negligible because the stand-alone current density j does not include the “genuinely-surface”
currents responsible for the magnetic permeability μ – see Fig. 5.12. As a result, we get the
boundary condition similar to Eq. (5.117) for the stationary magnetic field: H_τ = const at x = 0, giving
us
$$H(0,t) = H^{(0)}(t), \quad \text{i.e. } H_\omega(0) = H^{(0)}_\omega, \tag{6.31}$$
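so the final solution of our boundary problem may be represented as
$$H_\omega(x) = H^{(0)}_\omega \exp\{\kappa_- x\}, \quad \text{i.e. } H_\omega(x)\,e^{-i\omega t} = H^{(0)}_\omega \exp\left\{-\frac{x}{\delta_s}\right\}\exp\left\{-i\left(\omega t - \frac{x}{\delta_s}\right)\right\}, \tag{6.32}$$
where the constant δ_s, with the dimension of length, is called the skin depth:
$$\delta_s \equiv -\frac{1}{\operatorname{Re}\kappa_-} = \left(\frac{2}{\mu\sigma\omega}\right)^{1/2}. \tag{6.33}$$ [Skin depth]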
This solution describes the skin effect: the penetration of the ac magnetic field, and the eddy
currents j, into a conductor only to a finite depth of the order of δ_s. Let me give a few numerical
examples of this depth: for copper at room temperature, δ_s ≈ 1 cm at the usual ac power distribution
frequency of 60 Hz, and is of the order of just 1 μm at a few GHz, i.e. at typical frequencies of cell
phone signals and kitchen microwave magnetrons. On the other hand, for lightly salted water, δ_s is close
to 250 m at just 1 Hz (with significant implications for radio communications with submarines), and of
external field is purely sinusoidal, with the actual (positive) frequency ω, each sum in Eq. (26) has just two terms,
with complex amplitudes H_ω and H_–ω = H_ω*, so their sum is always real. (For a more detailed discussion of this
issue, see, e.g., CM Sec. 5.1.)
the order of 1 cm at a few GHz (explaining, in particular, the nonuniform heating of a soup bowl in a
microwave oven).16
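The numerical estimates above are easy to reproduce from Eqs. (30) and (33). Here is a minimal Python sketch; the conductivities used (about 5.96×10^7 S/m for copper at room temperature and about 4 S/m for salted water) are assumed handbook-style values rather than numbers given in the text, and μ = μ_0 is assumed for both materials.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # vacuum permeability, H/m


def kappa_minus(omega, mu, sigma):
    """Decaying root of Eq. (29), kappa^2 = -i*omega*mu*sigma, with Re(kappa) < 0."""
    return -cmath.sqrt(-1j * omega * mu * sigma)


def skin_depth(frequency_hz, sigma, mu=MU0):
    """Skin depth from Eq. (33): delta_s = -1 / Re(kappa_-)."""
    omega = 2 * math.pi * frequency_hz
    return -1.0 / kappa_minus(omega, mu, sigma).real


SIGMA_COPPER = 5.96e7     # S/m, assumed room-temperature value
SIGMA_SALT_WATER = 4.0    # S/m, assumed value for salted water

for label, sigma, f in [
    ("copper, 60 Hz", SIGMA_COPPER, 60.0),
    ("copper, 3 GHz", SIGMA_COPPER, 3e9),
    ("salted water, 1 Hz", SIGMA_SALT_WATER, 1.0),
    ("salted water, 3 GHz", SIGMA_SALT_WATER, 3e9),
]:
    print(f"{label:22s} delta_s = {skin_depth(f, sigma):.3e} m")
```

The function deliberately goes through the complex root κ_– rather than the closed-form square root, so it also checks that the two expressions in Eq. (33) agree.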
Let me hope that the equality chain (23) makes the physics of this effect very clear: the external
electric field E, which is Faraday-induced by an external ac magnetic field, drives the eddy currents j,
which in turn induce their own magnetic field that eventually (at x ~ s) compensates the external one.
Let us quantify these E and j. Since we have used, in particular, relations j = ∇×H = ∇×B/μ, and E =
j/σ, and spatial differentiation of an exponent yields a similar exponent, the electric field and current
density have the same spatial dependence as the magnetic field, i.e. penetrate the conductor only by
distances of the order of δ_s(ω). Their vectors are directed normally to B, while still being parallel to the
conductor’s surface:17
$$\mathbf{j}_\omega(x) = \kappa_- H_\omega(x)\,\mathbf{n}_z, \qquad \mathbf{E}_\omega(x) = \frac{\kappa_-}{\sigma} H_\omega(x)\,\mathbf{n}_z. \tag{6.34}$$
We may use these expressions, in particular, to calculate the time-averaged power density (4.39)
of the energy dissipation, for the important case of a sinusoidal (“monochromatic”) field H(x, t) = |H_ω(x)|
cos(ωt + φ), and hence sinusoidal eddy currents: j(x, t) = |j_ω(x)| cos(ωt + φ’):
$$\bar{p}(x) = \frac{|j_\omega(x)|^2}{2\sigma} = \frac{|\kappa_-|^2\,|H_\omega(x)|^2}{2\sigma} = \frac{|H_\omega(x)|^2}{\delta_s^2\,\sigma}. \tag{6.35}$$
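Now the (elementary) integration of this expression along the x-axis (through all the skin depth), using
the exponential law (6.32), gives us the following average power of the energy loss per unit area:
$$\frac{d\bar{P}}{dA} \equiv \int_0^\infty \bar{p}(x)\,dx = \frac{1}{2\delta_s\sigma}\left|H^{(0)}_\omega\right|^2 = \frac{\mu\omega\delta_s}{4}\left|H^{(0)}_\omega\right|^2. \tag{6.36}$$ [Energy loss at skin effect]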
We will extensively use this expression in the next chapter to calculate the energy losses in microwave
waveguides and resonators with conducting (practically, metallic) walls, and for now let me note only
that according to Eqs. (33) and (36), for a fixed magnetic field amplitude, the losses grow with
frequency as ω^(1/2).
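As a sanity check of Eq. (36) and of the ω^(1/2) scaling, one may integrate the local dissipation density (35) numerically. The Python sketch below does this with a simple midpoint rule; the copper-like conductivity, the 1 GHz frequency, and the unit field amplitude are illustrative assumptions only.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability, H/m


def skin_depth(omega, mu, sigma):
    """Eq. (33): delta_s = sqrt(2 / (mu * sigma * omega))."""
    return math.sqrt(2.0 / (mu * sigma * omega))


def loss_per_area_numeric(H0, omega, mu, sigma, n=200_000, depth_in_skin_depths=40.0):
    """Midpoint-rule integral of Eq. (35) over x, with |H(x)| = H0 * exp(-x/delta_s)."""
    delta = skin_depth(omega, mu, sigma)
    h = depth_in_skin_depths * delta / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        H_abs = H0 * math.exp(-x / delta)
        total += H_abs**2 / (delta**2 * sigma) * h
    return total


def loss_per_area_formula(H0, omega, mu, sigma):
    """Right-hand side of Eq. (36): mu * omega * delta_s * |H0|^2 / 4."""
    return mu * omega * skin_depth(omega, mu, sigma) * H0**2 / 4.0


sigma = 5.96e7            # S/m, assumed copper-like conductivity
omega = 2 * math.pi * 1e9  # rad/s, assumed frequency
H0 = 1.0                  # A/m, assumed field amplitude at the surface

numeric = loss_per_area_numeric(H0, omega, MU0, sigma)
formula = loss_per_area_formula(H0, omega, MU0, sigma)
scaling = loss_per_area_formula(H0, 4 * omega, MU0, sigma) / formula

print(f"numerical integral of (35): {numeric:.6e} W/m^2")
print(f"closed form (36):           {formula:.6e} W/m^2")
print(f"loss ratio at 4*omega vs omega: {scaling:.6f} (expected 2)")
```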
One more important remark concerning Eqs. (34): integrating the first of them over x, with the
help of Eq. (32), we may see that the linear density J of the surface currents (measured in A/m), is
simply and fundamentally related to the applied magnetic field:
$$\mathbf{J}_\omega \equiv \int_0^\infty \mathbf{j}_\omega(x)\,dx = -H^{(0)}_\omega\,\mathbf{n}_z. \tag{6.37}$$
Since this relation does not have any frequency-dependent factors, we may sum it up for all frequency
components, and get a universal relation
$$\mathbf{J}(t) = -H^{(0)}(t)\,\mathbf{n}_z = H^{(0)}(t)\,\mathbf{n}_y\times\mathbf{n}_x = \mathbf{H}^{(0)}(t)\times\mathbf{n}_x = \mathbf{H}^{(0)}(t)\times(-\mathbf{n}), \tag{6.38a}$$
16 Let me hope that the reader’s physical intuition makes it evident that the skin effect remains conceptually the
same for samples of any shape, besides possibly some quantitative details of the field distribution.
17 The loop (vortex) character of the induced current lines, responsible for the term “eddy”, is not very apparent in
the 1D geometry explored above, with the near-surface currents (Fig. 2b) looping only implicitly, at z → ±∞.
(where n = –n_x is the outer normal to the surface – see Fig. 2b) or, in a different form,
$$\Delta\mathbf{H}(t)\times\mathbf{n} = \mathbf{J}(t), \tag{6.38b}$$ [Coarse-grain boundary relation]
where ΔH is the full change of the field through the skin layer. This simple coarse-grain relation
(independent of the choice of coordinate axes) is also independent of the used constituent relations (22),
and is by no means accidental. Indeed, it may be readily obtained from the macroscopic Ampère law
(5.116), by applying it to a contour drawn around a fragment of the surface, extending under it
substantially deeper than the skin depth – see the contour C2 in Fig. 2b. Hence, Eq. (38) is valid
regardless of the exact law of the field penetration.
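The frequency independence of Eq. (37) may also be verified by brute force: integrating the complex current amplitude (34) over the depth should return –H_ω^(0) for any ω. The following Python sketch is only an illustration; the material parameters, the two test frequencies, and the complex field amplitude are arbitrary assumptions.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # vacuum permeability, H/m


def kappa_minus(omega, mu, sigma):
    """Decaying root of the characteristic equation (29)."""
    return -cmath.sqrt(-1j * omega * mu * sigma)


def surface_current_amplitude(H0, omega, mu, sigma, n=200_000, depth_in_skin_depths=40.0):
    """Midpoint-rule integral of j_omega(x) = kappa_- * H0 * exp(kappa_- * x), Eq. (34)."""
    kappa = kappa_minus(omega, mu, sigma)
    delta = -1.0 / kappa.real
    h = depth_in_skin_depths * delta / n
    total = 0j
    for k in range(n):
        x = (k + 0.5) * h
        total += kappa * H0 * cmath.exp(kappa * x) * h
    return total


sigma = 5.96e7      # S/m, assumed copper-like conductivity
H0 = 2.0 - 1.0j     # A/m, assumed complex amplitude of the surface field

J_low = surface_current_amplitude(H0, 2 * math.pi * 60.0, MU0, sigma)
J_high = surface_current_amplitude(H0, 2 * math.pi * 1e9, MU0, sigma)

print("J_omega at 60 Hz:", J_low)
print("J_omega at 1 GHz:", J_high)
print("expected -H0:    ", -H0)
```

Both integrals come out equal to –H_ω^(0) within the accuracy of the quadrature, even though the two skin depths differ by more than three orders of magnitude.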
For the skin effect, this fundamental relationship between the linear current density and the
external magnetic field implies that the skin effect’s implementation does not necessarily require a
dedicated ac magnetic field source. For example, the effect takes place in any wire that carries an ac
current, leading to a current’s concentration in a surface sheet of thickness ~δ_s. (Of course, the
quantitative analysis of this problem in a wire with an arbitrary cross-section may be technically
complicated, because it requires solving Eq. (23) for the corresponding 2D geometry; even for the round
cross-section, the solution involves the Bessel functions.) In this case, the ac magnetic field outside the
conductor, which still obeys Eq. (38), may be better interpreted as the effect, rather than the cause, of
the ac current flow.
Finally, please mind the limited validity of all the above results. First, for the quasistatic
approximation to be valid, the field frequency should not be too high, so the displacement current
effects are negligible. (Again, this condition will be quantified in Sec. 7 below; it will show that for
metals, the condition is violated only at extremely high frequencies above ~10^18 s^-1.) A more practical
upper limit on ω is that the skin depth δ_s should stay much larger than the mean free path l of charge
carriers,18 because beyond this point, the constituent relation between the vectors j(r) and E(r) becomes
essentially non-local. Both theory and experiment show that at δ_s below l, the skin effect persists, but
acquires a frequency dependence slightly different from Eq. (33): δ_s ∝ ω^(–1/3) rather than ω^(–1/2).
Historically, this anomalous skin effect has been very useful for the measurements of the Fermi surfaces
of metals.19
18 A discussion of the mean free path may be found, for example, in SM Chapter 6. In very clean metals at very
low temperatures, δ_s may approach l at frequencies as low as ~1 GHz, but at room temperature, the crossover
from the normal to the anomalous skin effect takes place only at ~100 GHz.
19 See, e.g., A. Abrikosov, Introduction to the Theory of Normal Metals, Academic Press, 1972.
20 Discovered experimentally in 1911 by Heike Kamerlingh Onnes.
magnetic field penetration at all. Experiment shows something substantially different: weak magnetic
fields do penetrate into superconductors by a material-specific distance δ_L ~ 10^-7–10^-6 m, the so-called
London’s penetration depth,21 which is virtually frequency-independent until the skin depth δ_s, of the
same material in its “normal” state, i.e. the absence of superconductivity, becomes less than δ_L. (This
crossover happens typically at frequencies ω ~ 10^13–10^14 s^-1.) The smallness of δ_L on the human scale
means that the magnetic field is pushed out from macroscopic samples at their transition into the
superconducting state.
This Meissner-Ochsenfeld effect, discovered experimentally in 1933,22 may be partly understood
using the following classical reasoning. Our discussion of the Ohm law in Sec. 4.2 implied that the
current’s (and hence the electric field’s) frequency is either zero or sufficiently low. In the classical
Drude reasoning, this is acceptable while ωτ << 1, where τ is the effective carrier scattering time
participating in Eqs. (4.12)-(4.13). If this condition is not satisfied, we should take into account the
charge carrier inertia; moreover, in the opposite limit ωτ >> 1, we may neglect the scattering altogether.
Classically, we can describe the charge carriers in such a “perfect conductor” as particles with a non-zero
mass m, which are accelerated by the electric field following the 2nd Newton law (4.11),
$$m\dot{\mathbf{v}} = \mathbf{F} = q\mathbf{E}, \tag{6.39}$$
so the current density j = qnv that they create, changes in time as
$$\dot{\mathbf{j}} = qn\dot{\mathbf{v}} = \frac{q^2 n}{m}\mathbf{E}. \tag{6.40}$$
In terms of the Fourier amplitudes of the functions j(t) and E(t), this means
$$-i\omega\,\mathbf{j}_\omega = \frac{q^2 n}{m}\mathbf{E}_\omega. \tag{6.41}$$
Comparing this formula with the relation j_ω = σE_ω implied in the last section, we see that we can use all
its results with the following replacement:
$$\sigma \to i\frac{q^2 n}{m\omega}. \tag{6.42}$$
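This change replaces the characteristic equation (29) with
$$\kappa^2\frac{m\omega}{iq^2 n} = -i\omega\mu, \quad \text{i.e. } \kappa^2 = \frac{\mu q^2 n}{m}, \tag{6.43}$$
i.e. replaces the skin effect with the field penetration by the following frequency-independent depth:
$$\delta \equiv -\frac{1}{\kappa_-} = \left(\frac{m}{\mu q^2 n}\right)^{1/2}. \tag{6.44}$$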
Superficially, this means that the field decay into the superconductor does not depend on frequency:
21 Named so to acknowledge the pioneering theoretical work of brothers Fritz and Heinz London – see below.
22 It is hardly fair to shorten this name to just the “Meissner effect”, as is frequently done, because of the
reportedly crucial contribution by Robert Ochsenfeld, then a student of Walther Meissner, to the discovery.
$$H(x,t) = H(0,t)\,e^{-x/\delta}, \tag{6.45}$$
thus explaining the Meissner-Ochsenfeld effect.
However, there are two problems with this result. First, for the parameters typical for good
metals (q = –e, n ~ 10^29 m^-3, m ~ m_e, μ ≈ μ_0), Eq. (44) gives δ ~ 10^-8 m, one or two orders of magnitude
lower than the experimental values of δ_L. Experiment also shows that the penetration depth diverges at T →
T_c, which is not predicted by Eq. (44).
The second, much more fundamental problem with Eq. (44) is that it has been derived for
ωτ >> 1. Even if we assume that somehow there is no scattering at all, i.e. τ = ∞, at ω → 0 both parts of the
characteristic equation (43) vanish, and we cannot make any conclusion about κ. This is not just a
mathematical artifact we could ignore. For example, let us place a non-magnetic metal into a static
external magnetic field at T > Tc. The field would completely penetrate the sample. Now let us cool it.
As soon as the temperature is decreased below Tc, the above calculations would become valid,
forbidding the penetration into the superconductor of any change of the field, so the initial field would
be “frozen” inside the sample. The Meissner-Ochsenfeld experiments have shown something completely
different: as T is lowered below Tc, the initial field is being expelled out of the sample.
The resolution of these contradictions is provided by quantum mechanics. As was explained in
1957 in a seminal work by J. Bardeen, L. Cooper, and J. Schrieffer (commonly referred to as the BCS
theory), superconductivity is due to the correlated motion of electron pairs, with opposite spins and
nearly opposite momenta. Such Cooper pairs, each with the electric charge q = –2e and zero spin, may
form only in a narrow energy layer near the Fermi surface, of a certain thickness Δ(T). This parameter
Δ(T), which may be also interpreted as the binding energy of the pair, tends to zero at T → T_c, while at T
<< T_c it has a virtually constant value Δ(0) ≈ 3.5 k_B T_c, of the order of a few meV for most
superconductors. This fact readily explains the relatively low spatial density of the Cooper pairs: n_p ~
nΔ(T)/ε_F ~ 10^26 m^-3. With the correction n → n_p, Eq. (44) for the penetration depth becomes
$$\delta_L = \left(\frac{m}{\mu q^2 n_p}\right)^{1/2}. \tag{6.46}$$ [London’s penetration depth]
This result diverges at T → T_c, and generally fits the experimental data reasonably well, at least for the
so-called “clean” superconductors with the mean free path l = v_F τ (where v_F ~ (2ε_F/m)^(1/2) is the r.m.s.
velocity of electrons on the Fermi surface) much longer than the Cooper pair size – see below.
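The two estimates quoted above, δ ~ 10^-8 m from Eq. (44) and δ_L ~ 10^-7–10^-6 m from Eq. (46), may be reproduced with a few lines of Python. In this sketch, the carrier densities are the order-of-magnitude values from the text; taking the pair mass as 2m_e and μ = μ_0 are assumptions made here for the estimate.

```python
import math

MU0 = 4e-7 * math.pi           # vacuum permeability, H/m
E_CHARGE = 1.602176634e-19     # elementary charge, C
M_E = 9.1093837015e-31         # electron mass, kg


def penetration_depth(m, q, n, mu=MU0):
    """Eqs. (44) and (46): delta = sqrt(m / (mu * q^2 * n))."""
    return math.sqrt(m / (mu * q**2 * n))


# Single electrons with the full metallic density n ~ 1e29 m^-3, as in Eq. (44)
delta_classical = penetration_depth(M_E, -E_CHARGE, 1e29)

# Cooper pairs with n_p ~ 1e26 m^-3, as in Eq. (46);
# the pair mass 2*m_e is an assumption of this sketch
delta_london = penetration_depth(2 * M_E, -2 * E_CHARGE, 1e26)

print(f"Eq. (44), single electrons: {delta_classical:.2e} m")
print(f"Eq. (46), Cooper pairs:     {delta_london:.2e} m")
```

Because δ_L scales as n_p^(-1/2), the reduction of the carrier density by three orders of magnitude is what moves the estimate into the experimentally observed range.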
The smallness of the coupling energy Δ(T) is also a key factor in the explanation of the
Meissner-Ochsenfeld effect. Because of Heisenberg’s quantum uncertainty relation δr δp ~ ℏ, the spatial
extension of the Cooper-pair’s wavefunction (the so-called coherence length of the superconductor) is
relatively large: ξ ~ δr ~ ℏ/δp ~ ℏv_F/Δ(T) ~ 10^-6 m. As a result, n_p ξ^3 >> 1, meaning that the
wavefunctions of the pairs are strongly overlapped in space. Due to their integer spin, Cooper pairs
behave like bosons, which means in particular that at low temperatures they exhibit the so-called Bose-
Einstein condensation onto the same ground energy level ε_g.23 This means that the quantum frequency
23 A quantitative discussion of the Bose-Einstein condensation of bosons may be found in SM Sec. 3.4, though
the full theory of superconductivity is more complicated because it has to describe the condensation taking place
simultaneously with the formation of effective bosons (Cooper pairs) from fermions (single electrons). For a
ω = ε_g/ℏ of the time evolution of each pair’s wavefunction Ψ = ψ exp{–iωt} is exactly the same and that
the phases φ of the wavefunctions, defined by the relation
$$\psi = |\psi|\, e^{i\varphi}, \tag{6.47}$$
coincide, so the electric current is carried not by individual Cooper pairs but rather by their Bose-
Einstein condensate described by a single wavefunction (47). Due to this coherence, the quantum effects
(which are, in the usual Fermi-gases of single electrons, masked by the statistical spread of their
energies, and hence of their phases), become very explicit – “macroscopic”.
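The strong overlap of the Cooper pairs’ wavefunctions, n_p ξ^3 >> 1, follows from simple arithmetic. The Python sketch below repeats the uncertainty-relation estimate ξ ~ ℏv_F/Δ; the Fermi velocity v_F ~ 10^6 m/s and the binding energy Δ ~ 1 meV are assumed typical values, used only to get the orders of magnitude.

```python
HBAR = 1.054571817e-34     # reduced Planck constant, J*s
EV = 1.602176634e-19       # one electron-volt, J

v_F = 1e6                  # m/s, assumed typical Fermi velocity
Delta = 1e-3 * EV          # J, assumed pair binding energy of ~1 meV
n_p = 1e26                 # m^-3, Cooper pair density quoted in the text

# Coherence length from the uncertainty relation: xi ~ hbar * v_F / Delta
xi = HBAR * v_F / Delta

# Number of pairs whose wavefunctions share one coherence volume
pairs_per_coherence_volume = n_p * xi**3

print(f"coherence length xi ~ {xi:.2e} m")
print(f"n_p * xi^3          ~ {pairs_per_coherence_volume:.2e}")
```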
To illustrate this, let us write the well-known quantum-mechanical formula for the probability
current density of a free, non-relativistic particle,24
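$$\mathbf{j}_w = \frac{i\hbar}{2m}\left(\psi\nabla\psi^* - \text{c.c.}\right) \equiv \frac{1}{2m}\left[\psi^*(-i\hbar\nabla)\psi + \text{c.c.}\right], \tag{6.48}$$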
where c.c. means the complex conjugate of the previous expression. Now let me borrow one result that
will be proved later in this course (in Sec. 9.7) when we discuss the analytical mechanics of a charged
particle moving in an electromagnetic field. Namely, to account for the magnetic field effects, the
particle’s kinetic momentum p ≡ mv (where v ≡ dr/dt is the particle’s velocity) has to be distinguished
from its canonical momentum,25
$$\mathbf{P} \equiv \mathbf{p} + q\mathbf{A}, \tag{6.49}$$
where A is the field’s vector potential defined by Eq. (5.27). In contrast with the Cartesian components
pj = mvj of the kinetic momentum p, the canonical momentum’s components are the generalized
momenta corresponding to the Cartesian components rj of the radius-vector r, considered as generalized
coordinates of the particle: P_j = ∂L/∂v_j, where L is the particle’s Lagrangian function. According to the
general rules of transfer from classical to quantum mechanics,26 it is the vector P whose operator (in the
coordinate representation) equals –iℏ∇, so the operator of the kinetic momentum p = P – qA is –iℏ∇ –
qA. Hence, to account for the magnetic field27 effects, we should replace –iℏ∇ with –iℏ∇ – qA in Eq. (48), getting
$$\mathbf{j}_w = \frac{1}{2m}\left[\psi^*(-i\hbar\nabla - q\mathbf{A})\psi + \text{c.c.}\right]. \tag{6.51}$$
This expression becomes more transparent if we take the wavefunction in form (47); then
detailed, but still very readable coverage of the physics of superconductors, I can recommend the reader the
monograph by M. Tinkham, Introduction to Superconductivity, 2nd ed., McGraw-Hill, 1996.
24 See, e.g., QM Sec. 1.4, in particular Eq. (1.47).
25 I am sorry to use traditional notations p and P for the momenta – the same symbols which were used for the
electric dipole moment and polarization in Chapter 3. I hope there will be no confusion because the latter notions
are not used in this section.
26 See, e.g., CM Sec. 10.1, in particular Eq. (10.26).
27 The account of the electric field is easier, because the related energy qϕ of the particle (with ϕ being the scalar potential) may be directly included
in the potential energy operator.
$$\mathbf{j}_w = \frac{\hbar}{m}|\psi|^2\left(\nabla\varphi - \frac{q}{\hbar}\mathbf{A}\right). \tag{6.52}$$
This relation means, in particular, that in order to keep jw gauge-invariant, the transformation (8)-(9) has
to be accompanied by a simultaneous transformation of the wavefunction’s phase:
$$\varphi \to \varphi + \frac{q}{\hbar}\chi. \tag{6.53}$$
It is fascinating that the quantum-mechanical wavefunction (or more exactly, its phase) is not gauge-
invariant, meaning that you may change it in your mind – at your free will! Again, this does not change
any observable (such as j_w or the probability density ψψ*), i.e. any experimental results.
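The gauge invariance of the probability current may be illustrated numerically in one dimension. The Python sketch below assumes the standard form of the gauge transformation of the vector potential, A → A + ∇χ, together with the phase shift φ → φ + (q/ℏ)χ, and uses units with ℏ = m = q = 1; all test functions are arbitrary smooth choices made for this illustration.

```python
import cmath
import math

HBAR = M = Q = 1.0  # assumed units with hbar = m = q = 1


def amplitude(x):          # |psi(x)|, arbitrary smooth positive function
    return 1.0 + 0.3 * math.sin(x)


def phase(x):              # wavefunction phase, arbitrary smooth function
    return 0.7 * x + 0.2 * math.cos(2 * x)


def vector_potential(x):   # x-component of A, arbitrary smooth function
    return 0.5 * math.sin(3 * x)


def gauge_function(x):     # chi(x), arbitrary smooth function
    return x * x + math.sin(x)


def derivative(f, x, h=1e-5):
    """Central finite difference; works for real- and complex-valued f."""
    return (f(x + h) - f(x - h)) / (2 * h)


def psi(x):
    return amplitude(x) * cmath.exp(1j * phase(x))


def psi_transformed(x):
    """Wavefunction with the phase shifted by (q/hbar)*chi."""
    return psi(x) * cmath.exp(1j * Q * gauge_function(x) / HBAR)


def vector_potential_transformed(x):
    """A + d(chi)/dx, the assumed gauge-transformed potential."""
    return vector_potential(x) + derivative(gauge_function, x)


def probability_current(wavefunction, potential, x):
    """(1/2m)[psi* (-i hbar d/dx - qA) psi + c.c.] = Re[psi* (...) psi] / m."""
    w = wavefunction(x)
    kinetic = -1j * HBAR * derivative(wavefunction, x) - Q * potential(x) * w
    return (w.conjugate() * kinetic).real / M


def current_from_phase(x):
    """(hbar/m)|psi|^2 (d(phase)/dx - (q/hbar) A)."""
    return HBAR / M * amplitude(x) ** 2 * (derivative(phase, x) - Q * vector_potential(x) / HBAR)


for x in (-1.3, 0.0, 0.8, 2.5):
    j_old = probability_current(psi, vector_potential, x)
    j_new = probability_current(psi_transformed, vector_potential_transformed, x)
    print(f"x = {x:5.2f}: j = {j_old:.8f}, after gauge change = {j_new:.8f}, "
          f"phase formula = {current_from_phase(x):.8f}")
```

The three columns agree within the finite-difference accuracy: the current is unchanged by the simultaneous transformation of A and φ, and it coincides with the phase-gradient expression.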
Now for the electric current density of the whole superconducting condensate, Eq. (52) yields
the following constitutive relation:
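$$\mathbf{j} \equiv \mathbf{j}_w q n_p = \frac{\hbar q n_p}{m}|\psi|^2\left(\nabla\varphi - \frac{q}{\hbar}\mathbf{A}\right). \tag{6.54}$$ [Supercurrent density]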
The formula shows that this supercurrent may be induced by the dc magnetic field alone and does not
require any electric field. Indeed, for the simple 1D geometry shown in Fig. 2a, j(r) = j(x)n_z, A(r) = A(x)
n_z, and ∂/∂z = 0, so the Coulomb gauge condition (5.48) is satisfied for any choice of the gauge function
χ(x). For the sake of simplicity we can choose this function to provide φ(r) ≡ const,28 so
$$\mathbf{j} = -\frac{q^2 n_p}{m}\mathbf{A} \equiv -\frac{1}{\mu\delta_L^2}\mathbf{A}, \tag{6.55}$$
where δ_L is given by Eq. (46), and the field is assumed to be small and hence not affecting the
probability |ψ|^2 (here normalized to 1 in the absence of the field). This is the so-called London equation,
proposed (in a different form) by F. and H. London in 1935 for the Meissner-Ochsenfeld effect’s
explanation. Combining it with Eq. (5.44), generalized for a linear magnetic medium by the replacement
μ_0 → μ, we get
$$\nabla^2\mathbf{A} = \frac{1}{\delta_L^2}\mathbf{A}. \tag{6.56}$$
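For our 1D geometry, this simple differential equation, similar to Eq. (23), has an exponential solution
similar to Eq. (32):
$$A(x) = A(0)\exp\left\{-\frac{x}{\delta_L}\right\}, \quad B(x) = B(0)\exp\left\{-\frac{x}{\delta_L}\right\}, \quad j(x) = j(0)\exp\left\{-\frac{x}{\delta_L}\right\}, \tag{6.57}$$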
which shows that the magnetic field and the supercurrent penetrate into a superconductor only by
London’s penetration depth δ_L, regardless of frequency.29 By the way, integrating the last result through
the penetration layer, and using the vector potential’s definition, B = ∇×A (for our geometry, giving
28 This is the so-called London gauge; for our simple geometry, it is also the Coulomb gauge (5.48).
29 Since at T > 0, not all electrons in a superconductor form Cooper pairs, at any frequency ω ≠ 0 the unpaired
electrons provide energy-dissipating Ohmic currents, which are not described by Eq. (54). These losses become
very substantial when the frequency becomes so high that the skin-effect length δ_s of the material becomes less
than δ_L. For typical metallic superconductors, this crossover takes place at frequencies of a few hundred GHz, so
even for microwaves, Eq. (57) still gives a fairly accurate description of the field penetration.
B(x) = –dA(x)/dx = A(x)/δ_L for its y-component) we may readily verify that the linear density J of the surface supercurrent
still satisfies the universal coarse-grain relation (38).
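This verification may be spelled out numerically. The Python sketch below integrates the London current (55) over the penetration layer and compares the result with the surface field; the values of A(0) and δ_L are arbitrary assumptions, and the sign convention B_y = –dA_z/dx (following from B = ∇×A for A directed along n_z) is the one assumed here.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability, H/m (mu = mu_0 is assumed)


def surface_supercurrent(A0, delta_L, mu=MU0, n=100_000, depth_in_delta=40.0):
    """Midpoint-rule integral of j(x) = -A(x) / (mu * delta_L^2), Eq. (55), over x >= 0."""
    h = depth_in_delta * delta_L / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        A = A0 * math.exp(-x / delta_L)      # Eq. (57)
        total += -A / (mu * delta_L**2) * h
    return total


A0 = 1e-9          # T*m, assumed vector potential at the surface
delta_L = 1e-7     # m, assumed London penetration depth

J = surface_supercurrent(A0, delta_L)

# Surface field from B_y = -dA_z/dx = A/delta_L (assumed sign convention), H = B/mu
H_surface = A0 / (MU0 * delta_L)

print(f"J (z-component)    = {J:.6e} A/m")
print(f"-H (surface field) = {-H_surface:.6e} A/m")
```

Within these assumptions, J = –H(0) along n_z, the same coarse-grain relation that was obtained for the skin effect.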
This universality should bring to our attention the following common feature of the skin effect
(in “normal” conductors) and the Meissner-Ochsenfeld effect (in superconductors): if the linear size of a
bulk sample is much larger than, respectively, δ_s or δ_L, then B = 0 in the dominating part of its interior.
According to Eq. (5.110), a formal description of such conductors (valid only on a coarse-grain scale
much larger than either δ_s or δ_L), may be achieved by formally treating the sample as an ideal
diamagnet, with μ = 0. In particular, we can use this description and Eq. (5.124) to immediately obtain
the magnetic field’s distribution outside of a bulk sphere:
$$\mathbf{B} = \mu_0\mathbf{H} = -\mu_0\nabla\phi_m, \quad \text{with } \phi_m = -H_0\left(r + \frac{R^3}{2r^2}\right)\cos\theta, \quad \text{for } r \ge R. \tag{6.58}$$
Figure 3 shows the corresponding surfaces of equal potential φ_m. It is evident that the magnetic
field lines (which are normal to the equipotential surfaces) bend to become parallel to the surface near it.
This pattern also helps to answer the question that might arise at making the assumption (24):
what happens to bulk conductors placed into a normal ac magnetic field – and to superconductors in a
normal dc magnetic field as well? The answer is: the field is deformed outside of the conductor to
sustain the following coarse-grain boundary condition:30
$$B_n\big|_{\text{surface}} = 0, \tag{6.59}$$ [Coarse-grain boundary condition]
which follows from Eq. (5.118) and the coarse-grain requirement B|_inside = 0.
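It is straightforward to check that the potential (58) indeed satisfies the boundary condition (59). The Python sketch below differentiates φ_m numerically at the sphere’s surface and far from it; the values of H_0 and R are arbitrary assumptions, and the central difference at r = R samples the formula slightly inside the sphere purely as a mathematical device.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability, H/m


def phi_m(r, theta, H0, R):
    """Scalar potential of Eq. (58), valid outside the sphere (r >= R)."""
    return -H0 * (r + R**3 / (2 * r**2)) * math.cos(theta)


def field_components(r, theta, H0, R, h=1e-6):
    """B_r and B_theta from B = -mu0 * grad(phi_m), via central differences."""
    dr = h * R
    dphi_dr = (phi_m(r + dr, theta, H0, R) - phi_m(r - dr, theta, H0, R)) / (2 * dr)
    dphi_dtheta = (phi_m(r, theta + h, H0, R) - phi_m(r, theta - h, H0, R)) / (2 * h)
    return -MU0 * dphi_dr, -MU0 * dphi_dtheta / r


H0 = 1.0     # A/m, assumed applied field
R = 0.01     # m, assumed sphere radius

for theta_deg in (0, 30, 60, 90):
    theta = math.radians(theta_deg)
    B_r, B_theta = field_components(R, theta, H0, R)
    print(f"theta = {theta_deg:3d} deg: B_r/(mu0 H0) = {B_r / (MU0 * H0):+.2e}, "
          f"B_theta/(mu0 H0) = {B_theta / (MU0 * H0):+.4f}")

# Far from the sphere, the field should approach the uniform applied field
B_r_far, B_theta_far = field_components(100 * R, math.radians(60), H0, R)
print(f"far field: B_r/(mu0 H0) = {B_r_far / (MU0 * H0):.5f} (cos 60 deg = 0.5)")
```

The normal component vanishes at r = R for all polar angles, while the tangential component there is 1.5 times larger than the applied field times sin θ, reflecting the crowding of the field lines near the sphere’s equator.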
This answer should be taken with reservations. For normal conductors, it is only valid at
sufficiently high frequencies where the skin depth (33) is relatively small: δ_s << a, where a is the scale
of the conductor’s linear size – for a sphere, a ~ R. In superconductors, this simple picture requires not
only that δ_L << a, but also that the magnetic field is relatively low because strong fields do penetrate
30 Sometimes this boundary condition, as well as the (compatible) Eq. (38), are called “macroscopic”. However,
this term may lead to confusion with the genuine macroscopic boundary conditions (5.117)-(5.118), which also
ignore the atomic-scale microstructure of the “effective currents” j_ef = ∇×M, but (as was shown earlier in this
section) still allow explicit, detailed accounts of the skin-current (34) and supercurrent (55) distributions.
superconductors, destroying superconductivity (either completely or partly), and as a result violating the
Meissner-Ochsenfeld effect – see the next section.
Fig. 6.4. (a) A closed, flux-quantizing superconducting ring, (b) a ring with a narrow slit,
and (c) a Superconducting QUantum Interference Device (SQUID).
From the last section’s discussion, we know that deep inside the wire the supercurrent is
exponentially small. Integrating Eq. (54) along any closed contour C that does not approach the surface
closer than a few δ_L at any point (see the dashed line in Fig. 4), so with j = 0 at all its points, we get
$$\oint_C \nabla\varphi\cdot d\mathbf{r} - \frac{q}{\hbar}\oint_C \mathbf{A}\cdot d\mathbf{r} = 0. \tag{6.60}$$
The first integral, i.e. the difference of φ in the initial and final points, has to be equal to either zero or
an integer number of 2π because the change φ → φ + 2πn does not change the Cooper pair
condensate’s wavefunction:
$$\psi' \equiv |\psi|\, e^{i(\varphi + 2\pi n)} = |\psi|\, e^{i\varphi} \equiv \psi. \tag{6.61}$$
On the other hand, according to Eq. (5.65), the second integral in Eq. (60) is just the magnetic flux Φ
through the contour.32 Hence we get a wonderful result:
31 The material of this section is not covered in most E&M textbooks, and will not be used in later sections of this
course. Thus the “only” loss due to the reader’s skipping this section would be the lack of familiarity with one of
the most fascinating fields of physics. Note also that we already have virtually all formal tools necessary for its
discussion, so reading this section should not require much effort.
32 Due to the Meissner-Ochsenfeld effect, the exact path of the contour is not important, and we may discuss Φ
just as the magnetic flux through the ring.
$$\Phi = n\Phi_0, \quad \text{where } \Phi_0 \equiv \frac{2\pi\hbar}{|q|}, \quad \text{with } n = 0, \pm 1, \pm 2, \ldots, \tag{6.62}$$ [Magnetic flux quantization]
saying that the magnetic flux inside any superconducting loop can only take values multiple of the flux
quantum Φ_0. This effect, predicted in 1950 by the same Fritz London (who expected q to be equal to the
electron charge –e), was observed experimentally in 1961,33 but with |q| = 2e – so Φ_0 ≈ 2.07×10^-15 Wb.
Historically, this observation gave decisive support to the BCS theory of superconductivity (implying
Cooper pairs with charge q = –2e) that had been put forward just four years earlier.
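The value of the flux quantum, and the factor of two that distinguished the 1961 experiments from London’s original expectation, follow directly from Eq. (62). In the Python sketch below, the loop area of 1 cm^2 and the field of 50 μT (of the order of the Earth’s field) are assumptions introduced only to show how large the quantum number n typically is.

```python
import math

HBAR = 1.054571817e-34       # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19   # elementary charge, C


def flux_quantum(q):
    """Eq. (62): Phi_0 = 2*pi*hbar / |q|."""
    return 2 * math.pi * HBAR / abs(q)


phi0_pairs = flux_quantum(-2 * E_CHARGE)    # Cooper pairs, q = -2e
phi0_single = flux_quantum(-E_CHARGE)       # London's original expectation, q = -e

# Illustration with assumed numbers: a 1 cm^2 loop in a 50 microtesla field
loop_flux = 50e-6 * 1e-4                    # Wb
n_quanta = round(loop_flux / phi0_pairs)

print(f"Phi_0 for q = -2e: {phi0_pairs:.4e} Wb")
print(f"Phi_0 for q = -e:  {phi0_single:.4e} Wb")
print(f"flux quanta in the assumed loop: about {n_quanta:.2e}")
```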
Note the truly macroscopic character of this quantum effect: it has been repeatedly observed in
human-scale superconducting loops, and from what is known about superconductors, there is no doubt
that if we had made a giant superconducting wire loop extending, say, over the Earth’s equator, the
magnetic flux through it would still be quantized – though with a very large flux quanta number n. This
means that the quantum coherence of Bose-Einstein condensates may extend over, using H. Casimir’s
famous expression, “miles of dirty lead wire”. (Lead is a typical superconductor, with T_c ≈ 7.2 K, and
indeed retains its superconductivity even being highly contaminated by impurities.)
Moreover, hollow rings are not entirely necessary for flux quantization. In 1957, A. Abrikosov
explained the counter-intuitive high-field behavior of superconductors with δ_L > ξ√2, known
experimentally as their mixed (or “Shubnikov”) phase since the 1930s. He showed that a sufficiently
high magnetic field may penetrate such superconductors in the form of self-formed magnetic field
“threads” (or “tubes”) surrounded by vortex-shaped supercurrents – the so-called Abrikosov vortices. In
the simplest case, the core of such a vortex is a straight line, on which the superconductivity is
completely suppressed (ψ = 0), surrounded by circular, axially-symmetric, persistent supercurrents
j(ρ), where ρ is the distance from the vortex axis – see Fig. 5a. At the axis, the current vanishes, and
with the growth of ρ, it first rises and then falls (with j(∞) = 0), reaching its maximum at ρ ~ ξ, while
the magnetic field B(ρ), directed along the vortex axis, is largest at ρ = 0, and drops monotonically at
distances of the order of δ_L (Fig. 5b).
Fig. 6.5. The Abrikosov vortex: (a) a 3D structure’s sketch, and (b) the main variables as functions of the distance ρ from the axis (schematically).
The total flux of the field equals exactly one flux quantum Φ_0, given by Eq. (62).
Correspondingly, the wavefunction’s phase φ performs just one ±2π revolution along any contour
drawn around the vortex’s axis, so ∇φ = ±n_ϕ/ρ, where n_ϕ is the azimuthal unit vector.34 This topological
feature of the wavefunction’s phase is sometimes called fluxoid quantization – to distinguish it from
33 Independently and virtually simultaneously by two groups: B. Deaver and W. Fairbank, and R. Doll and M.
Näbauer; their reports were published back-to-back in the same issue of the Physical Review Letters.
34 The last (perhaps, evident) expression for $\nabla\varphi$ follows from MA Eq. (10.2) with $f = \pm\varphi + \text{const}$.
magnetic flux quantization, which is valid only for relatively large contours, not approaching the axis by
distances $\sim\delta_L$.
A quantitative analysis of Abrikosov vortices requires, besides the equations we have discussed,
one more constituent relation that would describe the suppression of the number of Cooper pairs
(quantified by $|\psi|^2$) by the magnetic field – or rather by the field-induced supercurrent. In his original
work, Abrikosov used for this purpose the famous Ginzburg-Landau equation,35 which is quantitatively
valid only at $T \approx T_c$. The equation may be conveniently represented in either of the following two forms:
$$\frac{1}{2m}\left(-i\hbar\nabla - q\mathbf{A}\right)^2\psi = a\psi - b\psi|\psi|^2, \qquad \xi^2\psi^*\left(\nabla - i\frac{q}{\hbar}\mathbf{A}\right)^2\psi = \left(|\psi|^2 - 1\right)|\psi|^2, \tag{6.63}$$
where a and b are certain temperature-dependent coefficients, with $a \to 0$ at $T \to T_c$. The first of these
forms clearly shows that the Ginzburg-Landau equation (as well as the similar Gross-Pitaevskii equation
describing electrically-neutral Bose-Einstein condensates) belongs to a broader class of nonlinear
Schrödinger equations, differing from the usual Schrödinger equation, which is linear in $\psi$, only by the
additional nonlinear terms. The equivalent, second form of Eq. (63) is more convenient for applications
and shows more clearly that if the superconductor’s condensate density, proportional to $|\psi|^2$, is
suppressed only locally, it self-restores to its unperturbed value (with $|\psi|^2 = 1$) at the distances of the
order of the coherence length $\xi \equiv \hbar/(2ma)^{1/2}$.
This fact enables a simple quantitative analysis of the Abrikosov vortex in the most important
limit $\xi \ll \delta_L$. Indeed, as Fig. 5 shows, in this case $|\psi|^2 = 1$ at most distances ($\rho \sim \delta_L$) where the field
and current are distributed, so these distributions may be readily calculated without any further
involvement of Eq. (63), just from Eq. (54) with $\nabla\varphi = \pm\mathbf{n}_\varphi/\rho$, and the Maxwell equations (21) for the
magnetic field, giving $\nabla\times\mathbf{B} = \mu_0\mathbf{j}$, and $\nabla\cdot\mathbf{B} = 0$. Indeed, combining these equations just as this was
done at the derivation of Eq. (23), for the only Cartesian component of the vector $\mathbf{B}(\mathbf{r}) = B(\rho)\mathbf{n}_z$ (where
the z-axis is directed along the vortex’ symmetry axis), we get a simple equation
$$\delta_L^2\nabla^2 B - B = \mp\frac{2\pi\hbar}{q}\,\delta_2(\boldsymbol{\rho}), \quad \text{at } \rho \gg \xi, \tag{6.64}$$
which coincides with Eq. (56) at all regular points $\rho \neq 0$. Spelling out the Laplace operator for our
current case of axial symmetry,36 we get an ordinary differential equation,
$$\delta_L^2\,\frac{1}{\rho}\frac{d}{d\rho}\left(\rho\frac{dB}{d\rho}\right) - B = 0, \quad \text{for } \rho \neq 0. \tag{6.65}$$
Comparing this equation with Eq. (2.155) with $\nu = 0$, and taking into account that we need the solution
decreasing at $\rho \to \infty$, making any contribution proportional to the function $I_0$ unacceptable, we get
35 This equation was derived by Vitaly Lazarevich Ginzburg and Lev Davidovich Landau from phenomenological
arguments in 1950, i.e. before the advent of the “microscopic” BCS theory, and may be used for simple analyses
of a broad range of nonlinear effects in superconductors. The Ginzburg-Landau and Gross-Pitaevskii equations
will be further discussed in SM Sec. 4.3.
36 See, e.g., MA Eq. (10.3) with $\partial/\partial\varphi = \partial/\partial z = 0$.
$$B = C K_0\!\left(\frac{\rho}{\delta_L}\right) \tag{6.66}$$
– see the plot of this Bessel function on the right panel of Fig. 2.22 (black line). The constant C should
be calculated by fitting the 2D delta function on the right-hand side of Eq. (64), i.e. by requiring
that the total flux of the field has the magnitude $\Phi_0$:
$$\int B(\rho)\,d^2\rho \equiv 2\pi\int_0^\infty B(\rho)\,\rho\,d\rho \equiv 2\pi\delta_L^2 C\int_0^\infty K_0(\zeta)\,\zeta\,d\zeta = \Phi_0. \tag{6.67}$$
The last, dimensionless integral equals 1,37 so
$$B(\rho) = \frac{\Phi_0}{2\pi\delta_L^2}K_0\!\left(\frac{\rho}{\delta_L}\right), \quad \text{at } \rho \gg \xi. \tag{6.68}$$
So the magnetic field of the vortex drops exponentially at distances $\rho$ much larger than $\delta_L$, and
diverges at $\rho \to 0$ – see, e.g., the second of Eqs. (2.157). However, this divergence is very slow
(logarithmic), and, as was repeatedly discussed in this series, is avoided by the account of virtually any
other factor. In our current case, this factor is the decrease of $|\psi|^2$ to zero at $\rho \sim \xi$ (see Fig. 5), not taken
into account in Eq. (68). As a result, we may estimate the field on the axis of the vortex as
$$B(0) \approx \frac{\Phi_0}{2\pi\delta_L^2}\ln\frac{\delta_L}{\xi}; \tag{6.69}$$
the exact (and much more involved) solution of the problem confirms this estimate with a minor
correction: $\ln(\delta_L/\xi) \to \ln(\delta_L/\xi) - 0.28$, i.e. $\xi \to 1.3\,\xi$.
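The normalization of the solution (66) may be checked numerically. The following Python fragment is only an illustrative sketch, not a part of the original derivation: it evaluates $K_0$ from its standard integral representation $K_0(x) = \int_0^\infty \exp(-x\cosh t)\,dt$ (so that no external library is needed) and estimates the dimensionless integral $\int_0^\infty K_0(\zeta)\zeta\,d\zeta$ that fixes the constant C via the total flux $2\pi\delta_L^2 C\int_0^\infty K_0(\zeta)\zeta\,d\zeta$. The function names, cutoffs, and grid steps are arbitrary choices.

```python
import math

def k0(x, n=400):
    """Modified Bessel function K0(x), x > 0, from K0(x) = int_0^inf exp(-x cosh t) dt."""
    # Upper limit chosen so that the integrand there is exp(-x - 40), i.e. negligible.
    t_max = math.acosh(1.0 + 40.0 / x)
    h = t_max / n
    # Trapezoidal rule on [0, t_max].
    total = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(t_max)))
    for i in range(1, n):
        total += math.exp(-x * math.cosh(i * h))
    return total * h

def flux_integral(z_max=20.0, dz=0.02):
    """Midpoint-rule estimate of int_0^z_max K0(z) z dz (dimensionless)."""
    steps = int(round(z_max / dz))
    return sum(k0((i + 0.5) * dz) * (i + 0.5) * dz * dz for i in range(steps))

norm = flux_integral()
print(f"integral of K0(z) z dz over (0, 20): {norm:.4f}")

# Field profile in units of C, versus x = rho / delta_L:
for x in (0.01, 0.1, 1.0, 5.0):
    print(f"x = {x:5.2f}:  K0(x) = {k0(x):.4e},  ln(1/x) = {math.log(1.0 / x):+.3f}")
```

The printed table shows the slow (logarithmic) growth of the field at small $\rho/\delta_L$ and its fast decay at large $\rho/\delta_L$, in line with the discussion above.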
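The current density distribution may be now calculated from the Maxwell equation $\nabla\times\mathbf{B} = \mu_0\mathbf{j}$,
giving $\mathbf{j} = j(\rho)\mathbf{n}_\varphi$, with38
$$j(\rho) = -\frac{1}{\mu_0}\frac{\partial B}{\partial\rho} = -\frac{\Phi_0}{2\pi\mu_0\delta_L^2}\frac{\partial}{\partial\rho}K_0\!\left(\frac{\rho}{\delta_L}\right) \equiv \frac{\Phi_0}{2\pi\mu_0\delta_L^3}K_1\!\left(\frac{\rho}{\delta_L}\right), \quad \text{at } \rho \gg \xi, \tag{6.70}$$
where the same identity (2.158), with $J_n \to K_n$ and $n = 1$, was used. Now looking at Eqs. (2.157) and
(2.158), with $n = 1$, we see that the supercurrent’s density is exponentially low at $\rho \gg \delta_L$ (thus outlining
the vortex’ periphery), and is proportional to $1/\rho$ within the broad range $\xi \ll \rho \ll \delta_L$. This rise of the
current at $\rho \to 0$ (which could be readily predicted directly from Eq. (54) with $\nabla\varphi = \pm\mathbf{n}_\varphi/\rho$, and the A-term
negligible at $\rho \ll \delta_L$) is quenched at $\rho \sim \xi$ by a rapid drop of the factor $|\psi|^2$ in the same Eq. (54),
i.e. by the suppression of the superconductivity near the axis (by the same supercurrent!) – see Fig. 5
again.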
This structure of the Abrikosov vortex may be used to calculate, in a straightforward way, its
energy per unit length (i.e. its linear tension)
37 This fact follows, for example, from the integration of both sides of Eq. (2.143) (which is valid for any Bessel
functions, including $K_n$) with $n = 1$, from 0 to $\infty$, and then using the asymptotic values given by Eqs. (2.157)-
(2.158): $K_1(\infty) = 0$, and $K_1(\zeta) \to 1/\zeta$ at $\zeta \to 0$.
38 See, e.g., MA Eq. (10.5), with $f_\rho = f_\varphi = 0$, and $f_z = B(\rho)$.
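$$T \equiv \frac{U}{l} \approx \frac{\Phi_0^2}{4\pi\mu_0\delta_L^2}\ln\frac{\delta_L}{\xi}, \tag{6.71}$$
and hence the so-called “first critical” value $H_{c1}$ of the external magnetic field,39 at which the vortex
formation becomes possible (in a long cylindrical sample parallel to the field):
$$H_{c1} = \frac{T}{\Phi_0} \approx \frac{\Phi_0}{4\pi\mu_0\delta_L^2}\ln\frac{\delta_L}{\xi}. \tag{6.72}$$
Let me leave the proof of these two formulas for the reader’s exercise.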
The flux quantization and the Abrikosov vortices discussed above are just two of several
macroscopic quantum effects in superconductivity. Let me discuss just one more, but perhaps the most
interesting of such effects. Let us consider a superconducting ring/loop interrupted with a very narrow
slit (Fig. 4b). Integrating Eq. (54) along any current-free path from point 1 to point 2 (see, e.g., dashed
line in Fig. 4b), we get
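$$0 = \int_1^2\left(\nabla\varphi - \frac{q}{\hbar}\mathbf{A}\right)\cdot d\mathbf{r} = \varphi_2 - \varphi_1 - \frac{q}{\hbar}\Phi. \tag{6.73}$$
Using the flux quantum definition (62), this result may be rewritten as
$$\varphi \equiv \varphi_1 - \varphi_2 = 2\pi\frac{\Phi}{\Phi_0}, \qquad\text{(Josephson phase difference)} \tag{6.74}$$
where $\varphi$ is called the Josephson phase difference. Note that in contrast to each of the phases $\varphi_{1,2}$, their
difference $\varphi$ is gauge-invariant: Eq. (74) directly relates it to the gauge-invariant magnetic flux $\Phi$.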
Can this be measured? Yes, for example, using the Josephson effect.40 Let us consider two (for
simplicity of the argument, similar) superconductors, connected with some sort of weak link, for example,
a small tunnel junction, or a point contact, or a narrow thin-film bridge, through which a weak Cooper-pair
supercurrent can flow. (Such a system of two weakly coupled superconductors is called a Josephson
junction.) Let us think about what this supercurrent I may be a function of. For that, reverse thinking is
helpful: let us imagine that we change the current; what parameter of the superconducting condensate
can it affect? If the current is very weak, it cannot perturb the superconducting condensate’s density,
proportional to $|\psi|^2$; hence it may only change the Cooper condensate phases $\varphi_{1,2}$. However, according
to Eq. (53), the phases are not gauge-invariant, while the current should be. Hence the current may
affect (or, if you like, may be affected by) only the phase difference $\varphi$ defined by Eq. (74). Moreover,
just as has already been argued during the flux quantization discussion, a change of any of $\varphi_{1,2}$ (and hence
of $\varphi$) by $2\pi$ or any of its multiples should not change the current. Also, if the wavefunction is the same
in both superconductors ($\varphi = 0$), the supercurrent should vanish due to the system’s symmetry. Hence
the function $I(\varphi)$ should satisfy the following conditions:
$$I(\varphi + 2\pi) = I(\varphi), \qquad I(0) = 0. \tag{6.75}$$
The simplest function satisfying these conditions is the sinusoidal one,41
$$I(\varphi) = I_c\sin\varphi, \tag{6.76}$$
where the constant $I_c$ is the junction’s critical current. If such a junction closes a superconducting loop (Fig. 4c),
Eq. (74) allows expressing its supercurrent via the magnetic flux $\Phi$ through the loop:
$$I = I_c\sin\left(2\pi\frac{\Phi}{\Phi_0}\right). \tag{6.77}$$
39 This term is used to distinguish $H_{c1}$ from the higher “second critical field” $H_{c2}$, at which the Abrikosov vortices
are pressed to each other so tightly (to distances $d \sim \xi$) that they merge, and the remains of superconductivity
vanish: $\psi \to 0$. Unfortunately, I do not have time/space to discuss these effects; the interested reader may be
referred, for example, to Chapter 5 of M. Tinkham’s monograph cited above.
40 It was predicted in 1962 by Brian David Josephson (then a PhD student!) and observed experimentally by
several groups soon after that.
This effect of a periodic dependence of the current on the magnetic flux is called macroscopic quantum
interference,42 while the system shown in Fig. 4c, the superconducting quantum interference device –
SQUID (with all letters capitalized, please :-). The low value of the magnetic flux quantum $\Phi_0$, and
hence the high sensitivity of $\varphi$ to external magnetic fields, allows using such SQUIDs as ultrasensitive
magnetometers. Indeed, for a superconducting ring of area ~1 cm$^2$, one period of the change of the
supercurrent (77) is produced by a magnetic field change of the order of $10^{-11}$ T ($10^{-7}$ Gs), while
sensitive electronics allows measuring a tiny fraction of this period – limited by thermal noise at a level
of the order of a few fT. Such sensitivity allows measurements, for example, of the minuscule magnetic
fields induced outside of the body by the beating human heart, and even by brain activity.43
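The field-period estimate above is easy to reproduce. The short Python sketch below assumes that the flux quantum defined by Eq. (62) is $\Phi_0 = 2\pi\hbar/|q| = h/2e$ for Cooper pairs with $|q| = 2e$ (that equation lies outside this fragment of the text), and uses the 1 cm$^2$ loop area mentioned above; the variable names are arbitrary.

```python
h = 6.62607015e-34      # Planck constant, J*s (exact SI value)
e = 1.602176634e-19     # elementary charge, C (exact SI value)

phi_0 = h / (2 * e)     # flux quantum for |q| = 2e, in Wb (assumed form of Eq. (62))
area = 1e-4             # loop area of 1 cm^2, in m^2

# Field change producing a flux change of one Phi_0 through the loop,
# i.e. one period of the supercurrent oscillation.
field_period = phi_0 / area

print(f"Phi_0        = {phi_0:.4e} Wb")
print(f"field period = {field_period:.2e} T = {field_period * 1e4:.2e} Gs")
```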
An important aspect of quantum interference is the so-called Aharonov-Bohm (AB) effect –
which actually takes place for single quantum particles as well.44 Let the magnetic field lines be limited
to the central, hollow part of the SQUID loop so that no appreciable magnetic field ever touches the ring
itself. (This may be done experimentally with very good accuracy, for example using high-$\mu$ magnetic
cores – see their discussion in Sec. 5.6.) As predicted by Eq. (77), and confirmed by several careful
experiments carried out in the mid-1960s,45 this restriction does not matter – the interference is observed
41 For some other types of weak links, the function $I(\varphi)$ may deviate from the sinusoidal form Eq. (76) rather
considerably, while still satisfying the general conditions (75).
42 The name is due to a deep analogy between this phenomenon and the interference between two coherent waves,
to be discussed in detail in Sec. 8.4.
43 Other practical uses of SQUIDs include MRI signal detectors, high-sensitivity measurements of magnetic
properties of materials, and weak field detection in a broad variety of physical experiments – see, e.g., J. Clarke
and A. Braginski (eds.), The SQUID Handbook, vol. II, Wiley, 2006. For a comparison of these devices with
other sensitive magnetometers see, e.g., the review collection by A. Grosz et al. (eds.), High Sensitivity
Magnetometers, Springer, 2017.
44 For a more detailed discussion of the AB effect see, e.g., QM Sec. 3.2.
45 Similar experiments have been carried out with single (unpaired) electrons – moving either ballistically, in
vacuum, or in “normal” (non-superconducting) conducting rings. In the last case, the effect is much harder to
observe than in SQUIDs: the ring size has to be very small, and temperature very low, to avoid the so-called
anyway. This means that not only the magnetic field B but also the vector potential A represents
physical reality, albeit in a quite peculiar way – remember the gauge transformation (5.46), which you
may carry out in your head, without changing any physical reality? (Fortunately, this transformation
does not change the contour integral participating in Eq. (5.65), and hence the magnetic flux $\Phi$, and
hence the interference pattern.)
Actually, the magnetic flux quantization (62) and the macroscopic quantum interference (77) are
not completely different effects, but just two manifestations of the interrelated macroscopic quantum
phenomena. To show that, one should note that if the critical current $I_c$ (or rather its product by the
loop’s self-inductance L) is high enough, the flux $\Phi$ in the SQUID loop is due not only to the external
magnetic field flux $\Phi_\text{ext}$ but also has a self-field component – cf. Eq. (5.68):46
$$\Phi = \Phi_\text{ext} - LI. \tag{6.78}$$
Now the relation between $\Phi$ and $\Phi_\text{ext}$ may be readily found by solving this equation together with Eq.
(77). Figure 6 shows this relation for several values of the dimensionless parameter $2\pi LI_c/\Phi_0$.
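[Fig. 6.6: the flux $\Phi$ in the SQUID loop as a function of $\Phi_\text{ext}$, for several values of $2\pi LI_c/\Phi_0$, including 0.3, 3, and 10.]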
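The appearance of several flux states may be illustrated numerically. The sketch below is not the author's calculation: it assumes the self-field relation $\Phi = \Phi_\text{ext} - LI$ (cf. footnote 46) together with the sinusoidal current-flux relation $I = I_c\sin(2\pi\Phi/\Phi_0)$, so that in the normalized variables $x \equiv 2\pi\Phi/\Phi_0$, $x_\text{ext} \equiv 2\pi\Phi_\text{ext}/\Phi_0$, and $\beta \equiv 2\pi LI_c/\Phi_0$ (a notation introduced here only for the code) the flux obeys $x + \beta\sin x = x_\text{ext}$. Counting a solution as stable if $d\Phi_\text{ext}/d\Phi > 0$ is a common criterion that is assumed here, not derived in the text.

```python
import math

def count_flux_states(x_ext, beta, x_min=-30.0, x_max=30.0, n=60001):
    """Count solutions x of x + beta*sin(x) = x_ext on a uniform grid.

    Returns (total, stable), where a solution is counted as stable if the
    slope d(x_ext)/dx = 1 + beta*cos(x) is positive there (assumed criterion).
    """
    def f(x):
        return x + beta * math.sin(x) - x_ext

    step = (x_max - x_min) / (n - 1)
    total = stable = 0
    for i in range(n - 1):
        a = x_min + i * step
        b = a + step
        if f(a) * f(b) < 0:                      # sign change: a root in (a, b)
            total += 1
            if 1.0 + beta * math.cos(0.5 * (a + b)) > 0:
                stable += 1
    return total, stable

for beta in (0.3, 3.0, 10.0):
    total, stable = count_flux_states(3.0, beta)
    print(f"beta = {beta:4.1f}: {total} solution(s), {stable} of them on positive-slope branches")
```

For $\beta < 1$ the function $x + \beta\sin x$ is monotonic, so the flux is a single-valued function of the external flux; for $\beta$ modestly above 1 the same external flux may correspond to two positive-slope states separated by one negative-slope state.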
dephasing effects due to unavoidable interactions of the electrons with their environment – see, e.g., QM Chapter
7.
46 The sign before LI would be positive, as in Eq. (5.70), if I was the current flowing into the inductance.
However, in order to keep the sign in Eq. (76) intact, I should mean the current flowing into the Josephson
junction, i.e. from the inductance, thus changing the sign of the LI term in Eq. (78).
computing. Indeed, Fig. 6 shows that at values of the parameter $2\pi LI_c/\Phi_0$ modestly above 1 (e.g., about 3), and within a
certain range of applied field, the SQUID has two stable flux states, which differ by $\Delta\Phi \approx \Phi_0$ and may
be used for coding binary 0 and 1. For practical superconductors (like Nb), the time of switching
between these states (see dashed arrows in Fig. 4) is of the order of a picosecond, while the energy
dissipated at such event may be as low as $\sim 10^{-19}$ J. (This bound is determined not by the device’s physics,
but by the fundamental requirement for the energy barrier between the two states to be much higher than the
thermal fluctuation energy scale $k_BT$, ensuring a sufficiently long information retention time.) While the
picosecond switching speed may be also achieved with some semiconductor devices, the power
consumption of the SQUID-based digital devices may be 5 to 6 orders of magnitude lower, enabling
large-scale digital integrated circuits with 100-GHz-scale clock frequencies. Unfortunately, the range of
practical applications of these Rapid Single-Flux-Quantum (RSFQ) digital circuits is still very narrow,
due to the inconvenience of their deep refrigeration to temperatures below Tc.47
Since we have already got the basic relations (74) and (76) describing the macroscopic quantum
phenomena in superconductivity, let me mention in brief two other prominent members of this group,
called the dc and ac Josephson effects. Differentiating Eq. (74) over time, and using the Faraday
induction law (2), we get48
$$\frac{d\varphi}{dt} = \frac{2e}{\hbar}V. \qquad\text{(Josephson phase-to-voltage relation)} \tag{6.79}$$
This famous Josephson phase-to-voltage relation should be valid regardless of how the voltage
V has been created,49 so let us apply Eqs. (76) and (79) to the simplest circuit with a
non-superconducting source of dc voltage – see Fig. 7.
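[Fig. 6.7: a Josephson junction with electrodes 1 and 2, carrying the current $I(t)$, connected to a dc voltage source.]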
47 For more on that technology, see, e.g., the review paper by P. Bunyk et al., Int. J. High Speed Electron. Syst.
11, 257 (2001), and references therein.
48 Since the induced e.m.f. $V_\text{ind}$ cannot drop on the superconducting path between the Josephson junction
electrodes 1 and 2 (see Fig. 4c), it should be equal to $(-V)$, where V is the voltage across the junction.
49 Indeed, it may be also obtained from simple Schrödinger-equation-based arguments – see, e.g., QM Sec. 1.6.
assume that this is the only voltage component: $V(t) = V_0 = \text{const}$;50 then Eq. (79) may be easily
integrated to give $\varphi = \omega_J t + \varphi_0$, where
$$\omega_J \equiv \frac{2e}{\hbar}V_0. \qquad\text{(Josephson oscillation frequency)} \tag{6.81}$$
This result, plugged into Eq. (76), shows that the supercurrent oscillates,
$$I = I_c\sin(\omega_J t + \varphi_0), \tag{6.82}$$
with the so-called Josephson frequency $\omega_J$ (81) proportional to the applied dc voltage. For practicable
voltages (above the typical noise level), the frequency $f_J = \omega_J/2\pi$ corresponds to the GHz or even THz
ranges, because the proportionality coefficient in Eq. (81) is very high: $f_J/V_0 = e/\pi\hbar \approx 483$ MHz/μV.51
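The coefficient $f_J/V_0 = 2e/h \equiv e/\pi\hbar$ may be evaluated directly from the (exact) SI values of the fundamental constants. The Python lines below are just such an arithmetic check; the function name and the sample voltages are arbitrary choices.

```python
h = 6.62607015e-34      # Planck constant, J*s (exact SI value)
e = 1.602176634e-19     # elementary charge, C (exact SI value)

ratio_hz_per_volt = 2 * e / h                    # f_J / V_0, in Hz/V
ratio_mhz_per_uv = ratio_hz_per_volt * 1e-12     # Hz/V -> MHz/uV (1e-6 for MHz, 1e-6 for uV)

def josephson_frequency(v0):
    """Josephson oscillation frequency f_J (Hz) for a dc voltage v0 (V), per Eq. (81)."""
    return ratio_hz_per_volt * v0

print(f"f_J / V_0 = {ratio_mhz_per_uv:.2f} MHz/uV")
for v0 in (1e-6, 1e-4, 1e-3):
    print(f"V_0 = {v0:.0e} V  ->  f_J = {josephson_frequency(v0) / 1e9:.3f} GHz")
```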
An important experimental fact is the universality of this coefficient. For example, in the mid-
1980s, a Stony Brook group led by J. Lukens proved that this factor is material-independent with a
relative accuracy of at least $10^{-15}$. Very few experiments, especially in solid-state physics, have ever
reached such precision. This fundamental nature of the Josephson voltage-to-frequency relation (81)
allows an important application of the ac Josephson effect in metrology. Namely, phase-locking52 the
Josephson oscillations with an external microwave signal from an atomic frequency standard, one can
get a more precise dc voltage than from any other source. In NIST and other metrological institutions
around the globe, this effect is used for the calibration of simpler “secondary” voltage standards that can
operate at room temperature.
$$V = \int\mathbf{E}\cdot d\mathbf{r} \tag{6.83}$$
between the coil terminals along any path outside the coil. This voltage has to be balanced by the
induction e.m.f. (2) in the coil, so if the Ohmic resistance of the coil is negligible, we may write
50 In experiment, this condition is hard to implement, due to the relatively high inductances of the current leads
providing the dc voltage supply. However, this technical complication does not affect the main conclusion of the
simple analysis described here.
51 This 1962 prediction (by the same B. Josephson) was confirmed experimentally – in 1963 indirectly, by phase-
locking of the oscillations (82) with an external microwave signal, and in 1967 explicitly, by the direct detection
of the emitted microwave radiation.
52 For a discussion of this very important (and general) effect, see, e.g., CM Sec. 5.4.
$$V = \frac{d\Phi}{dt}, \tag{6.84}$$
where $\Phi$ is the magnetic flux in the coil.53 If the flux is due to the current I in the same coil only (i.e. if
it is magnetically uncoupled from other coils), we may use Eq. (5.70) to get the well-known relation
$$V = L\frac{dI}{dt}, \qquad\text{(voltage drop on inductance coil)} \tag{6.85}$$
where compliance with the Lenz sign rule is achieved by selecting the relations between the assumed
voltage polarity and the current direction as shown in Fig. 8a.
If similar conditions are satisfied for two magnetically coupled coils (Fig. 8b), then, in Eq. (84),
we need to use Eqs. (5.69) instead, getting
$$V_1 = L_1\frac{dI_1}{dt} + M\frac{dI_2}{dt}, \qquad V_2 = L_2\frac{dI_2}{dt} + M\frac{dI_1}{dt}. \tag{6.86}$$
Such systems of inductively coupled coils have numerous applications in electrical engineering and
physical experiment. Perhaps the most important of them is the ac transformer, in which the coils share
a common soft-ferromagnetic core of the toroidal (“doughnut”) topology – see Fig. 8c.54 As we already
know from the discussion in Sec. 5.6, such cores, with $\mu \gg \mu_0$, “try” to absorb all magnetic field lines,
so the magnetic flux $\Phi(t)$ in the core is nearly the same in each of its cross-sections. With this, Eq. (84)
yields
$$V_1 = N_1\frac{d\Phi}{dt}, \qquad V_2 = N_2\frac{d\Phi}{dt}, \tag{6.87}$$
so the voltage ratio is completely determined by the ratio $N_1/N_2$ of the number of wire turns.
Now we may generalize, to the ac current case, the Kirchhoff laws already discussed in Sec. 4.1
– see Fig. 4.3 reproduced in Fig. 9a below. Let not only inductances but also capacitances and
resistances of the wires be negligible in comparison with those of the lumped (compact) circuit
elements, whose list now would include not only resistors and current sources (as in the dc case), but
also the induction coils (including magnetically coupled ones) and capacitors – see Fig. 9b. In the
quasistatic approximation, the current flowing in each wire is conserved, so the “node rule”, i.e. the 1st
Kirchhoff law (4.7a),
53 If the resistance is substantial, it may be represented by a separate lumped circuit element (resistor) connected
in series with the coil.
54 The first practically acceptable form of this device, called the Stanley transformer, was invented in 1886. In it,
multi-turn windings could be easily mounted onto a toroidal ferromagnetic (at that time, silicon-steel-plate) core.
$$\sum_j I_j = 0, \tag{6.88a}$$
remains valid. Also, if the electromagnetic induction effect is restricted to the interior of lumped
induction coils as discussed above, the voltage drops Vk across each circuit element may be still
represented, just as in dc circuits, with differences between the adjacent node potentials. As a result, the
“loop rule”, i.e. 2nd Kirchhoff law (4.7b),
$$\sum_k V_k = 0, \tag{6.88b}$$
is also valid. Now, in contrast to the dc case, Eqs. (88) may be the (ordinary) differential equations.
However, if all circuit elements are linear (as in the examples presented in Fig. 9b), these equations may
be readily reduced to linear algebraic equations, using the Fourier expansion. (In the common case of
sinusoidal ac sources, the final stage of the Fourier series summation is unnecessary.)
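As a minimal illustration of this reduction (a sketch under assumptions that are not taken from the text), consider a single loop with a resistor R, an induction coil L, and a capacitor C connected in series to a sinusoidal voltage source. With the element relations $V = RI$, $V = L\,dI/dt$, and $V = (1/C)\int I\,dt$, the loop rule (88b) becomes a single linear algebraic equation for the complex amplitudes. The Python fragment below assumes the engineering convention $\exp(+j\omega t)$ for the time dependence (with the opposite sign convention, the reactive terms change sign); the component values are arbitrary.

```python
import math

def series_lrc_current(v_amp, omega, R, L, C):
    """Complex current amplitude in a series LRC loop, exp(+j*omega*t) convention."""
    impedance = R + 1j * omega * L + 1.0 / (1j * omega * C)
    return v_amp / impedance

R, L, C = 10.0, 1e-3, 1e-9          # ohm, henry, farad (arbitrary illustrative values)
omega_0 = 1.0 / math.sqrt(L * C)    # resonance frequency of the loop, rad/s

for omega in (0.5 * omega_0, omega_0, 2.0 * omega_0):
    i_amp = series_lrc_current(1.0, omega, R, L, C)
    phase_deg = math.degrees(math.atan2(i_amp.imag, i_amp.real))
    print(f"omega/omega_0 = {omega / omega_0:.1f}: |I| = {abs(i_amp):.3e} A, "
          f"phase = {phase_deg:+.1f} deg")
```

At $\omega = \omega_0$ the two reactive terms cancel, and the current amplitude is limited only by the resistance.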
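Fig. 6.9. (a) A typical quasistatic ac circuit obeying the Kirchhoff laws (with its “circuit elements”, “wires”, “nodes”, and “loops”), and (b) the simplest lumped circuit elements: an induction coil ($V = L\,dI/dt$), a resistor ($V = RI$), a capacitor ($V = (1/C)\int I\,dt$), and an ac voltage source ($V = V(t)$).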
My teaching experience shows that the potential readers of these notes are well familiar with the
application of Eqs. (88) to such problems from their undergraduate studies, so I will save time/space by
skipping discussions of even the simplest examples of such circuits, such as LC, LR, RC, and LRC loops
and periodic structures.55 However, since such problems are very important for practice, my sincere
advice to the reader is to carry out a self-test by solving a few problems of this type, provided in Sec. 9
below, and if they cause any difficulty, pursue some remedial reading.
But, as the divergence of any curl,56 the left-hand side should equal zero. Hence we get
55 Curiously enough, these effects include wave propagation in periodic LC circuits, even within the quasistatic
approximation! However, the speed $1/(LC)^{1/2}$ of these waves in lumped circuits is much lower than the speed
$1/(\varepsilon\mu)^{1/2}$ of electromagnetic waves in the surrounding medium – see Sec. 8 below.
56 Again, see MA Eq. (11.2) – if you need it.
$$\nabla\cdot\mathbf{j} = 0. \tag{6.90}$$
This is fine in statics, but in dynamics, this equation forbids any charge accumulation, because
according to the continuity relation (4.5),
$$\nabla\cdot\mathbf{j} = -\frac{\partial\rho}{\partial t}. \tag{6.91}$$
This discrepancy had been recognized by James Clerk Maxwell who suggested, in the 1860s, a
way out of this contradiction. If we generalize the equation for $\nabla\times\mathbf{H}$ by adding to the term $\mathbf{j}$ (that
describes the density of real electric currents) the so-called displacement current density term,
$$\mathbf{j}_d \equiv \frac{\partial\mathbf{D}}{\partial t}, \qquad\text{(displacement current density)} \tag{6.92}$$
(which of course vanishes in statics), then the equation takes the form
$$\nabla\times\mathbf{H} = \mathbf{j} + \mathbf{j}_d \equiv \mathbf{j} + \frac{\partial\mathbf{D}}{\partial t}. \tag{6.93}$$
In this case, due to the equation (3.22), $\nabla\cdot\mathbf{D} = \rho$, the divergence of the right-hand side equals zero due
to the continuity equation (91), and the discrepancy is removed. This incredible theoretical feat,57
confirmed by the 1886 experiments carried out by Heinrich Hertz (see below) was perhaps the main
triumph of theoretical physics of the 19th century.
Maxwell’s displacement current concept, expressed by Eq. (93), is so important that it is
worthwhile to have one more look at its derivation using a particular model shown in Fig. 10.58
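[Fig. 6.10: a capacitor with plate charges $+Q$ and $-Q$, recharged by the current I flowing through the wires, with a closed contour C drawn around one of the wires.]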
Neglecting the fringe field effects, we may use Eq. (4.1) to describe the relationship between the
current I flowing through the wires and the electric charge Q of the capacitor:59
$$\frac{dQ}{dt} = I. \tag{6.94}$$
57 It looks deceptively simple now – after the fact, and with the current mathematical tools (especially the del
operator), which are much superior to those that were available to J. Maxwell.
58 No physicist should be ashamed of doing this. For example, J. Maxwell’s main book, A Treatise on Electricity
and Magnetism, is full of drawings of plane capacitors, inductance coils, and voltmeters. More generally, the
whole history of science teaches us that snobbery regarding particular examples and practical systems is a
virtually certain path toward producing nothing of either practical value or fundamental importance.
59 This is of course just the integral form of the continuity equation (91).
Now let us consider a closed contour C drawn around the wire. (Solid points in Fig. 10 show the places
where the contour intercepts the plane of the drawing.) This contour may be seen as the line limiting
either surface S1 (crossed by the wire) or surface S2 (avoiding such crossing by passing through the
capacitor’s gap). Applying the macroscopic Ampère law (5.116) to the former surface, we get
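$$\oint_C\mathbf{H}\cdot d\mathbf{r} = \int_{S_1} j_n\,d^2r = I, \tag{6.95}$$
while for the latter surface the same law gives a different result,
$$\oint_C\mathbf{H}\cdot d\mathbf{r} = \int_{S_2} j_n\,d^2r = 0, \qquad\text{[WRONG!]} \tag{6.96}$$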
for the same integral. This is just an integral-form manifestation of the discrepancy outlined above, but it
shows clearly how serious the problem is (or rather it was – before Maxwell).
Now let us see how the introduction of the displacement currents saves the day, considering for
the sake of simplicity a plane capacitor of area A, with a small and constant electrode spacing. In this
case, as we already know, the field inside it is uniform, with $D = \sigma$, so the total capacitor’s charge $Q =
A\sigma = AD$, and the current (94) may be represented as
$$I = \frac{dQ}{dt} = A\frac{dD}{dt}. \tag{6.97}$$
So, instead of the wrong Eq. (96), the Ampère law modified following Eq. (93), gives
$$\oint_C\mathbf{H}\cdot d\mathbf{r} = \int_{S_2}(\mathbf{j}_d)_n\,d^2r = \int_{S_2}\frac{\partial D_n}{\partial t}\,d^2r = A\frac{dD}{dt} = I, \tag{6.98}$$
i.e. the Ampère integral becomes independent of the choice of the surface limited by the contour C – as
it has to be.
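This equality is easy to verify numerically for any particular charging law. The sketch below assumes, purely for illustration, an exponential charging $Q(t) = Q_0[1 - \exp(-t/\tau)]$ (as in an RC circuit, which is not discussed in this part of the text), and compares the wire current $I = dQ/dt$ with the total displacement current $A\,dD/dt$ through the capacitor gap, where $D = Q/A$; all parameter values are arbitrary.

```python
import math

A = 1e-2      # plate area, m^2 (arbitrary illustrative value)
Q0 = 1e-9     # final capacitor charge, C (arbitrary)
tau = 1e-6    # charging time constant, s (arbitrary)

def charge(t):
    """Assumed charging law Q(t) = Q0 * (1 - exp(-t/tau))."""
    return Q0 * (1.0 - math.exp(-t / tau))

def wire_current(t):
    """Current in the wires, I = dQ/dt, differentiated analytically."""
    return (Q0 / tau) * math.exp(-t / tau)

def displacement_current(t, dt=1e-10):
    """Total displacement current A*dD/dt in the gap, with D = Q/A,
    evaluated by a central finite difference."""
    d_plus = charge(t + dt) / A
    d_minus = charge(t - dt) / A
    return A * (d_plus - d_minus) / (2.0 * dt)

for t in (0.1 * tau, 0.5 * tau, 2.0 * tau):
    print(f"t/tau = {t / tau:.1f}: I = {wire_current(t):.6e} A, "
          f"A dD/dt = {displacement_current(t):.6e} A")
```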
60 This vector form of the Maxwell equations, magnificent in its symmetry and simplicity, was developed in
1884-85 by Oliver Heaviside, with substantial contributions by H. Lorentz. (The original Maxwell’s result circa
1864 looked like a system of 20 equations for Cartesian components of the vector and scalar potentials.)
equations are believed to be strictly valid as relations between the Heisenberg operators of the electric
and magnetic fields.61 (Note that the microscopic Maxwell equations for the genuine fields E and B may
be formally obtained from Eqs. (99) by the substitutions $\mathbf{D} = \varepsilon_0\mathbf{E}$ and $\mathbf{H} = \mathbf{B}/\mu_0$, and the simultaneous
replacement of the stand-alone charge and current densities on their right-hand sides with the full ones.)
Perhaps the most striking feature of these equations is that, even in the absence of stand-alone
charges and currents inside the region of our interest, when the equations become fully homogeneous,
$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \qquad \nabla\times\mathbf{H} = \frac{\partial\mathbf{D}}{\partial t}, \tag{6.100a}$$
$$\nabla\cdot\mathbf{D} = 0, \qquad \nabla\cdot\mathbf{B} = 0, \tag{6.100b}$$
they still describe something very non-trivial: electromagnetic waves, including light. The physics of the
waves may be clearly seen from Eqs. (100a): according to the first of them, the change of the magnetic
field in time creates a vortex-like (divergence-free) electric field. On the other hand, the second of Eqs.
(100a) describes how the changing electric field, in turn, creates a vortex-like magnetic field. The
so-coupled electric and magnetic fields may propagate as waves – even very far from their sources.
We will carry out a detailed quantitative analysis of the waves in the next chapter, and here I will
only use this notion to make good on the promise given in Sec. 3, namely to establish the condition of
validity of the quasistatic approximation (21). For simplicity, let us consider an electromagnetic wave
with a time period T, velocity v, and hence the wavelength62 $\lambda = vT$ in a linear medium with $\mathbf{D} = \varepsilon\mathbf{E}$, $\mathbf{B} =
\mu\mathbf{H}$. Then the magnitude of the left-hand side of the first of Eqs. (100a) is of the order of $E/\lambda = E/vT$,
while that of its right-hand side may be estimated as $B/T \sim \mu H/T$. Using similar estimates for the second
of Eqs. (100a), we arrive at the following two requirements:63
$$\frac{E}{H} \sim \mu v \sim \frac{1}{\varepsilon v}. \tag{6.101}$$
To ensure the compatibility of these two relations, the wave’s speed should satisfy the estimate
$$v \sim \frac{1}{(\varepsilon\mu)^{1/2}}, \tag{6.102}$$
reduced to $v \sim 1/(\varepsilon_0\mu_0)^{1/2} \equiv c$ in free space, while the ratio of the electric and magnetic field amplitudes
should be of the following order:
$$\frac{E}{H} \sim \mu v \sim \mu\,\frac{1}{(\varepsilon\mu)^{1/2}} \equiv \left(\frac{\mu}{\varepsilon}\right)^{1/2}. \tag{6.103}$$
(In the next chapter we will see that for plane electromagnetic waves, these results are exact.)
Now, let a system of a linear size ~a carry currents producing a certain magnetic field H. Then,
according to Eqs. (100a), their magnetic field Faraday-induces the electric field of magnitude $E \sim
\mu Ha/T$, whose displacement currents, in turn, produce an additional magnetic field with magnitude
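$$H' \sim \frac{\varepsilon a}{T}E \sim \frac{\varepsilon a}{T}\,\frac{\mu a}{T}H = \left(\frac{a}{vT}\right)^2 H \equiv \left(\frac{a}{\lambda}\right)^2 H. \tag{6.104}$$
Hence, the displacement current effects are negligible for a system of size $a \ll \lambda$.64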
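To get a feeling of the numbers, the following Python lines evaluate the order-of-magnitude ratio $H'/H \sim (a/\lambda)^2$ for a system in free space. The system size and the frequencies are arbitrary illustrative choices, not values from the text, and the result is only an estimate in the sense of Eq. (104).

```python
c = 299_792_458.0   # speed of light in free space, m/s (exact SI value)

def displacement_field_ratio(size_m, frequency_hz, v=c):
    """Order-of-magnitude estimate H'/H ~ (a/lambda)^2, with lambda = v/f."""
    wavelength = v / frequency_hz
    return (size_m / wavelength) ** 2

a = 0.1   # linear size of the system, m (arbitrary)
for f in (1e3, 1e6, 1e9):
    print(f"f = {f:.0e} Hz: lambda = {c / f:.3e} m, "
          f"H'/H ~ {displacement_field_ratio(a, f):.2e}")
```

For such a 10-cm-scale system, the correction is negligible at MHz frequencies, but ceases to be small at GHz frequencies, where the size becomes comparable with the wavelength.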
In particular, the quasistatic picture of the skin effect, discussed in Sec. 3, is valid while the skin
depth (33) remains much smaller than the corresponding wavelength,
$$\delta_s \ll \lambda \equiv vT = \frac{2\pi v}{\omega} = \left(\frac{4\pi^2}{\varepsilon\mu\omega^2}\right)^{1/2}. \tag{6.105}$$
The wavelength decreases with the frequency as $1/\omega$, i.e. faster than $\delta_s \propto 1/\omega^{1/2}$, so they become
comparable at the crossover frequency
$$\omega_r \equiv \frac{\sigma}{\varepsilon_0}, \tag{6.106}$$
which is nothing else than the reciprocal charge relaxation time (4.10). As was discussed in Sec. 4.2, for
good metals this frequency is extremely high (about $10^{18}$ s$^{-1}$), so the validity of Eq. (33) is typically
limited by the anomalous skin effect (which was briefly discussed in Sec. 3), rather than the wave
effects.
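These estimates may be checked with a few lines of Python. The sketch below assumes the standard quasistatic skin-depth formula $\delta_s = (2/\mu\sigma\omega)^{1/2}$ for Eq. (33), which lies outside this fragment of the text, takes $\mu = \mu_0$ and $\varepsilon = \varepsilon_0$, and uses a conductivity $\sigma \approx 6\times10^7$ S/m (roughly that of copper at room temperature) as an assumed input.

```python
import math

mu_0 = 4e-7 * math.pi        # magnetic constant, H/m (to the accuracy needed here)
eps_0 = 8.8541878128e-12     # electric constant, F/m
sigma = 6.0e7                # conductivity, S/m (assumed: roughly copper)

def skin_depth(omega):
    """Quasistatic skin depth (2 / (mu*sigma*omega))^(1/2), assumed form of Eq. (33)."""
    return math.sqrt(2.0 / (mu_0 * sigma * omega))

def wavelength(omega):
    """Free-space wavelength 2*pi*v/omega, with v = 1/(eps_0*mu_0)^(1/2)."""
    return 2.0 * math.pi / (omega * math.sqrt(eps_0 * mu_0))

omega_r = sigma / eps_0      # crossover frequency of Eq. (106), 1/s

print(f"omega_r = {omega_r:.2e} 1/s")
for omega in (2 * math.pi * 1e6, 2 * math.pi * 1e9, omega_r):
    print(f"omega = {omega:.2e} 1/s: delta_s = {skin_depth(omega):.2e} m, "
          f"lambda = {wavelength(omega):.2e} m, "
          f"ratio = {skin_depth(omega) / wavelength(omega):.2e}")
```

At radio and microwave frequencies, the ratio $\delta_s/\lambda$ is very small, while at $\omega = \omega_r$ it becomes of the order of unity (within these formulas, exactly $\sqrt{2}/2\pi$).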
Before going after the analysis of the full Maxwell equations for particular situations (that will
be the main goal of the next chapters of this course), let us have a look at the energy balance they yield
for a certain volume V, which may include both some charged particles and the electromagnetic field.
Since, according to Eq. (5.10), the magnetic field performs no work on charged particles even if they
move, the total power P being transferred from the field to the particles inside the volume is due to the
electric field alone – see Eq. (4.38):
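$$P = \int_V p\,d^3r, \quad \text{with } p = \mathbf{j}\cdot\mathbf{E}. \tag{6.107}$$
Expressing $\mathbf{j}$ from the corresponding Maxwell equation of the system (99), we get
$$P = \int_V\left[\mathbf{E}\cdot(\nabla\times\mathbf{H}) - \mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}\right]d^3r. \tag{6.108}$$
Let us pause here for a second, and transform the divergence of $\mathbf{E}\times\mathbf{H}$, using the well-known vector
algebra identity:65
$$\nabla\cdot(\mathbf{E}\times\mathbf{H}) = \mathbf{H}\cdot(\nabla\times\mathbf{E}) - \mathbf{E}\cdot(\nabla\times\mathbf{H}). \tag{6.109}$$
The last term on the right-hand side of this equality is, up to its sign, the first term in the square brackets of Eq.
(108), so we may rewrite that formula as
$$P = \int_V\left[-\nabla\cdot(\mathbf{E}\times\mathbf{H}) + \mathbf{H}\cdot(\nabla\times\mathbf{E}) - \mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}\right]d^3r. \tag{6.110}$$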
64 Let me emphasize that if this condition is not fulfilled, the lumped-circuit representation of the system (see Fig.
9 and its discussion) is typically inadequate – besides some special cases, to be discussed in the next chapter.
65 See, e.g., MA Eq. (11.7) with f = E and g = H.
However, according to the Maxwell equation for E, this curl is equal to –B/t, so the second term
in the square brackets of Eq. (110) equals –HB/t and, according to Eq. (14), is just the (minus) time
derivative of the magnetic energy per unit volume. Similarly, according to Eq. (3.76), the third term
under the integral is the (minus) time derivative of the electric energy per unit volume. Finally, we can
use the divergence theorem to transform the integral of the first term in the square brackets to a 2D
integral over the surface S limiting the volume V. As a result, we get the so-called Poynting theorem66
for the power balance in the system:
u
p t d r Snd 2r 0 .
Poynting 3
theorem (6.111)
V S
Here u is the density of the total (electric plus magnetic) energy of the electromagnetic field, with
Field’s
energy u E D H B (6.112)
variation
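– just the sum of the expressions given by Eqs. (3.76) and (14). For the particular case of an isotropic,
linear, and dispersion-free medium, with $\mathbf{D}(t) = \varepsilon\mathbf{E}(t)$, $\mathbf{B}(t) = \mu\mathbf{H}(t)$, Eq. (112) yields
$$u = \frac{\mathbf{E}\cdot\mathbf{D}}{2} + \frac{\mathbf{H}\cdot\mathbf{B}}{2} = \frac{\varepsilon E^2}{2} + \frac{B^2}{2\mu}. \tag{6.113}$$  [Field's energy]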
Another key notion participating in Eq. (111) is the Poynting vector, defined as67
$$\mathbf{S} \equiv \mathbf{E}\times\mathbf{H}. \tag{6.114}$$  [Poynting vector]
The first integral in Eq. (111) is evidently the net change of the energy of the system (particles + field)
per unit time, so the second (surface) integral has to be the power flowing out from the system through
the surface. As a result, it is tempting to interpret the Poynting vector S locally, as the power flow
density at the given point. In many cases, such a local interpretation of vector S is legitimate; however,
in other cases, it may lead to wrong conclusions. Indeed, let us consider the simple system shown in Fig.
11: a charged plane capacitor placed into a static and uniform external magnetic field, so that the electric
and magnetic fields are mutually perpendicular.
Fig. 6.11. The Poynting vector paradox.
In this static situation, with no charges moving, both p and $\partial/\partial t$ are equal to zero, and there
should be no power flow in the system. However, Eq. (114) shows that the Poynting vector is not equal
66 It is named after John Henry Poynting for his work published in 1884, though this fact was independently
discovered by O. Heaviside in 1885 in a simpler form, while a similar result for the intensity of mechanical elastic
waves had been obtained earlier (in 1874) by Nikolay Alekseevich Umov – see, e.g., CM Sec. 7.7.
67 Actually, an addition to S of the curl of an arbitrary vector function f(r, t) does not change Eq. (111). Indeed,
we may use the divergence theorem to transform the corresponding change of the surface integral in Eq. (111) to a
volume integral of the scalar function $\nabla\cdot(\nabla\times\mathbf{f})$ that equals zero at any point – see, e.g., MA Eq. (11.2).
to zero inside the capacitor, being directed as the red arrows in Fig. 11 show. From the point of view of
the only unambiguous corollary of the Maxwell equations, Eq. (111), there is no contradiction here,
because the fluxes of the vector S through the side boundaries of the volume shaded in Fig. 11 are equal
and opposite (and they are zero for other faces of this rectilinear volume), so the total flux of the
Poynting vector through the volume boundary equals zero, as it should. It is, however, useful to recall
this example each time before giving a local interpretation of the vector S.
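The cancellation just described is easy to check numerically. The following minimal Python sketch assumes uniform, mutually perpendicular static fields inside a rectangular box; the field values and the box dimensions are illustrative assumptions, not taken from the text. It sums the outward flux of S = E × H over the six faces of the box.

```python
# Net flux of the Poynting vector S = E x H through the faces of a box
# placed in uniform static crossed fields (all numbers are illustrative).

def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def box_fluxes(S, sides):
    """Outward flux of a uniform vector S through each face of an
    axis-aligned box with edge lengths sides = (lx, ly, lz)."""
    lx, ly, lz = sides
    areas = (ly * lz, lx * lz, lx * ly)   # areas of faces normal to x, y, z
    fluxes = {}
    for axis, name in enumerate("xyz"):
        fluxes["+" + name] = +S[axis] * areas[axis]   # outer normal along +axis
        fluxes["-" + name] = -S[axis] * areas[axis]   # outer normal along -axis
    return fluxes

E = (0.0, 0.0, 1.0e3)    # V/m, field between the capacitor plates (assumed)
H = (0.0, 8.0e2, 0.0)    # A/m, external magnetic field (assumed)
S = cross(E, H)          # W/m^2, nonzero even though nothing moves
fluxes = box_fluxes(S, (0.02, 0.03, 0.01))
net_flux = sum(fluxes.values())
```

Here S is not zero, and the two faces normal to it carry equal and opposite fluxes, so `net_flux` vanishes, in agreement with Eq. (111) for p = 0 and a time-independent u.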
The paradox illustrated in Fig. 11 is closely related to the radiation recoil effects, due to the
electromagnetic field’s momentum – more exactly, its linear momentum. Indeed, proceeding as in the
derivation of the Poynting theorem, it is straightforward to use the microscopic Maxwell equations68 to prove
that, neglecting the boundary effects, the vector sum of the mechanical linear momentum of the particles
in an arbitrary volume, and the integral of the following vector,
$$\mathbf{g} \equiv \frac{\mathbf{S}}{c^2}, \tag{6.115}$$  [Electromagnetic field's momentum]
over the same volume, is conserved, enabling an interpretation of g as the density of the linear
momentum of the electromagnetic field. (It will be more convenient for me to prove this relation, and
discuss the related issues, in Sec. 9.8, using the 4-vector formalism of special relativity.) Due to this
conservation, if some static fields coupled to mechanical bodies are suddenly decoupled from them and
are allowed to propagate in space, i.e. to change their local integral of g, they give the bodies an equal
and opposite impulse of force.
Finally, to complete our initial discussion of the Maxwell equations,69 let us rewrite them in terms of
potentials A and $\phi$, because this is more convenient for the solution of some (though not all!) problems.
Even when dealing with the system (99) of the more general Maxwell equations than discussed before,
Eqs. (7) are still used for the definition of the potentials. It is straightforward to verify that with these
definitions, the two homogeneous Maxwell equations (99b) are satisfied automatically. Plugging Eqs.
(7) into the inhomogeneous equations (99a), and considering, for simplicity, a linear, uniform medium
with frequency-independent $\varepsilon$ and $\mu$, we get
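$$\nabla^2\phi + \frac{\partial}{\partial t}(\nabla\cdot\mathbf{A}) = -\frac{\rho}{\varepsilon}, \qquad \nabla^2\mathbf{A} - \varepsilon\mu\frac{\partial^2\mathbf{A}}{\partial t^2} - \nabla\left(\nabla\cdot\mathbf{A} + \varepsilon\mu\frac{\partial\phi}{\partial t}\right) = -\mu\mathbf{j}. \tag{6.116}$$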
This is a more complex result than what we would like to get. However, let us select a special
gauge, which is frequently called (especially for the free space case, when v = c) the Lorenz gauge
condition70
$$\nabla\cdot\mathbf{A} + \varepsilon\mu\frac{\partial\phi}{\partial t} = 0, \tag{6.117}$$  [Lorenz gauge condition]
68 The situation with the macroscopic Maxwell equations is more complex, and is still a subject of some lingering
discussions (usually called the Abraham-Minkowski controversy, despite contributions by many other scientists
including A. Einstein), because of the ambiguity of the momentum’s division between its field and particle
components – see, e.g., the review paper by R. Pfeiffer et al., Rev. Mod. Phys. 79, 1197 (2007).
69 We will return to their general discussion (in particular, to the analytical mechanics of the electromagnetic
field, and its stress tensor) in Sec. 9.8, after we get equipped with the special relativity theory.
70 This condition, named after Ludwig Lorenz, should not be confused with the so-called Lorentz invariance
condition of relativity, due to Hendrik Lorentz, to be discussed in Sec. 9.4. (Note the last names’ spelling.)
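which is a natural generalization of the Coulomb gauge (5.48) to time-dependent phenomena. With this
condition, Eqs. (116) are reduced to a simpler, beautifully symmetric form:
$$\nabla^2\phi - \frac{1}{v^2}\frac{\partial^2\phi}{\partial t^2} = -\frac{\rho}{\varepsilon}, \qquad \nabla^2\mathbf{A} - \frac{1}{v^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\mu\mathbf{j}, \tag{6.118}$$  [Potentials' dynamics]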
where $v^2 \equiv 1/\varepsilon\mu$. Note that these equations are essentially a set of 4 similar equations for 4 scalar
functions (namely, $\phi$ and three Cartesian components of A) and thus clearly invite the 4-component
vector formalism of the relativity theory; it will be discussed in Chapter 9.71
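The reduction to the symmetric form may be traced in two lines. The sketch below assumes that the potentials are defined in the standard way, E = –∇φ – ∂A/∂t and B = ∇×A, so that the inhomogeneous Maxwell equations for a uniform linear medium take the form shown on the left-hand sides; the Lorenz gauge condition is then used once in each equation.

```latex
\begin{aligned}
&\nabla\cdot\mathbf{A} = -\varepsilon\mu\,\frac{\partial\phi}{\partial t}
\;\Longrightarrow\;
\nabla^2\phi + \frac{\partial}{\partial t}\left(\nabla\cdot\mathbf{A}\right)
= \nabla^2\phi - \varepsilon\mu\,\frac{\partial^2\phi}{\partial t^2}
= -\frac{\rho}{\varepsilon}, \\
&\nabla^2\mathbf{A} - \varepsilon\mu\,\frac{\partial^2\mathbf{A}}{\partial t^2}
- \nabla\underbrace{\left(\nabla\cdot\mathbf{A}
+ \varepsilon\mu\,\frac{\partial\phi}{\partial t}\right)}_{=\,0}
= \nabla^2\mathbf{A} - \frac{1}{v^2}\frac{\partial^2\mathbf{A}}{\partial t^2}
= -\mu\,\mathbf{j},
\qquad v^2 \equiv \frac{1}{\varepsilon\mu}.
\end{aligned}
```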
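If $\phi$ and A depend on just one spatial coordinate, say z, then in a region without field sources:
$\rho = 0$, j = 0, Eqs. (118) are reduced to the following 1D wave equations
$$\frac{\partial^2\phi}{\partial z^2} - \frac{1}{v^2}\frac{\partial^2\phi}{\partial t^2} = 0, \qquad \frac{\partial^2\mathbf{A}}{\partial z^2} - \frac{1}{v^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = 0. \tag{6.119}$$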
It is well known72 that these equations describe waves, with arbitrary waveforms (including sinusoidal
waves of any frequency), propagating with the same speed v in either of the z-axis directions.
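This statement can be checked numerically. The sketch below uses an arbitrarily chosen Gaussian waveform and illustrative parameter values (all of them assumptions) and evaluates the left-hand side of the 1D wave equation by central finite differences for a field of the form f(z – ut). The residual is small for u = +v and u = –v, and is not small for any other speed.

```python
import math

def gaussian(s):
    """An arbitrary smooth waveform f(s); any other smooth choice would do."""
    return math.exp(-s * s)

def wave_residual(f, z, t, v, u, h=1e-3):
    """Central-difference estimate of d2F/dz2 - (1/v^2) d2F/dt2
    for the field F(z, t) = f(z - u*t)."""
    def F(z_, t_):
        return f(z_ - u * t_)
    d2_dz2 = (F(z + h, t) - 2.0 * F(z, t) + F(z - h, t)) / h**2
    d2_dt2 = (F(z, t + h) - 2.0 * F(z, t) + F(z, t - h)) / h**2
    return d2_dz2 - d2_dt2 / v**2

v = 2.0            # wave speed in the medium, arbitrary units (assumed)
z, t = 0.3, 0.1    # observation point and time (assumed)
forward = wave_residual(gaussian, z, t, v, u=+v)          # wave moving along +z
backward = wave_residual(gaussian, z, t, v, u=-v)         # wave moving along -z
wrong_speed = wave_residual(gaussian, z, t, v, u=0.5 * v) # not a solution
```

The first two residuals are limited only by the discretization error of the finite differences, while the third one is of the order of the second derivative of the waveform itself.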
According to the definitions of the constants $\varepsilon_0$ and $\mu_0$, in free space, v is just the speed of light:
$$v = \frac{1}{(\varepsilon_0\mu_0)^{1/2}} \equiv c. \tag{6.120}$$
Historically, the experimental observation of relatively low-frequency (GHz-scale) electromagnetic
waves, with their speed equal to that of light, was the decisive proof (actually, a real triumph!) of the
Maxwell theory and his prediction of such waves.73 This was first accomplished in 1886 by Heinrich
Rudolf Hertz, using the electronic circuits and antennas he had invented for this purpose.
Before proceeding to the detailed analysis of these waves in the following chapters, let me
mention that the invariance of Eqs. (119) with respect to the wave propagation direction is not
accidental; it is just a manifestation of one more general property of the Maxwell equations (99), called
the Lorentz reciprocity. We have already met its simplest example, for time-independent electrostatic
fields, in one of the problems of Chapter 1. Now consider a much more general case, when two monochromatic
electromagnetic fields of the same frequency, with complex amplitudes, say, {E1(r), H1(r)} and {E2(r),
71 Here I have to mention in passing the so-called Hertz vector potentials $\mathbf{\Pi}_e$ and $\mathbf{\Pi}_m$ (whose introduction may be
traced back at least to the 1904 work by E. Whittaker). They may be defined by the following relations:
$$\mathbf{A} = \mu\frac{\partial\mathbf{\Pi}_e}{\partial t} + \mu\nabla\times\mathbf{\Pi}_m, \qquad \phi = -\frac{1}{\varepsilon}\nabla\cdot\mathbf{\Pi}_e,$$
which make the Lorenz gauge condition (117) automatically satisfied. These potentials are especially convenient
for the solution of problems in which the electromagnetic field is induced by sources characterized by
field-independent electric and magnetic polarizations P and M – rather than by field-independent charge and current
densities $\rho$ and j. Indeed, it is straightforward to check that both $\mathbf{\Pi}_e$ and $\mathbf{\Pi}_m$ satisfy the equations similar to Eqs.
(118), but with their right-hand sides equal to, respectively, –P and –M. Unfortunately, I would not have
time/space to discuss such problems and have to refer interested readers elsewhere – for example, to a classical
text by J. Stratton, Electromagnetic Theory, Adams Press, 2008.
72 See, e.g., CM Secs. 6.3-6.4 and 7.7-7.8.
73 By that time, the speed of light (estimated very reasonably by Ole Rømer as early as 1676) had been
experimentally measured, by Hippolyte Fizeau and then Léon Foucault, with an accuracy better than 1%.
H2(r)} are induced, separately, by stand-alone currents with complex amplitudes j1(r) and j2(r) of their
densities. Then it may be proved74 that if the medium is linear and either isotropic or even anisotropic
but with symmetric tensors $\varepsilon_{jj'}$ and $\mu_{jj'}$, then for any volume V limited by a closed surface S,
$$\int_V \left( \mathbf{j}_1\cdot\mathbf{E}_2 - \mathbf{j}_2\cdot\mathbf{E}_1 \right) d^3r = \oint_S \left( \mathbf{E}_1\times\mathbf{H}_2 - \mathbf{E}_2\times\mathbf{H}_1 \right)_n d^2r. \tag{6.121}$$
This property implies, in particular, that the waves propagate similarly in two reciprocal
directions even in situations much more general than the 1D case described by Eqs. (119). For some
important practical applications (e.g., for low-noise amplifiers and detectors) such reciprocity is rather
inconvenient. Fortunately, Eq. (121) may be violated in anisotropic media with asymmetric tensors $\varepsilon_{jj'}$
and/or $\mu_{jj'}$. The simplest case of such an anisotropy, the Faraday rotation of the wave polarization in
plasma, will be discussed in the next chapter.
6.2. The flux $\Phi(t)$ of the magnetic field that pierces a resistive ring
is being changed in time, while the field outside of the ring is negligibly
low. A voltmeter is connected to a part of the ring, as shown in the
figure on the right. What would the voltmeter show?
6.3. A weak constant magnetic field B is applied to an axially-symmetric permanent magnet with
the dipole magnetic moment m directed along its axis, rapidly rotating about the same axis, with an
angular momentum L. Calculate the electric field resulting from the magnetic field’s application, and
formulate the conditions of your result’s validity.
6.4. The similarity of Eq. (5.53) obtained in Sec. 5.3 without any use of the Faraday induction
law, and Eq. (5.54) proved in Sec. 2 of this chapter using it, implies that the law may be derived from
magnetostatics. Prove that this is indeed true for a particular case of a current loop being slowly
deformed in a fixed magnetic field B(r).
74 It will be more convenient for me to give this proof (or rather offer it for the reader’s exercise :-) in the next
chapter, after we have discussed the Fourier expansion of the fields in linear media.
6.6. Use energy arguments to calculate the pressure exerted by the magnetic field B inside a long
uniform solenoid of length l, and a cross-section of area $A \ll l^2$, with $N \gg l/A^{1/2} \gg 1$ turns, on its
“walls” (windings), and the forces exerted by the field on the solenoid’s ends, for two cases:
(i) the current through the solenoid is fixed by an external source, and
(ii) after the initial current setting, the ends of the solenoid’s wire, with negligible resistance, are
connected, so that it continues to carry a non-zero current.
Compare the results, and give a physical interpretation of the direction of these forces.
6.9. A planar thin-wire loop with inductance L, resistance R, and area A is launched to fly
ballistically from field-free space into a region where the magnetic field B is constant. Calculate the
final change of the kinetic energy of the loop, assuming that the time of its entry into the field region is
much shorter than the relaxation time constant L/R and that the loop cannot rotate.
6.10. AC current of frequency $\omega$ is being passed through a long uniform wire with a round
cross-section of a radius R comparable with the skin depth $\delta_s$. In the quasistatic approximation, find the
current’s distribution across the cross-section, and analyze it in the limits $R \ll \delta_s$ and $\delta_s \ll R$. Calculate
the effective ac resistance of the wire (per unit length) in these two limits.
6.11. A long round cylinder of radius R, made of a uniform conductor with an Ohmic
conductivity $\sigma$ and magnetic permeability $\mu$, is placed into a uniform ac magnetic field $H_{ext}(t) = H_0\cos\omega t$
directed along its symmetry axis. Calculate the spatial distribution of the magnetic field’s
amplitude and, in particular, its value on the cylinder’s axis. Spell out the last result in the limits of
relatively small and large R.
6.12.* Define and calculate an appropriate spatial-temporal Green’s function for Eq. (25), and
then use this function to analyze the dynamics of propagation of the external magnetic field that is
suddenly turned on at t = 0 and then kept constant:
$$H(x < 0, t) = \begin{cases} 0, & \text{at } t < 0, \\ H_0, & \text{at } t > 0, \end{cases}$$
into an Ohmic conductor occupying the semi-space x > 0 – see Fig. 2.
Hint: Try to use a function proportional to $\exp\{-(x - x')^2/2(\delta x)^2\}$, with a suitable time
dependence of the parameter $\delta x$ and a properly selected pre-exponential factor.
6.13. Solve the previous problem using the variable separation method, and compare the results.
6.18. Use the London equation to analyze the penetration of a uniform external magnetic field
into a thin ($t \sim \delta_L$) planar superconducting film whose plane is parallel to the field.
6.19. Use the London equation to calculate the distribution of supercurrent density j inside a long
straight superconducting wire with a circular cross-section of radius $R \sim \delta_L$, carrying current I.
6.22. Use the London equation to analyze the magnetic field shielding by a superconducting thin
film of thickness $t \ll \delta_L$, by calculating the penetration of the field induced by current I in a thin wire
that runs parallel to a wide planar thin film, at a distance $d \gg t$ from it, into the space behind the film.
6.23. Assuming that the magnetic monopole does exist and has a magnetic charge $q_m$, calculate
the change $\Delta I$ of current in a superconducting loop due to a passage of a single monopole through its
area. Evaluate $\Delta I$ for a monopole with the charge conjectured by P. Dirac, $q_m = nq_0 \equiv n(2\pi\hbar/e)$ with an
integer n, and compare the result with the magnetic flux quantum $\Phi_0$ (62). Review your result for a
similar passage of a single quasi-monopole magnetic charge formed at one of the ends of a
permanent-magnet needle – see, e.g., Fig. 19 and the accompanying discussion.
Hint: To simplify calculations, you may consider the monopole’s passage along the symmetry
axis of a round ring of radius R, made of a superconducting wire with a cross-section’s area A satisfying
the conditions $\delta_L^2 \ll A \ll R^2$.
6.24. Use the Ginzburg-Landau equations (54) and (63) to calculate the largest (“critical”) value
of supercurrent in a uniform superconducting wire with a cross-section area much smaller than $\delta_L^2$.
6.25. Use the discussion of a long straight Abrikosov vortex, in the limit $\xi \ll \delta_L$, in Sec. 5 to
prove Eqs. (71)-(72) for its energy per unit length and the first critical field.
6.26.* Use the Ginzburg-Landau equations (54) and (63) to prove the Josephson relation (76) for
a small superconducting weak link, and express its critical current Ic via the Ohmic resistance Rn of the
same weak link in its normal state.
6.27. Use Eqs. (76) and (79) to calculate the coupling energy of a Josephson junction and the full
potential energy of the SQUID shown in Fig. 4c.
6.30. As was discussed in Sec. 7, the displacement current concept allows one to extend the
Ampère law to time-dependent processes as
$$\oint_C \mathbf{H}\cdot d\mathbf{r} = I_S + \frac{\partial}{\partial t}\int_S D_n \, d^2r .$$
We also have seen that this generalization makes the integral $\oint \mathbf{H}\cdot d\mathbf{r}$ over an external contour, such as
the one shown in Fig. 10, independent of the choice of the surface S limited by the
contour. However, it may look like the situation is different for a contour drawn
inside a capacitor – see the figure on the right. Indeed, if the contour’s size is
much larger than the capacitor’s thickness, the magnetic field H created by the
linear current I on the contour’s line is virtually the same as that of a continuous
wire, and hence the integral $\oint \mathbf{H}\cdot d\mathbf{r}$ along the contour apparently does not depend
on its area, while the flux $\int D_n d^2r$ does, so the equation displayed above
seems invalid. (The current $I_S$ piercing this contour evidently equals zero.) Resolve this paradox, for
simplicity considering an axially-symmetric system.
6.31. A straight, uniform, long wire with a circular cross-section of radius R, made of an Ohmic
conductor with conductivity $\sigma$, carries dc current I. Calculate the flux of the Poynting vector through its
surface, and compare it with the Joule rate of energy dissipation.
$$\mathbf{D} = \varepsilon\mathbf{E}, \qquad \mathbf{B} = \mu\mathbf{H}. \tag{7.1}$$
Moreover, let us assume for a while that these constitutive equations hold for all frequencies of interest.
(Of course, these relations are exactly valid for the very important particular case of free space, where
we may formally use the macroscopic Maxwell equations (6.100), but with $\varepsilon = \varepsilon_0$ and $\mu = \mu_0$.) As was
already shown in Sec. 6.8, in this case, the Lorenz gauge condition (6.117) allows the Maxwell
equations to be recast into the wave equations (6.118) for the scalar and vector potentials. However, for
most purposes, it is more convenient to use the homogeneous Maxwell equations (6.100) for the electric
and magnetic fields – which are independent of the gauge choice. After an elementary elimination of D
and B using Eqs. (1),1 these equations take a simple, very symmetric form:
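$$\nabla\times\mathbf{E} + \mu\frac{\partial\mathbf{H}}{\partial t} = 0, \qquad \nabla\times\mathbf{H} - \varepsilon\frac{\partial\mathbf{E}}{\partial t} = 0, \tag{7.2a}$$
$$\nabla\cdot\mathbf{E} = 0, \qquad \nabla\cdot\mathbf{H} = 0. \tag{7.2b}$$  [Maxwell equations for uniform linear media]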
Now, let us act by the operator $\nabla\times$ on each of Eqs. (2a), i.e. take their curl, and then use the vector
algebra identity (5.31). The appearing terms $\nabla(\nabla\cdot\mathbf{E})$ and $\nabla(\nabla\cdot\mathbf{H})$ vanish due to Eqs. (2b), so the first terms of
Eqs. (2a) turn into the Laplace operators of these vectors (with the minus sign). Now swapping, in the
second terms, the operators $\partial/\partial t$ and $\nabla\times$, and using Eqs. (2a) again, we get fully similar wave equations
for the electric and magnetic fields:2
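$$\left( \nabla^2 - \frac{1}{v^2}\frac{\partial^2}{\partial t^2} \right)\mathbf{E} = 0, \qquad \left( \nabla^2 - \frac{1}{v^2}\frac{\partial^2}{\partial t^2} \right)\mathbf{H} = 0, \tag{7.3}$$  [EM wave equations]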
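Spelled out for the electric field, the steps just described read as follows. This is a sketch that assumes the first-order equations in the form ∇×E + μ∂H/∂t = 0 and ∇×H – ε∂E/∂t = 0, with ∇·E = 0, and the identity (5.31) in its standard form ∇×(∇×f) = ∇(∇·f) – ∇²f.

```latex
\begin{aligned}
\nabla\times(\nabla\times\mathbf{E})
  + \mu\,\frac{\partial}{\partial t}\left(\nabla\times\mathbf{H}\right) &= 0, \\
\nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E}
  + \mu\,\frac{\partial}{\partial t}\left(\varepsilon\,\frac{\partial\mathbf{E}}{\partial t}\right) &= 0, \\
\left(\nabla^2 - \varepsilon\mu\,\frac{\partial^2}{\partial t^2}\right)\mathbf{E} &= 0 .
\end{aligned}
```

The same steps applied to the equation for ∇×H give the equation for H, with the same combination εμ = 1/v².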
1 Though in a medium, B rather than H is the actual macroscopic magnetic field, mathematically it is a bit more
convenient (just as it was in Sec. 6.2) to use the vector pair {E, H} in the following discussion, because at sharp
media boundaries, it is H that obeys the boundary condition (5.117) similar to that for E – cf. Eq. (3.37).
2 The two vector equations (3) are of course just a shorthand for six similar equations for three Cartesian
components of E and H.
where z is the Cartesian coordinate along a certain (arbitrary) direction n, and f is an arbitrary function
of one argument. Note that this solution, first of all, describes a traveling wave – meaning a certain field
pattern moving, without deformation, along the z-axis, with the constant velocity v. Second, according
to Eq. (5), both E and H have the same values at all points of each plane perpendicular to the direction
$\mathbf{n} \equiv \mathbf{n}_z$ of the wave propagation; hence the second name – plane wave.
According to Eqs. (2), the independence of the wave equations (3) for vectors E and H does not
mean that their plane-wave solutions are independent. Indeed, plugging any solution of the type (5) into
Eqs. (2a), we get
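$$\mathbf{H} = \frac{\mathbf{n}\times\mathbf{E}}{Z}, \quad \text{i.e. } \mathbf{E} = Z\,\mathbf{H}\times\mathbf{n}, \tag{7.6}$$  [Field vector relation]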
where
$$Z \equiv \frac{E}{H} = \left( \frac{\mu}{\varepsilon} \right)^{1/2}. \tag{7.7}$$  [Wave impedance]
The vector relationship (6) means, first of all, that at any point of space and at any time instant,
the vectors E and H are perpendicular not only to the propagation vector n (such waves are called
transverse) but also to each other – see Fig. 1.
Second, this equality does not depend on the function f, meaning that the electric and magnetic
fields increase and decrease simultaneously. Finally, the field magnitudes are related by the constant Z
called the wave impedance of the medium. Very soon we will see that this impedance plays a pivotal
role in many problems, in particular at the wave reflection from the interface between two media. Since
the dimensionality of E, in SI units, is V/m, and that of H is A/m, Eq. (7) shows that Z has the
dimensionality of V/A, i.e. ohms ($\Omega$).3 In particular, in free space,
3 In the Gaussian units, E and H have a similar dimensionality (in particular, in a free-space wave, E = H), making
the (very useful) notion of the wave impedance less manifestly exposed – so in some older physics textbooks it is
not mentioned at all!
$$Z = Z_0 \equiv \left( \frac{\mu_0}{\varepsilon_0} \right)^{1/2} = 4\pi\times 10^{-7}\, c \approx 377\ \Omega. \tag{7.8}$$  [Wave impedance of free space]
Next, plugging Eq. (6) into Eqs. (6.113) and (6.114), we get:
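$$u = \varepsilon E^2 = \mu H^2, \tag{7.9a}$$  [Wave's energy]
$$\mathbf{S} \equiv \mathbf{E}\times\mathbf{H} = \mathbf{n}\frac{E^2}{Z} = \mathbf{n} Z H^2, \tag{7.9b}$$  [Wave's power]
so, according to Eqs. (4) and (7), the wave’s energy and power densities are universally related as
$$\mathbf{S} = \mathbf{n}\, u v. \tag{7.9c}$$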
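These relations are easy to verify numerically. The following sketch builds H from E using the field vector relation (6), and then compares the Poynting vector E × H with the product of the energy density εE²/2 + μH²/2 and the wave speed v = 1/(εμ)^{1/2}. The medium parameters (a relative permittivity of 2.25) and the field amplitude are illustrative assumptions, and the function name `plane_wave` is hypothetical.

```python
import math

def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))

def plane_wave(E, n, eps, mu):
    """For a plane wave with electric field E (V/m) propagating along the
    unit vector n, return (H, S, u, v) in SI units."""
    Z = math.sqrt(mu / eps)                  # wave impedance, Eq. (7)
    v = 1.0 / math.sqrt(eps * mu)            # wave speed
    H = tuple(c / Z for c in cross(n, E))    # field vector relation, Eq. (6)
    S = cross(E, H)                          # Poynting vector
    u = 0.5 * eps * dot(E, E) + 0.5 * mu * dot(H, H)   # energy density
    return H, S, u, v

EPS_0 = 8.8541878128e-12     # F/m (CODATA 2018 value)
MU_0 = 1.25663706212e-6      # H/m (CODATA 2018 value)
eps, mu = 2.25 * EPS_0, MU_0 # a glass-like medium (assumed)
n = (0.0, 0.0, 1.0)          # propagation along the z-axis
E = (3.0, 4.0, 0.0)          # V/m, transverse field (assumed)
H, S, u, v = plane_wave(E, n, eps, mu)
```

In this example, H comes out perpendicular to both E and n, the vector S is directed along n, and its magnitude coincides with uv up to rounding errors.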
In view of the Poynting vector paradox discussed in Sec. 6.8 (see Fig. 6.11), one may wonder
whether the last equality may be interpreted as the actual density of power flow. In contrast to the static
situation shown in Fig. 6.11, which limits the electric and magnetic fields to the vicinity of their sources,
waves may travel far from them. As a result, they can form wave packets of a finite length in free space
– see Fig. 2.
Let us apply the Poynting theorem (6.111) to the cylinder shown with dashed lines in Fig. 2, with
one lid inside the wave packet, and another lid in the region already passed by the wave. Then,
according to Eq. (6.111), the rate of change of the full field energy $\mathscr{E}$ inside the volume is $d\mathscr{E}/dt = -SA$
(where A is the lid area), so S may be indeed interpreted as the power flow (per unit area) from the
volume. Making a reasonable assumption that the finite length of a sufficiently long wave packet does
not affect the physics inside it, we may indeed interpret the S given by Eqs. (9b-c) as the power flow
density inside a plane electromagnetic wave.
As we will see later in this chapter, the free-space value $Z_0$ of the wave impedance, given by Eq.
(8), establishes the scale of Z of virtually all wave transmission lines, so we may use it, together with
Eq. (9), to get a better feeling of how different the electric and magnetic field amplitudes are in the
waves – on the scale of typical electrostatics and magnetostatics experiments. For example, according to
Eqs. (9), a wave of a modest intensity $S = 1$ W/m$^2$ (this is what we get from a usual electric bulb a few
meters away from it) has $E \sim (SZ_0)^{1/2} \sim 20$ V/m, quite comparable with the dc field created by a standard
AA battery right outside it. On the other hand, the wave’s magnetic field $H = (S/Z_0)^{1/2} \approx 0.05$ A/m. For
this particular case, the relation following from Eqs. (1), (4), and (7),
$$B = \mu H = \mu\frac{E}{Z} = \mu\frac{E}{(\mu/\varepsilon)^{1/2}} = (\varepsilon\mu)^{1/2} E = \frac{E}{v}, \tag{7.10}$$
gives $B = \mu_0 H = E/c \sim 7\times 10^{-8}$ T, i.e. a magnetic field a thousand times lower than the Earth’s field, and
about 7 orders of magnitude lower than the field of a typical permanent magnet. This huge difference
may be interpreted as follows: the scale B ~ E/c of magnetic fields in the waves is “normal” for
electromagnetism, while the permanent magnet fields are abnormally high because they are due to the
ferromagnetic alignment of electron spins, essentially relativistic objects – see the discussion in Sec. 5.5.
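The numbers quoted above may be reproduced with a few lines of Python. The sketch below assumes free space, uses the classical value 4π×10⁻⁷ H/m for the magnetic constant, and treats the intensity of 1 W/m² as given; the function name `wave_fields` is hypothetical.

```python
import math

MU_0 = 4.0e-7 * math.pi      # H/m, classical value of the magnetic constant
C = 299_792_458.0            # m/s, speed of light in free space
Z_0 = MU_0 * C               # ohm, free-space wave impedance, Eq. (8)

def wave_fields(S, Z=Z_0, mu=MU_0):
    """Field magnitudes in a plane wave with power flow density S (W/m^2):
    E in V/m, H in A/m, and B in T, following Eqs. (9b) and (10)."""
    E = math.sqrt(S * Z)
    H = math.sqrt(S / Z)
    B = mu * H
    return E, H, B

E, H, B = wave_fields(1.0)   # roughly 19.4 V/m, 0.052 A/m, and 6.5e-8 T
```

The last value also equals E/c, as Eq. (10) requires for free space.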
The fact that Eq. (5) is valid for an arbitrary function f means, in the standard terminology, that a
medium with frequency-independent $\varepsilon$ and $\mu$ supports the propagation of plane waves without either
decay (attenuation) or waveform deformation (dispersion). However, for any real medium but pure
vacuum, this approximation is valid only within limited frequency intervals. We will discuss the effects
of attenuation and dispersion in the next section and will see that all our prior formulas remain valid
even for arbitrary linear media, provided that we limit them to single-frequency (i.e. sinusoidal,
frequently called monochromatic) waves. Such waves may be most conveniently represented as4
$$f = \mathrm{Re}\left[ f_\omega e^{i(kz - \omega t)} \right], \tag{7.11}$$  [Monochromatic wave]
where $f_\omega$ is the complex amplitude of the wave, and k is its wave number (the magnitude of the wave
vector $\mathbf{k} \equiv \mathbf{n}k$), sometimes called the spatial frequency. The last term is justified by the fact, evident
from Eq. (11), that k is related to the wavelength $\lambda$ exactly as the usual (“temporal”) frequency $\omega$ is
related to the time period T:
$$k = \frac{2\pi}{\lambda}, \qquad \omega = \frac{2\pi}{T}. \tag{7.12}$$  [Spatial and temporal frequencies]
In the dispersion-free case (5), the compatibility of that relation with Eq. (11) requires the argument
$(kz - \omega t) \equiv k[z - (\omega/k)t]$ to be proportional to $(z - vt)$, so $\omega/k = v$, i.e.
$$k = \frac{\omega}{v} = (\varepsilon\mu)^{1/2}\,\omega, \tag{7.13}$$  [Dispersion relation]
so in that particular case, the dispersion relation $\omega(k)$ is linear.
Now note that Eq. (6) does not mean that the vectors E and H retain their direction in space.
(The wave in which they do, is called linearly polarized.5) Indeed, nothing in the Maxwell equations
prevents, for example, a joint rotation of this vector pair around the fixed vector n, while still keeping all
these three vectors perpendicular to each other at any instant – see Fig. 1. However, an arbitrary rotation
law or even an arbitrary constant frequency of such rotation would violate the single-frequency
(monochromatic) character of the elementary sinusoidal wave (11). To understand what is the most
general type of polarization the wave may have without violating that condition, let us represent two
4 As we have already seen in the previous chapter (see also CM Sec. 5.1), such complex-exponential
representation of sinusoidally changing variables is more convenient for mathematical manipulation than
sine and cosine functions, especially because in all linear relations, the operator Re may be omitted (implied) until
the very end of the calculation. Note, however, that this is not valid for the quadratic forms such as Eqs. (9).
5 The possibility of different polarizations of electromagnetic waves was discovered (for light) in 1669 by Rasmus
Bartholin, a.k.a. Erasmus Bartholinus.
Cartesian components of one of these vectors (say, E) along any two fixed axes x and y, perpendicular to
each other and the z-axis (i.e. to the vector n), in the same form as used in Eq. (11):
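$$E_x = \mathrm{Re}\left[ E_{\omega x} e^{i(kz - \omega t)} \right], \qquad E_y = \mathrm{Re}\left[ E_{\omega y} e^{i(kz - \omega t)} \right]. \tag{7.14}$$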
To keep the wave monochromatic, the complex amplitudes $E_{\omega x}$ and $E_{\omega y}$ have to be constant in time;
however, they may have different magnitudes and an arbitrary phase shift between them.
In the simplest case when the arguments of these complex amplitudes are equal,
$$E_{\omega x,y} = \left| E_{\omega x,y} \right| e^{i\varphi}, \tag{7.15}$$
the real field components, $E_{x,y} = |E_{\omega x,y}|\cos(kz - \omega t + \varphi)$, oscillate in phase,
so their ratio is constant in time – see Fig. 3a. This means that the wave is linearly polarized, with the
polarization plane defined by the relation
$$\tan\theta = E_y / E_x . \tag{7.17}$$
Fig. 7.3. Time evolution of the instantaneous electric field vector in monochromatic waves with:
(a) a linear polarization, (b) a circular polarization, and (c) an elliptical polarization.
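Another simple case is when the moduli of the complex amplitudes $E_{\omega x}$ and $E_{\omega y}$ are equal, but
their phases are shifted by $+\pi/2$ or $-\pi/2$:
$$E_{\omega x} = |E_\omega|\, e^{i\varphi}, \qquad E_{\omega y} = |E_\omega|\, e^{i(\varphi \pm \pi/2)}. \tag{7.18}$$
In this case
$$E_x = |E_\omega|\cos(kz - \omega t + \varphi), \qquad E_y = |E_\omega|\cos\left( kz - \omega t + \varphi \pm \frac{\pi}{2} \right) = \mp|E_\omega|\sin(kz - \omega t + \varphi). \tag{7.19}$$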
This means that on the wave’s plane (normal to n), the end of the vector E moves, with the wave’s
frequency $\omega$, either clockwise or counterclockwise around a circle – see Fig. 3b:
$$\theta(t) = \pm(\omega t - kz - \varphi). \tag{7.20}$$
Such waves are called circularly polarized. In the dominant convention, the wave is called
right-polarized (RP) if it is described by the upper sign in Eqs. (18)-(20), i.e. if the vector of the angular
frequency of the field vector’s rotation coincides with the wave propagation’s direction n, and
left-polarized (LP) in the opposite case. These particular solutions of the Maxwell equations are very
convenient for quantum electrodynamics, because single electromagnetic field quanta with a certain
(positive or negative) spin direction may be considered elementary excitations of the corresponding
circularly polarized wave.6 (This fact does not exclude, from the quantization scheme, waves of other
polarizations, because any monochromatic wave may be presented as a linear combination of two
opposite circularly polarized waves – just as Eqs. (14) represent it as a linear combination of two
linearly polarized waves.)
Finally, in the general case of arbitrary complex amplitudes $E_{\omega x}$ and $E_{\omega y}$, the field vector’s end
moves along an ellipse (Fig. 3c); such a wave is called elliptically polarized. The elongation
(“eccentricity”) and orientation of the ellipse are completely described by one complex number, the ratio
$E_{\omega x}/E_{\omega y}$, i.e. by two real numbers – for example, $|E_{\omega x}/E_{\omega y}|$ and $\varphi = \arg(E_{\omega x}/E_{\omega y})$.7
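The three cases may be told apart directly from the two complex amplitudes. The sketch below is one possible way to do that (an illustration, not a procedure taken from the text): it uses the quantity Im(E_x* E_y), which vanishes for equal or opposite phases (linear polarization) and reaches its largest possible magnitude, (|E_x|² + |E_y|²)/2, only for equal moduli and a ±π/2 phase shift (circular polarization). The function names and the tolerance are hypothetical.

```python
import cmath
import math

def classify_polarization(Ex, Ey, tol=1e-9):
    """Classify a monochromatic plane wave by the complex amplitudes
    of the two transverse components of its electric field."""
    Ex, Ey = complex(Ex), complex(Ey)
    scale = abs(Ex) ** 2 + abs(Ey) ** 2
    if scale == 0.0:
        raise ValueError("both amplitudes are zero")
    # |Ex||Ey| sin(arg Ey - arg Ex): zero if the components are in phase
    # or in antiphase, i.e. if their ratio is real
    s = (Ex.conjugate() * Ey).imag
    if abs(s) <= tol * scale:
        return "linear"
    # equal moduli and a +-pi/2 shift  <=>  2|s| = |Ex|^2 + |Ey|^2
    if abs(2.0 * abs(s) - scale) <= tol * scale:
        return "circular"
    return "elliptical"

def field_tip(Ex, Ey, phase):
    """Instantaneous real components (E_x, E_y) at phase = kz - omega*t."""
    rotation = cmath.exp(1j * phase)
    return (complex(Ex) * rotation).real, (complex(Ey) * rotation).real

# For a circularly polarized wave, the field magnitude stays constant
radii = [math.hypot(*field_tip(1.0, 1j, 0.1 * m)) for m in range(63)]
```

For example, `classify_polarization(1, 2)` returns "linear", `classify_polarization(1, 1j)` returns "circular", and `classify_polarization(1, 0.5j)` returns "elliptical".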
6 This issue is closely related to that of the wave’s angular momentum; it will be more convenient for me to
discuss it later in this chapter (in Sec. 7).
7 Note that the same information may be expressed via four so-called Stokes parameters $s_0$, $s_1$, $s_2$, and $s_3$, which
are popular in practical optics, because they may be used for the description of not only completely coherent
waves that were discussed here but also of partly coherent or even fully incoherent waves – including the natural
light emitted by thermal sources such as our Sun. (In contrast to the coherent waves (14), whose complex
amplitudes are deterministic numbers, the amplitudes of incoherent waves should be treated as random variables.)
For more on the Stokes parameters, as well as many other optics topics I will not have time to cover, I can
recommend the classical text by M. Born et al., Principles of Optics, 7th ed., Cambridge U. Press, 1999.
8 In isotropic media, the vectors E, P, and hence $\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P}$, are all parallel, and for notation simplicity, I will
drop the vector sign in the following formulas of this section. I am also assuming that P at any point r is only
dependent on the electric field at the same point, and hence drop the factor exp{ikz}, the same for all variables.
This last assumption is valid if the wavelength $\lambda$ is much larger than the elementary dipole’s size a. In most
systems of interest, the scale of a is atomic ($\sim 10^{-10}$ m), so this approximation is valid up to extremely high
frequencies, $\omega \sim c/a \sim 10^{18}$ s$^{-1}$, corresponding to hard X-rays.
The condition $t' \le t$, which is implied by this relation, expresses a keystone principle of all
science, the causal relation between a cause (in our case, the electric field E(t’) applied to each dipole)
and its effect (the polarization P(t) it creates). The function G(t, t’) is called the temporal Green’s
function for the electric polarization.9 To reveal its physical sense, let us consider the case when the
applied field E(t) is a very short pulse at the moment $t_0 < t$, which may be well approximated with
Dirac’s delta function:
$$E(t) = \delta(t - t_0). \tag{7.22}$$
Then Eq. (21) yields just $P(t) = G(t, t_0)$, so the Green’s function G(t, t’) is just the polarization at
moment t, created by a unit $\delta$-functional pulse of the applied field at moment t’ (Fig. 4).
Fig. 7.4. An example of the temporal Green’s function for the electric polarization (schematically).
What are the general properties of the temporal Green’s function? First, for systems without
infinite internal “memory”, G should tend to zero at t – t’ → ∞, although the type of this approach (e.g.,
whether the function G oscillates approaching zero, as in Fig. 4, or not) depends on the medium’s
properties. Second, if the parameters of the medium do not change in time, the polarization response to
an electric field pulse should be dependent not on its absolute timing, but only on the time difference
θ ≡ t – t’ between the pulse and observation instants, so Eq. (21) is reduced to
$$P(t) = \int_{-\infty}^{t} E(t')\,G(t - t')\,dt' \equiv \int_0^\infty E(t - \theta)\,G(\theta)\,d\theta. \qquad (7.23)$$
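For a sinusoidal waveform, $E(t) = \mathrm{Re}\,[E_\omega e^{-i\omega t}]$, this equation yields
$$P(t) = \mathrm{Re}\int_0^\infty E_\omega e^{-i\omega(t-\theta)}\,G(\theta)\,d\theta \equiv \mathrm{Re}\left[\left(E_\omega\int_0^\infty G(\theta)\,e^{i\omega\theta}\,d\theta\right)e^{-i\omega t}\right]. \qquad (7.24)$$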
The expression in the last parentheses is of course nothing else than the complex amplitude $P_\omega$ of the
polarization. This means that even if the static linear relation (3.43), $P = \chi_e\varepsilon_0E$, is invalid for an
arbitrary time-dependent process, we may still keep its Fourier analog,
$$P_\omega = \chi_e(\omega)\,\varepsilon_0E_\omega, \quad \text{with } \chi_e(\omega) \equiv \frac{1}{\varepsilon_0}\int_0^\infty G(\theta)\,e^{i\omega\theta}\,d\theta, \qquad (7.25)$$
for each sinusoidal component of the process, using it as the definition of the frequency-dependent
electric susceptibility $\chi_e(\omega)$. Similarly, the frequency-dependent electric permittivity may be defined
using the Fourier analog of Eq. (3.46):
9 The idea of these functions is very similar to that of the spatial Green’s functions (see Sec. 2.10), but with a new
twist, due to the causality principle. A discussion of the temporal Green’s functions in application to classical
mechanics (which to some extent overlaps with our current discussion) may be found in CM Sec. 5.1.
[Complex electric permittivity]
$$D_\omega \equiv \varepsilon(\omega)\,E_\omega. \qquad (7.26a)$$
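Then, according to Eq. (3.47), the complex permittivity is related to the temporal Green’s function by
the usual Fourier transform:
$$\varepsilon(\omega) \equiv \varepsilon_0 + \frac{P_\omega}{E_\omega} = \varepsilon_0 + \int_0^\infty G(\theta)\,e^{i\omega\theta}\,d\theta. \qquad (7.26b)$$
Since G(θ) is real, this relation shows that the permittivity is generally complex, $\varepsilon(\omega) = \varepsilon'(\omega) + i\varepsilon''(\omega)$, with
$$\varepsilon'(\omega) = \varepsilon_0 + \int_0^\infty G(\theta)\cos\omega\theta\,d\theta, \quad \varepsilon''(\omega) = \int_0^\infty G(\theta)\sin\omega\theta\,d\theta, \qquad (7.27)$$
and that its real part ε’(ω) is always an even function of frequency, while the imaginary part ε”(ω) is an
odd function of ω. Note that though the particular causal relationship (21) between P(t) and E(t) is
conditioned by the elementary dipole independence, the frequency-dependent complex electric
permittivity ε(ω) may be introduced, in a similar way, if any two linear combinations of these variables
are related by a similar formula.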
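As a quick numerical illustration of Eq. (26b), here is a minimal Python sketch. It assumes, purely for illustration, a hypothetical Green’s function in the form of a decaying oscillation starting from G(0) = 0 (qualitatively similar to that sketched in Fig. 4), in arbitrary units with ε0 = 1; the parameter values and function names are not from the text. The Fourier integral is evaluated by a simple midpoint rule and compared with the closed-form integral for this particular G(θ).

```python
import cmath
import math

# Hypothetical parameters (arbitrary units), chosen only for illustration
OMEGA0, DELTA, AMP, EPS0 = 1.0, 0.2, 1.0, 1.0

def green(theta):
    """Hypothetical G(theta): a decaying oscillation that starts from G(0) = 0."""
    omega_d = math.sqrt(OMEGA0**2 - DELTA**2)
    return (AMP / omega_d) * math.exp(-DELTA * theta) * math.sin(omega_d * theta)

def permittivity(omega, theta_max=100.0, steps=100_000):
    """Eq. (26b) by the midpoint rule; the integral is truncated at theta_max."""
    h = theta_max / steps
    total = 0j
    for i in range(steps):
        theta = (i + 0.5) * h
        total += green(theta) * cmath.exp(1j * omega * theta)
    return EPS0 + total * h

def closed_form(omega):
    """Exact value of the same integral for this particular G(theta)."""
    return EPS0 + AMP / ((OMEGA0**2 - omega**2) - 2j * omega * DELTA)

eps_plus = permittivity(0.7)
eps_minus = permittivity(-0.7)
print("numerical  :", eps_plus)
print("closed form:", closed_form(0.7))
print("eps(-omega):", eps_minus)  # same real part, opposite imaginary part
```

The last two printed values illustrate the parity properties stated above: because G(θ) is real, ε(–ω) is the complex conjugate of ε(ω).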
Absolutely similar arguments show that magnetic properties of a linear, isotropic medium may
be characterized by a frequency-dependent, complex permeability μ(ω). Now rewriting Eqs. (1) for the
complex amplitudes of the fields at a particular frequency, we may readily repeat all calculations of Sec.
1, and verify that all its results are valid for monochromatic waves even for a dispersive (but necessarily
linear!) medium. In particular, Eqs. (7) and (13) now become
[Complex Z and k]
$$Z(\omega) = \left[\frac{\mu(\omega)}{\varepsilon(\omega)}\right]^{1/2}, \quad k(\omega) = \omega\left[\varepsilon(\omega)\mu(\omega)\right]^{1/2}, \qquad (7.28)$$
so the wave impedance and the wave number may be both complex functions of frequency.10
This fact has important consequences for electromagnetic wave propagation. First, plugging the
representation of the complex wave number as the sum of its real and imaginary parts, $k(\omega) \equiv k'(\omega) + ik''(\omega)$, into Eq. (11):
$$f = \mathrm{Re}\left\{f_\omega e^{i[k(\omega)z - \omega t]}\right\} = e^{-k''(\omega)z}\,\mathrm{Re}\left\{f_\omega e^{i[k'(\omega)z - \omega t]}\right\}, \qquad (7.29)$$
we see that k”(ω) describes the rate of wave attenuation in the medium at frequency ω.11 Second, if the
waveform is not sinusoidal (and hence should be represented as a sum of several/many sinusoidal
components), the frequency dependence of k’(ω) provides for wave dispersion, i.e. the waveform
deformation during propagation, because the component waves now propagate with different velocities (4).12
10 The first unambiguous observations of dispersion (for the case of light refraction) were described by Sir Isaac
Newton in his Opticks (1704) – even though this genius never recognized the wave nature of light!
11 It may be tempting to attribute this effect to wave absorption, i.e. the dissipation of the wave’s energy, but we
will see very soon that wave attenuation may be due to different effects as well.
12 The reader is probably familiar with the most noticeable effect of the dispersion: the difference between the
group velocity $v_{gr} \equiv d\omega/dk'$, giving the speed of the envelope of a wave packet with a narrow frequency spectrum,
and the phase velocity $v_{ph} \equiv \omega/k'$ of the component waves. The second-order dispersion effect, proportional to
$d^2\omega/dk'^2$, leads to the deformation (gradual broadening) of the envelope itself. Following tradition, these effects
As an example of such a dispersive medium, let us consider a simple but very representative
Lorentz oscillator model.13 In dilute atomic or molecular systems (e.g., gases), electrons respond to the
external electric field especially strongly when its frequency ω is close to certain frequencies $\omega_j$
corresponding to the spectrum of quantum interstate transitions of a single atom/molecule. A
phenomenological description of this behavior may be obtained from a classical model of several
externally driven harmonic oscillators, generally with non-zero damping. For a single oscillator, driven
by the electric field’s force F(t) = qE(t), we can write the 2nd Newton law as
$$m\left(\ddot{x} + 2\delta_0\dot{x} + \omega_0^2x\right) = qE(t), \qquad (7.30)$$
where $\omega_0$ is the natural (“own”) frequency of the oscillator, and $\delta_0$ is its damping coefficient. For the electric field
of a monochromatic wave, $E(t) = \mathrm{Re}\,[E_\omega\exp\{-i\omega t\}]$, we may look for a particular, forced-oscillation
solution of this equation in a similar form $x(t) = \mathrm{Re}\,[x_\omega\exp\{-i\omega t\}]$.14 Plugging this solution into Eq. (30),
we readily find the complex amplitude of these oscillations:
$$x_\omega = \frac{q}{m}\,\frac{E_\omega}{(\omega_0^2 - \omega^2) - 2i\omega\delta_0}. \qquad (7.31)$$
Using this result to calculate the complex amplitude of the dipole moment as $p_\omega = qx_\omega$, and then the
electric polarization $P_\omega = np_\omega$ of a dilute medium with n independent oscillators per unit volume, for its
frequency-dependent permittivity (26) we get
[Lorentz oscillator model]
$$\varepsilon(\omega) = \varepsilon_0 + n\frac{q^2}{m}\,\frac{1}{(\omega_0^2 - \omega^2) - 2i\omega\delta_0}. \qquad (7.32)$$
This result may be readily generalized to the case when the system has several types of
oscillators with different masses and frequencies:
$$\varepsilon(\omega) = \varepsilon_0 + nq^2\sum_j\frac{f_j}{m_j\left[(\omega_j^2 - \omega^2) - 2i\omega\delta_j\right]}, \qquad (7.33)$$
where $f_j \equiv n_j/n$ is the fraction of oscillators with frequency $\omega_j$, so the sum of all $f_j$ equals 1. Figure 5
shows a typical behavior of the real and imaginary parts of the complex dielectric constant, described by
Eq. (33), as functions of frequency. The oscillator resonances’ effect is clearly visible, and dominates
the medium’s response at $\omega \approx \omega_j$, especially in the case of low damping, $\delta_j \ll \omega_j$. Note that in the
low-damping limit, the imaginary part of the dielectric constant ε”, and hence the wave attenuation k”, are
negligibly small at all frequencies besides small vicinities of $\omega_j$, where the derivative $d\varepsilon'(\omega)/d\omega$ is
are discussed in more detail in the quantum-mechanics part of this series (QM Sec. 2.2), because they are a crucial
factor of Schrödinger’s wave mechanics. (See also a brief discussion in CM Sec. 6.3.)
13 This example is focused on the frequency dependence of ε rather than μ, because electromagnetic waves
interact with “usual” media via their electric field much more than via the magnetic field. Indeed, according to Eq.
(7), the magnetic field of the wave is of the order of E/c, so the magnetic component of the Lorentz force (5.10),
acting on a non-relativistic particle, $F_m \sim quB \sim (u/c)qE$, is much smaller than that of its electric component, $F_e = qE$,
and may be neglected. However, as will be discussed in Sec. 6, forgetting about the possible dispersion of
μ(ω) may result in missing some remarkable opportunities for manipulating the waves.
14 If this point and Eq. (30) are not absolutely clear, please see CM Sec. 5.1 for a more detailed discussion.
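negative.15 Thus, for a system of weakly-damped oscillators, Eq. (33) may be well approximated by a
sum of singularities (“poles”):
$$\varepsilon(\omega) \approx \varepsilon_0 + n\frac{q^2}{2}\sum_j\frac{f_j}{m_j\,\omega_j(\omega_j - \omega)}, \quad \text{for } \delta_j \ll |\omega - \omega_j| \ll |\omega_j - \omega_{j'}|. \qquad (7.34)$$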
Fig. 7.5. Typical frequency dependence of the real and imaginary parts of the complex electric permittivity, according to the generalized Lorentz oscillator model. [Plot of ε’(ω) and ε”(ω) versus ω, with the levels ε(0) and ε0 and the resonance frequencies ω1, ω2, ω3 marked.]
This result is especially important because according to quantum mechanics,16 Eq. (34) (with all
$m_j$ equal) is also valid for a set of non-interacting, similar quantum systems (whose dynamics may be
completely different from that of a harmonic oscillator!), provided that $\omega_j$ are replaced with frequencies
of possible quantum interstate transitions, and coefficients $f_j$ are replaced with the so-called oscillator
strengths of the transitions – which obey the same sum rule, $\Sigma_jf_j = 1$.
At ω → 0, the imaginary part of the complex permittivity (33) also vanishes (for any $\delta_j$), while
its real part approaches its electrostatic (“dc”) value
$$\varepsilon(0) = \varepsilon_0 + q^2\sum_j\frac{n_j}{m_j\omega_j^2}. \qquad (7.35)$$
Note that according to Eq. (30), the denominator of the fraction in Eq. (35) is just the effective spring
constant $\kappa_j = m_j\omega_j^2$ of the jth oscillator, so the oscillator masses $m_j$ as such are actually (and quite
naturally) not involved in the static dielectric response.
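The generalized Lorentz formula (33) and its static limit (35) are easy to explore numerically. The following Python sketch is only an illustration: the two-resonance “medium” (densities, frequencies, and damping coefficients in the list `OSCILLATORS`) is hypothetical, and Eq. (33) is used in the equivalent form with $n_j = nf_j$.

```python
EPS0 = 8.8541878128e-12   # electric constant, F/m
Q_E = 1.602176634e-19     # elementary charge, C
M_E = 9.1093837015e-31    # electron mass, kg

def lorentz_permittivity(omega, oscillators, q=Q_E):
    """Eq. (33) with n_j = n f_j; oscillators = [(n_j, m_j, omega_j, delta_j), ...]."""
    eps = complex(EPS0)
    for n_j, m_j, omega_j, delta_j in oscillators:
        eps += q**2 * n_j / (m_j * ((omega_j**2 - omega**2) - 2j * omega * delta_j))
    return eps

def static_permittivity(oscillators, q=Q_E):
    """Eq. (35): only the spring constants m_j * omega_j**2 matter."""
    return EPS0 + q**2 * sum(n_j / (m_j * omega_j**2)
                             for n_j, m_j, omega_j, _ in oscillators)

# Hypothetical dilute medium with two electronic resonances (SI units)
OSCILLATORS = [(1.0e25, M_E, 2.0e15, 1.0e13),
               (2.0e25, M_E, 5.0e15, 5.0e13)]

eps_dc = lorentz_permittivity(0.0, OSCILLATORS)
eps_res = lorentz_permittivity(2.0e15, OSCILLATORS)
print("eps(0)/eps0        :", eps_dc.real / EPS0)
print("Eq. (35)/eps0      :", static_permittivity(OSCILLATORS) / EPS0)
print("eps at resonance   :", eps_res / EPS0)  # large positive imaginary part
```

With the exp{–iωt} convention used here, the imaginary part ε” is positive at positive frequencies, and has a sharp peak at each resonance.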
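In the opposite limit of very high frequencies, $\omega \gg \omega_j, \delta_j$, the permittivity also becomes real and
may be represented as
[ε(ω) in plasma]
$$\varepsilon(\omega) = \varepsilon_0\left(1 - \frac{\omega_p^2}{\omega^2}\right), \quad \text{where } \omega_p^2 \equiv \frac{q^2}{\varepsilon_0}\sum_j\frac{n_j}{m_j}. \qquad (7.36)$$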
This result is very important because it is also valid at all frequencies if all $\omega_j$ and $\delta_j$ vanish, for example
for gases of free charged particles, in particular for plasmas – ionized atomic gases, provided that the
ion collision effects are negligible. (This is why the parameter $\omega_p$ defined by Eq. (36) is called the
plasma frequency.) Typically, the plasma as a whole is neutral, i.e. the density n of positive atomic ions
is equal to that of the free electrons. Since the ratio $n_j/m_j$ for electrons is much higher than that for ions,
the general formula (36) for the plasma frequency is usually well approximated by the following simple
expression:
$$\omega_p^2 = \frac{ne^2}{\varepsilon_0m_e}. \qquad (7.37)$$
This expression has a simple physical sense: the effective spring constant $\kappa_{ef} \equiv m_e\omega_p^2 = ne^2/\varepsilon_0$
describes the Coulomb force that appears when the electron subsystem of the plasma is shifted, as a
whole, from its positive-ion subsystem, thus violating the electroneutrality.17 Hence, there is no surprise
that the function ε(ω) given by Eq. (36) vanishes at $\omega = \omega_p$: at this resonance frequency, the polarization
electric field E may oscillate, i.e. have a non-zero amplitude $E_\omega = D_\omega/\varepsilon(\omega)$, even in the absence of
external forces induced by external (stand-alone) charges, i.e. in the absence of the field D these charges
induce – see Eq. (3.32).
The behavior of electromagnetic waves in a medium that obeys Eq. (36) is very remarkable. If
the wave frequency ω is above $\omega_p$, the dielectric constant ε(ω) and hence the wave number (28) are
positive and real, and waves propagate without attenuation, following the dispersion relation,
[Plasma dispersion relation]
$$k(\omega) = \omega\left[\varepsilon(\omega)\mu_0\right]^{1/2} = \frac{1}{c}\left(\omega^2 - \omega_p^2\right)^{1/2}, \qquad (7.38)$$
whose plot is shown in Fig. 6.
Fig. 7.6. The plasma dispersion law (solid line) in comparison with the linear dispersion in the free space (dashed line). [Axes: $\omega/\omega_p$ versus $k/(\omega_p/c)$; the dashed line is ω = ck.]
At $\omega \to \omega_p$ the wave number k tends to zero. Beyond that point (i.e. at $\omega < \omega_p$), we still can use
Eq. (38), but it is instrumental to rewrite it in the mathematically equivalent form
17 Indeed, let us consider such a small shift x, perpendicular to the plane surface of a broad, plane slab filled with
plasma. The uncompensated ion charges, with equal and opposite surface densities σ = ±enx, that appear at the
slab surfaces, create inside it, according to Eq. (2.3), a uniform electric field with $E_x = enx/\varepsilon_0$. This field exerts
the force $-eE_x = -(ne^2/\varepsilon_0)x \equiv -\kappa_{ef}x$ on each electron, pulling it back to its equilibrium position.
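$$k(\omega) = \frac{i}{c}\left(\omega_p^2 - \omega^2\right)^{1/2} \equiv \frac{i}{\delta}, \quad \text{where } \delta \equiv \frac{c}{\left(\omega_p^2 - \omega^2\right)^{1/2}}. \qquad (7.39)$$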
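At $\omega < \omega_p$, the so-defined parameter δ is real, and Eq. (29) shows that the electromagnetic field
exponentially decreases with distance:
$$f = \mathrm{Re}\left\{f_\omega e^{i[k(\omega)z - \omega t]}\right\} = \exp\left\{-\frac{z}{\delta}\right\}\mathrm{Re}\left\{f_\omega e^{-i\omega t}\right\}. \qquad (7.40)$$
To find out whether this decay means that the wave is absorbed in the plasma, let us calculate the time
average of the Poynting vector’s magnitude S = EH for a monochromatic wave in an arbitrary dispersive
(but still linear and isotropic) medium. First, let us spell out the real fields’ time dependences:
$$E(t) = \mathrm{Re}\left[E_\omega e^{-i\omega t}\right] \equiv \frac{1}{2}\left[E_\omega e^{-i\omega t} + \text{c.c.}\right], \quad H(t) = \mathrm{Re}\left[H_\omega e^{-i\omega t}\right] \equiv \frac{1}{2}\left[\frac{E_\omega}{Z(\omega)}e^{-i\omega t} + \text{c.c.}\right]. \qquad (7.41)$$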
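Now, a straightforward calculation yields18
$$\bar{S} = \overline{E(t)H(t)} = \frac{E_\omega E_\omega^*}{4}\left[\frac{1}{Z(\omega)} + \frac{1}{Z^*(\omega)}\right] = \frac{E_\omega E_\omega^*}{2}\,\mathrm{Re}\,\frac{1}{Z(\omega)} = \frac{|E_\omega|^2}{2}\,\mathrm{Re}\left[\frac{\varepsilon(\omega)}{\mu(\omega)}\right]^{1/2}. \qquad (7.42)$$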
Let us apply this important general formula to our simple model of plasma at $\omega < \omega_p$. In this
case, the magnetic permeability equals μ0, i.e. μ(ω) = μ0 is positive and real, while ε(ω) is real and
negative, so $1/Z(\omega) = [\varepsilon(\omega)/\mu(\omega)]^{1/2}$ is purely imaginary, and the average Poynting vector (42) vanishes.
This means that the energy, on average, does not flow along the z-axis. So, the waves with $\omega < \omega_p$ are
not absorbed in plasma. (Indeed, the Lorentz model with $\delta_j = 0$ does not describe any energy dissipation
mechanism.) Instead, as we will see in the next section, the waves are rather reflected from the plasma’s
boundary, more exactly from its surface layer of a thickness ~δ.
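These statements may be checked with a few lines of Python. The sketch below is only an illustration under the collision-free model (36) with μ = μ0; the function names and the sample value of the plasma frequency are assumptions, not taken from the text. The principal branch of the complex square root automatically gives k = i/δ below the plasma frequency.

```python
import cmath
import math

C = 299_792_458.0        # speed of light, m/s
MU0 = 4e-7 * math.pi     # magnetic constant, H/m (pre-2019 exact value, adequate here)
EPS0 = 1.0 / (MU0 * C**2)

def plasma_eps(omega, omega_p):
    """Eq. (36) for a collision-free plasma."""
    return EPS0 * (1.0 - (omega_p / omega) ** 2)

def wave_number(omega, omega_p):
    """Eqs. (38)-(39): real above omega_p, purely imaginary (k = i/delta) below it."""
    return cmath.sqrt(complex(omega**2 - omega_p**2)) / C

def poynting_average(e_amp, omega, omega_p):
    """Eq. (42) with mu = mu0: (|E|^2 / 2) * Re[(eps/mu)^(1/2)]."""
    inv_z = cmath.sqrt(complex(plasma_eps(omega, omega_p) / MU0))
    return 0.5 * abs(e_amp) ** 2 * inv_z.real

OMEGA_P = 1.0e16         # hypothetical plasma frequency, 1/s

k_above = wave_number(2.0e16, OMEGA_P)   # propagating wave
k_below = wave_number(1.0e14, OMEGA_P)   # evanescent field
delta = 1.0 / k_below.imag               # penetration depth, m

print("k above omega_p:", k_above)
print("delta below omega_p (m):", delta, " c/omega_p (m):", C / OMEGA_P)
print("average S above:", poynting_average(1.0, 2.0e16, OMEGA_P))
print("average S below:", poynting_average(1.0, 0.5e16, OMEGA_P))
```

Below the plasma frequency, 1/Z is purely imaginary, so the computed average power flow vanishes even though the field amplitude decays with z.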
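Note also that in the limit $\omega \ll \omega_p$, Eq. (39) yields
$$\delta \to \frac{c}{\omega_p} = \left(\frac{c^2\varepsilon_0m_e}{ne^2}\right)^{1/2} = \left(\frac{m_e}{\mu_0ne^2}\right)^{1/2}. \qquad (7.43)$$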
But this is just a particular case (for q = e, m = $m_e$, and μ = μ0) of Eq. (6.44) that was derived in Sec. 6.4
for the depth of the magnetic field’s penetration into a lossless (collision-free) conductor in the
quasistatic approximation. This fact shows again that, as was already discussed in Sec. 6.7, this
18 For an arbitrary plane wave, the total average power flow may be calculated as an integral of Eq. (42) over all
frequencies. By the way, combining this integral and the Poynting theorem (6.111), it is straightforward to prove
the following interesting expression for the average electromagnetic energy density of a narrow (Δω << ω) wave
packet propagating in an arbitrary dispersive (but linear and isotropic) medium:
$$\bar{u} = \frac{1}{2}\int_{\text{packet}}\left[\frac{d(\omega\varepsilon')}{d\omega}E_\omega E_\omega^* + \frac{d(\omega\mu')}{d\omega}H_\omega H_\omega^*\right]d\omega.$$
approximation (in which the displacement currents are neglected) gives an adequate description of the
time-dependent phenomena at $\omega \ll \omega_p$, i.e. at $\delta \ll c/\omega = 1/k = \lambda/2\pi$.19
The two most important examples of natural plasmas are as follows. For the Earth’s ionosphere, i.e. the
upper part of its atmosphere, which is almost completely ionized by the ultra-violet and X-ray
components of the Sun’s radiation, the maximum value of n, reached about 300 km over the Earth’s
surface, is between $10^{10}$ and $10^{12}$ m$^{-3}$ (depending on the time of the day and the Sun’s activity phase), so
the maximum plasma frequency (37) is between 1 and 10 MHz. This is much higher than the particles’
typical reciprocal collision time $\tau^{-1}$, so Eq. (38) gives a good description of wave dispersion in this
plasma. The effect of reflection of electromagnetic waves with $\omega < \omega_p$ from the ionosphere enables
long-range (over-the-globe) radio communications and broadcasting at the so-called short waves, with cyclic
frequencies of the order of 10 MHz:20 they may propagate in the flat channel formed by the Earth’s
surface and the ionosphere, being reflected repeatedly by these parallel “walls”. Unfortunately, due to
the random variations of the Sun’s activity, and hence of $\omega_p$, this natural radio communication channel is
not too reliable, and in our age of transworld optical-fiber cables (see Sec. 7 below), its practical
importance has diminished.
Another important example of plasmas is free electrons in metals and other conductors. For a
typical metal, n is of the order of $10^{23}$ cm$^{-3}$ = $10^{29}$ m$^{-3}$, so Eq. (37) yields $\omega_p \sim 10^{16}$ s$^{-1}$. This value of $\omega_p$
is somewhat higher than the mid-optical frequencies ($\omega \sim 3\times10^{15}$ s$^{-1}$), explaining why planar, clean
metallic surfaces, such as the aluminum and silver films used in mirrors, are so shiny: at these
frequencies, their complex permittivity ε(ω) is almost exactly real and negative, leading to light
reflection, with very little absorption.
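The estimates quoted in the last two paragraphs follow directly from Eq. (37). A minimal Python sketch (the function name is an assumption; the electron densities are the order-of-magnitude values cited above):

```python
import math

EPS0 = 8.8541878128e-12   # electric constant, F/m
Q_E = 1.602176634e-19     # elementary charge, C
M_E = 9.1093837015e-31    # electron mass, kg

def plasma_frequency(n):
    """Eq. (37): angular plasma frequency (1/s) for the electron density n (1/m^3)."""
    return math.sqrt(n * Q_E**2 / (EPS0 * M_E))

for label, n in [("ionosphere, lower estimate", 1e10),
                 ("ionosphere, upper estimate", 1e12),
                 ("typical metal", 1e29)]:
    omega_p = plasma_frequency(n)
    print(f"{label}: omega_p = {omega_p:.2e} 1/s, "
          f"f_p = {omega_p / (2 * math.pi):.2e} Hz")
```

For the two ionospheric densities, this gives cyclic plasma frequencies of roughly 0.9 MHz and 9 MHz, while for the metal $\omega_p$ is close to $2\times10^{16}$ s$^{-1}$, in agreement with the estimates in the text.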
The simple model (36), which neglects electron scattering, becomes inadequate at lower
frequencies, ωτ ~ 1. A good phenomenological way of extending the model to account for scattering
is to take, in Eq. (33), the lowest frequency $\omega_j$ equal to zero (to describe the free electrons), while keeping
the damping coefficient $\delta_0$ of this mode larger than zero, to account for the energy dissipation due to
their scattering. Then Eq. (33) is reduced to
$$\varepsilon_{ef}(\omega) = \varepsilon_{opt}(\omega) + \frac{n_0q^2}{m}\,\frac{1}{-\omega^2 - 2i\omega\delta_0} \equiv \varepsilon_{opt}(\omega) + i\,\frac{n_0q^2}{2\delta_0\omega m}\,\frac{1}{1 - i\omega/2\delta_0}, \qquad (7.44)$$
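where the response $\varepsilon_{opt}(\omega)$ at high (in practice, optical) frequencies is still given by Eq. (33), but now
with $\omega_j > 0$. The result (44) allows for a simple interpretation. To show that, let us incorporate into our
calculations the Ohmic conduction of the medium, generalizing Eq. (4.7) as $\mathbf{j}_\omega = \sigma(\omega)\mathbf{E}_\omega$ to account for
the possible frequency dependence of the Ohmic conductivity. Plugging this relation into the Fourier
image of the relevant macroscopic Maxwell equation, $\nabla\times\mathbf{H}_\omega = \mathbf{j}_\omega - i\omega\mathbf{D}_\omega \equiv \mathbf{j}_\omega - i\omega\varepsilon(\omega)\mathbf{E}_\omega$, we get
$$\nabla\times\mathbf{H}_\omega = \left[\sigma(\omega) - i\omega\varepsilon(\omega)\right]\mathbf{E}_\omega. \qquad (7.45)$$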
19 One more convenience of the simple model of a collision-free plasma, which has led us to Eq. (36), is that it
may be readily generalized to the case of an additional strong dc magnetic field B0 (much higher than that of the
wave) applied in the direction n of wave propagation. It is straightforward (and hence left for the reader) to show
that such plasma exhibits the Faraday effect of the polarization plane’s rotation, and hence gives an example of an
anisotropic medium that violates the Lorentz reciprocity relation (6.121).
20 These frequencies are an order of magnitude lower than those used for TV and FM-radio broadcasting.
This relation shows that for a monochromatic wave, the addition of the Ohmic current density $j_\omega$ to the
displacement current density is equivalent to the addition of σ(ω) to –iωε(ω), i.e. to the following
change of the ac electric permittivity:21
$$\varepsilon_{ef}(\omega) = \varepsilon_{opt}(\omega) + i\,\frac{\sigma(\omega)}{\omega}. \qquad (7.46)$$
Now the comparison of Eqs. (44) and (46) shows that they coincide if we take
[Generalized Drude formula]
$$\sigma(\omega) = \frac{n_0q^2\tau}{m}\,\frac{1}{1 - i\omega\tau} = \sigma(0)\,\frac{1}{1 - i\omega\tau}, \qquad (7.47)$$
where the dc conductivity σ(0) is described by the Drude formula (4.13), and the phenomenologically
introduced coefficient $\delta_0$ is associated with 1/2τ. Eq. (47), which is frequently called the generalized (or
“ac”, or “rf”) Drude formula,22 gives a very reasonable (semi-quantitative) description of the ac
conductivity of many metals almost up to optical frequencies.
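The equivalence of the free-carrier term of Eq. (44) and the term iσ(ω)/ω of Eq. (46), with $\delta_0 = 1/2\tau$, may be verified numerically. In the following Python sketch, the carrier density and the relaxation time are hypothetical, copper-like values chosen only for illustration, and the function names are assumptions.

```python
Q_E = 1.602176634e-19     # elementary charge, C
M_E = 9.1093837015e-31    # electron mass, kg

def drude_sigma(omega, n0, tau, q=Q_E, m=M_E):
    """Eq. (47): sigma(omega) = sigma(0) / (1 - i omega tau), sigma(0) = n0 q^2 tau / m."""
    return (n0 * q**2 * tau / m) / (1 - 1j * omega * tau)

def free_carrier_term(omega, n0, delta0, q=Q_E, m=M_E):
    """Second term of Eq. (44): (n0 q^2 / m) / (-omega^2 - 2 i omega delta0)."""
    return (n0 * q**2 / m) / (-omega**2 - 2j * omega * delta0)

N0 = 8.5e28     # hypothetical carrier density, 1/m^3
TAU = 2.5e-14   # hypothetical relaxation time, s

print("sigma(0), S/m:", drude_sigma(0.0, N0, TAU).real)
for omega in (1e12, 1e14, 1e15):
    lhs = free_carrier_term(omega, N0, 1 / (2 * TAU))
    rhs = 1j * drude_sigma(omega, N0, TAU) / omega
    print(f"omega = {omega:.0e}: Eq. (44) term = {lhs:.4e}, i*sigma/omega = {rhs:.4e}")
```

At ωτ << 1, the conductivity is virtually real and equal to σ(0), so the free-carrier contribution to the effective permittivity is almost purely imaginary, i.e. dissipative.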
Now returning to our discussion of the generalized Lorentz model (33), we see that the
frequency dependences of the real (ε’) and imaginary (ε”) parts of the complex permittivity it yields are
not quite independent. For example, let us have one more look at the resonance peaks in Fig. 5. Each
time the real part drops with frequency, dε’/dω < 0, its imaginary part ε” has a positive peak. Ralph
Kronig (in 1926) and Hendrik (“Hans”) Kramers (in 1927) independently showed that this is not an
accidental coincidence pertinent only to this particular model. Moreover, the full knowledge of the
function ε’(ω) enables the calculation of the function ε”(ω), and vice versa. The mathematical reason
for this fact is that both these functions are always related to a single real function G(θ) – see Eqs. (27).
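To derive these relations, let us consider Eq. (26b) on the complex frequency plane, $\boldsymbol{\omega} \equiv \omega' + i\omega''$:
$$f(\boldsymbol{\omega}) \equiv \varepsilon(\boldsymbol{\omega}) - \varepsilon_0 = \int_0^\infty G(\theta)\,e^{i\boldsymbol{\omega}\theta}\,d\theta \equiv \int_0^\infty G(\theta)\,e^{i\omega'\theta}e^{-\omega''\theta}\,d\theta. \qquad (7.48)$$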
For all stable physical systems, G(θ) has to be finite for all important values of the real integration
variable θ (θ > 0), and tend to zero at θ → 0 and θ → ∞. (Indeed, according to Eq. (23), a non-zero G(0)
would mean an instantaneous response of the medium to the external force, while G(∞) ≠ 0 would mean
that it has an infinitely long memory.) Because of that, and thanks to the factor $e^{-\omega''\theta}$, the expression
under the integral in Eq. (48) tends to zero at $|\boldsymbol{\omega}| \to \infty$ in all upper half-plane (ω” ≥ 0). As a result, we
may claim that the complex function $f(\boldsymbol{\omega})$ given by this relation is analytic in that half-plane. This fact
allows us to apply to it the general Cauchy integral formula23
$$f(\boldsymbol{\omega}) = \frac{1}{2\pi i}\oint_C f(\boldsymbol{\Omega})\,\frac{d\boldsymbol{\Omega}}{\boldsymbol{\Omega} - \boldsymbol{\omega}}, \qquad (7.49)$$
21 Alternatively, according to Eq. (45), it is possible (and in the field of infrared spectroscopy, conventional) to
attribute the ac response of a medium, at all frequencies, to its effective complex conductivity:
$\sigma_{ef}(\omega) \equiv \sigma(\omega) - i\omega\varepsilon(\omega) \equiv -i\omega\varepsilon_{ef}(\omega)$.
22 It may be also derived from the Boltzmann kinetic equation in the so-called relaxation-time approximation
(RTA) – see, e.g., SM Sec. 6.2.
23 See, e.g., MA Eq. (15.2).
where $\boldsymbol{\Omega} \equiv \Omega' + i\Omega''$ is also a complex variable. Let us take the integration contour C of the form shown
in Fig. 7, with the radius R of the larger semicircle tending to infinity, and the radius r of the smaller
semicircle (around the singular point Ω = ω) tending to zero. Due to the exponential decay of |f(Ω)| at
|Ω| → ∞, the contribution to the right-hand side of Eq. (49) from the larger semicircle vanishes,24 while
the contribution from the small semicircle, where Ω = ω + r exp{iφ}, with –π ≤ φ ≤ 0, is
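$$\lim_{r\to0}\frac{1}{2\pi i}\int_{\boldsymbol{\Omega} = \omega + r\exp\{i\varphi\}} f(\boldsymbol{\Omega})\,\frac{d\boldsymbol{\Omega}}{\boldsymbol{\Omega} - \omega} = \frac{f(\omega)}{2\pi i}\int_{-\pi}^{0}\frac{ir\exp\{i\varphi\}\,d\varphi}{r\exp\{i\varphi\}} \equiv \frac{f(\omega)}{2\pi}\int_{-\pi}^{0}d\varphi = \frac{1}{2}f(\omega). \qquad (7.50)$$
[Fig. 7.7: the integration contour C on the complex plane [Ω’, Ω”], with the large semicircle of radius R.]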
24 Strictly speaking, this also requires |f(Ω)| to decrease faster than $\Omega^{-1}$ at the real axis (at Ω” = 0), but due to the
inertia of charged particles, this requirement is fulfilled for all realistic models of dispersion – see, e.g., Eq. (36).
25 I am typesetting this symbol in a Roman (upright) font, to avoid any possibility of confusion with the medium’s
polarization.
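$$\varepsilon'(\omega) = \varepsilon_0 + \frac{2}{\pi}\,\mathrm{P}\!\int_0^\infty\varepsilon''(\Omega)\,\frac{\Omega\,d\Omega}{\Omega^2 - \omega^2}, \quad \varepsilon''(\omega) = -\frac{2\omega}{\pi}\,\mathrm{P}\!\int_0^\infty\left[\varepsilon'(\Omega) - \varepsilon_0\right]\frac{d\Omega}{\Omega^2 - \omega^2}, \qquad (7.54)$$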
which is more convenient for most applications, because it involves only physical (positive) frequencies.
Though the Kramers-Kronig relations are “global” in frequency, in certain cases they allow an
approximate calculation of dispersion from experimental data for absorption, collected even within a
limited (“local”) frequency range. Most importantly, if a medium has a sharp absorption peak at some
frequency $\omega_j$, we may describe it as
$$\varepsilon''(\omega) \approx c\,\delta(\omega - \omega_j) + \text{a more smooth function of } \omega, \qquad (7.55)$$
and the first of Eqs. (54) then gives a pole-like contribution to the real part,
$$\varepsilon'(\omega) \approx \varepsilon_0 + \frac{2c}{\pi}\,\frac{\omega_j}{\omega_j^2 - \omega^2} + \text{another smooth function of } \omega,$$
thus predicting the anomalous dispersion near such a point. This calculation shows that such behavior
observed in the Lorentz oscillator model (see Fig. 5) is by no means accidental or model-specific.
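The first of Eqs. (54) may be tested numerically on the single-oscillator Lorentz model (32), for which both ε’ and ε” are known in closed form. The Python sketch below is only an illustration: it works in hypothetical dimensionless units with $nq^2/m = 1$, and it handles the principal value by a standard subtraction trick (an assumption of this sketch, not the method of the text): the value of the numerator at Ω = ω is subtracted under the integral, and the subtracted piece is restored analytically.

```python
import math

def lorentz_f(omega, omega0=1.0, delta=0.1):
    """f = eps - eps0 for one Lorentz oscillator, Eq. (32), in units with n q^2 / m = 1."""
    return 1.0 / ((omega0**2 - omega**2) - 2j * omega * delta)

def eps_imag(omega):
    return lorentz_f(omega).imag

def kk_real_part(omega, imag_part, omega_max=200.0, steps=200_000):
    """First of Eqs. (54): eps'(omega) - eps0 from eps'' at positive frequencies.

    With g(W) = W * eps''(W), the integrand [g(W) - g(omega)] / (W^2 - omega^2) is regular,
    and P int_0^M dW / (W^2 - omega^2) = ln[(M - omega) / (M + omega)] / (2 omega).
    Requires 0 < omega < omega_max; omega must not coincide with a midpoint of the grid.
    """
    g0 = omega * imag_part(omega)
    h = omega_max / steps
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * h
        total += (w * imag_part(w) - g0) / (w**2 - omega**2)
    total *= h
    total += g0 * math.log((omega_max - omega) / (omega_max + omega)) / (2.0 * omega)
    return 2.0 / math.pi * total

results = {}
for omega in (0.5, 1.3):
    results[omega] = (kk_real_part(omega, eps_imag), lorentz_f(omega).real)
    print(f"omega = {omega}: Kramers-Kronig = {results[omega][0]:.5f}, "
          f"exact = {results[omega][1]:.5f}")
```

The two columns should agree to within the accuracy of the numerical integration, on both sides of the resonance, where ε’ – ε0 has opposite signs.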
Let me emphasize again that the Kramers-Kronig relations (53)-(54) are much more general than
the Lorentz model (33), and require only a causal linear relation (21) between the polarization P(t) and
the electric field E(t’).26 Hence, these relations are also valid for the complex functions relating Fourier
images of any cause/effect-related pair of variables. In particular, in a measurement of any linear
response r(t) of any experimental sample to any external field f(t’), whatever the nature of this response
and physics behind it, we may be confident that there is a causal relationship between the variables r and
f, so the corresponding complex function $\chi(\omega) \equiv r_\omega/f_\omega$ does obey the Kramers-Kronig relations.
However, it is still important to remember that a linear relationship between the Fourier amplitudes of
two variables does not necessarily imply a causal relationship between them.27
7.3. Reflection
The most important new effect arising in nonuniform media is wave reflection. Let us start its
discussion from the simplest case of a plane electromagnetic wave that is normally incident on a sharp
interface between two uniform, linear, isotropic media.
Moreover, let us first consider an even simpler sub-case when one of the two media (say, that
located at z > 0, see Fig. 8) cannot sustain any electric field at all – as implied, in particular, by the
macroscopic model of a perfect conductor – see Eq. (2.1):
26 Actually, in mathematics, relations even somewhat more general than Eqs. (53) and valid for an arbitrary
analytic function of complex argument (the Sokhotski-Plemelj theorem) have been known since at least 1868.
27 For example, the function $\chi(\omega) \equiv E_\omega/P_\omega$, in the Lorentz oscillator model, does not obey the Kramers-Kronig
relations. This is evident not only physically, from the fact that E(t) is not a causal function of P(t), but even
mathematically. Indeed, Green’s function describing a causal relationship has to tend to zero at small time delays
t – t’, so its Fourier image has to tend to zero at ω → ±∞. This is certainly true for the function f(ω) given by
Eq. (32), but not for the reciprocal function $\chi(\omega) \equiv 1/f(\omega) \propto (\omega_0^2 - \omega^2) - 2i\omega\delta_0$, which diverges at large
frequencies.
$$E|_{z\ge0} = 0. \qquad (7.57)$$
This condition is evidently incompatible with the single traveling wave (5). However, this solution may
be readily generalized using the fact that the dispersion-free 1D wave equation,
$$\left(\frac{\partial^2}{\partial z^2} - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\right)E = 0, \qquad (7.58)$$
supports waves propagating, with the same speed, in any of two opposite directions. As a result, the
following linear superposition of two such waves,
$$E|_{z\le0} = f(z - vt) - f(-z - vt), \qquad (7.59)$$
satisfies both the equation and the boundary condition (57), for an arbitrary function f. The second term
on the right-hand side of Eq. (59) may be interpreted as a result of total reflection of the incident wave
(described by its first term) – in this particular case, with the change of the electric field’s sign. This
means, in particular, that within the macroscopic model, a conductor acts as a perfect mirror. By the
way, since the vector n of the reflected wave is opposite to that of the incident one (see the arrows in
Fig. 8), Eq. (6) shows that the magnetic field of the wave does not change its sign at the reflection:
$$H|_{z\le0} = \frac{1}{Z}\left[f(z - vt) + f(-z - vt)\right]. \qquad (7.60)$$
Fig. 7.8. A snapshot of the electric field at the reflection of a sinusoidal wave from a perfect conductor: a realistic pattern (red lines) and its macroscopic, ideal-mirror approximation (blue lines). Dashed lines show the snapshots after a half-period time delay (ωΔt = π). [The drawing shows the directions n of the incident and reflected waves along the z-axis, with the interface at z = 0.]
The blue lines in Fig. 8 show the resulting pattern (59) for the simplest, monochromatic wave:
[Wave’s total reflection]
$$E|_{z\le0} = \mathrm{Re}\left[E_\omega e^{i(kz - \omega t)} - E_\omega e^{-i(kz + \omega t)}\right]. \qquad (7.61a)$$
Depending on convenience in a particular context, this pattern may be legitimately represented and
interpreted either as the linear superposition (61a) of two traveling waves or as a single standing wave:
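$$E|_{z\le0} = -2\,\mathrm{Im}\left[E_\omega e^{-i\omega t}\right]\sin kz \equiv 2\,\mathrm{Re}\left[iE_\omega e^{-i\omega t}\right]\sin kz \equiv 2\,\mathrm{Re}\left[E_\omega e^{-i(\omega t - \pi/2)}\right]\sin kz, \qquad (7.61b)$$
in which the electric and magnetic fields oscillate with the phase shifts by π/2 both in time and space: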
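$$H|_{z\le0} = \mathrm{Re}\left[\frac{E_\omega}{Z}e^{i(kz - \omega t)} + \frac{E_\omega}{Z}e^{-i(kz + \omega t)}\right] = 2\,\mathrm{Re}\left[\frac{E_\omega}{Z}e^{-i\omega t}\right]\cos kz. \qquad (7.62)$$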
As a result of this shift, the time average of the Poynting vector’s magnitude,
$$S(z, t) \equiv EH = -\frac{1}{Z}\,\mathrm{Im}\left[E_\omega^2e^{-2i\omega t}\right]\sin 2kz \equiv \frac{1}{Z}\,\mathrm{Re}\left[iE_\omega^2e^{-2i\omega t}\right]\sin 2kz, \qquad (7.63)$$
equals zero, showing that at the total reflection, there is no average power flow. (This is natural because
the perfect mirror can neither transmit the wave nor absorb it.) However, Eq. (63) shows that the
standing wave features local oscillations of energy, transferring it periodically between the
concentrations of the electric and magnetic fields, separated by the distance Δz = π/2k = λ/4.
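The vanishing of the average power flow is easy to confirm numerically. The Python sketch below, with hypothetical values of the complex amplitude, wave number, frequency, and (real) impedance in arbitrary units, builds the fields as the traveling-wave superpositions (61a) and (62), compares their product with the product of the standing-wave forms (61b) and (62), namely –(1/Z) Im[$E_\omega^2$ exp{–2iωt}] sin 2kz, and averages it over one period.

```python
import cmath
import math

# Hypothetical values in arbitrary units, for illustration only
E_AMP, K, OMEGA, Z_IMP = 1.0 + 0.5j, 2.0, 3.0, 1.0

def fields(z, t):
    """Real fields E and H at z <= 0 as the superpositions (61a) and (62)."""
    incident = E_AMP * cmath.exp(1j * (K * z - OMEGA * t))
    reflected = E_AMP * cmath.exp(-1j * (K * z + OMEGA * t))
    return (incident - reflected).real, ((incident + reflected) / Z_IMP).real

def energy_flow(z, t):
    """Product of the standing-wave forms (61b) and (62), for a real impedance Z."""
    return -(E_AMP**2 * cmath.exp(-2j * OMEGA * t)).imag * math.sin(2 * K * z) / Z_IMP

def time_average(z, samples=1000):
    """Average of E*H over one period, using equally spaced time samples."""
    period = 2 * math.pi / OMEGA
    total = 0.0
    for i in range(samples):
        e, h = fields(z, i * period / samples)
        total += e * h
    return total / samples

e, h = fields(-0.3, 0.1)
print("E*H from the traveling waves:", e * h)
print("E*H from the standing wave  :", energy_flow(-0.3, 0.1))
print("time average of E*H         :", time_average(-0.3))
```

The instantaneous product EH oscillates with the frequency 2ω, while its period average is zero up to rounding errors.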
In the case of the sinusoidal waves, the reflection effects may be readily explored even for the
more general case of dispersive and/or lossy (but still linear) media in which ε(ω) and μ(ω), and hence
the wave vector k(ω) and the wave impedance Z(ω), defined by Eqs. (28), are certain complex functions
of frequency. The “only” new factor we have to account for is that in this case, the reflection may not be
total, so inside the second medium we have to use the traveling-wave solution as well. This factor may
be taken care of by looking for the solution to our boundary problem in the form
[Wave’s partial reflection]
$$E|_{z\le0} = \mathrm{Re}\left[E_\omega\left(e^{ik_-z} + Re^{-ik_-z}\right)e^{-i\omega t}\right], \quad E|_{z\ge0} = \mathrm{Re}\left[E_\omega Te^{ik_+z}e^{-i\omega t}\right], \qquad (7.64)$$
$$H|_{z\le0} = \mathrm{Re}\left[\frac{E_\omega}{Z_-(\omega)}\left(e^{ik_-z} - Re^{-ik_-z}\right)e^{-i\omega t}\right], \quad H|_{z\ge0} = \mathrm{Re}\left[\frac{E_\omega}{Z_+(\omega)}Te^{ik_+z}e^{-i\omega t}\right]. \qquad (7.65)$$
(The indices + and – correspond to the media located at z > 0 and z < 0, respectively.) Please note the
following important features of Eqs. (64)-(65):
(i) They satisfy the Maxwell equations in both media. (Historically, the fact that at z > 0, these
solutions do not include any components proportional to exp{ik–z}, looked surprising and was called the
wave extinction paradox.)
(ii) Due to the problem’s linearity, we could (and did :-) take the complex amplitudes of the
reflected and transmitted wave proportional to that ($E_\omega$) of the incident wave, while scaling them with
dimensionless, generally complex coefficients R and T. As a comparison of Eqs. (64)-(65) with Eqs.
(61)-(62) shows, the total reflection from an ideal mirror corresponds to R = –1 and T = 0.
(iii) Since in our current problem, the incident wave arrives from one side only (from z = –∞),
there is no need to include a term proportional to exp{–ik+z} into Eqs. (64)-(65) – even though this term
is also a legitimate solution of our wave equation. However, we would need to add such a term if the
medium at z > 0 had been nonuniform (e.g., had at least one more interface or any other inhomogeneity),
because the wave reflected from that additional inhomogeneity would be incident on our interface
(located at z = 0) from the right.
(iv) Eqs. (64)-(65) may be used even for the description of the cases when waves cannot
propagate to z > 0, for example, a conductor or a plasma with $\omega_p > \omega$. Indeed, the exponential drop of
the field amplitude at z > 0 in such cases is automatically described by the imaginary part of the wave
number k+ – see Eq. (29).
In order to calculate the coefficients R and T, we need to use boundary conditions at z = 0. Since
in our current case of the normal incidence, the reflection does not change the transverse character of the
partial waves, both vectors E and H remain tangential to the interface plane (in our notation, z = 0).
Reviewing the arguments that have led us, in statics, to the boundary conditions (3.37) and (5.117) for
these components, we see that they remain valid for the time-dependent situation as well,28 so for our
current case of normal incidence, we may write:
$$E|_{z=-0} = E|_{z=+0}, \quad H|_{z=-0} = H|_{z=+0}. \qquad (7.66)$$
Plugging Eqs. (64)-(65) into these conditions, we readily get two equations for the coefficients R and T:
$$1 + R = T, \quad \frac{1}{Z_-}\left(1 - R\right) = \frac{1}{Z_+}T. \qquad (7.67)$$
Solving this simple system of linear equations, we get29
[Reflection and transmission: sharp interface]
$$R = \frac{Z_+ - Z_-}{Z_+ + Z_-}, \quad T = \frac{2Z_+}{Z_+ + Z_-}. \qquad (7.68)$$
These formulas are very important, and much more general than one might think because they
are applicable for virtually any 1D waves – electromagnetic or not, provided that the impedance Z is
defined properly.30 Since in the general case the wave impedances $Z_\pm$ defined by Eq. (28) with the
corresponding indices, are complex functions of frequency, Eqs. (68) show that R and T may have
imaginary parts as well. This fact has important consequences at z < 0, where the reflected wave,
proportional to R, combines (“interferes”) with the incident wave. Indeed, with $R = |R|e^{i\varphi}$ (where
φ ≡ arg R is a real phase shift), the expression in the parentheses in the first of Eqs. (64) becomes
28 For example, the first of Eqs. (66) may be obtained by integrating the full (time-dependent) Maxwell equation
∇×E + ∂B/∂t = 0 over a narrow and long rectangular contour with dimensions l and d (d << l) stretched along the
interface. At the application of the Stokes theorem to this integral, the first term gives $\Delta E_\tau l$ (where $\Delta E_\tau$ is the
difference of the tangential field components on the two sides of the interface), while the contribution
of the second term is proportional to the product ld, so its contribution at d/l → 0 is negligible. The proof of the
second boundary condition is similar – as was already discussed in Sec. 6.2.
29 Please note that only the media impedances (rather than their wave velocities) are important for reflection in
this case! Unfortunately, this fact is not clearly emphasized in some textbooks that discuss only the case $\mu_\pm = \mu_0$,
when $Z = (\mu_0/\varepsilon)^{1/2}$ and $v = 1/(\mu_0\varepsilon)^{1/2}$ are proportional to each other.
30 See, e.g., the discussion of elastic waves of mechanical deformations in CM Secs. 6.3, 6.4, 7.7, and 7.8.
9). From the results of such a measurement, it is straightforward to find both $|R|$ and $\delta_-$, and hence
restore the complex R, and then use Eq. (68) to calculate both the modulus and the argument of $Z_+$.
(Before computers became ubiquitous, a specially lined paper, called the Smith chart, was
frequently used for performing this recalculation graphically; even nowadays, it is still used for result
presentation.)
Now let us discuss what these results give for waves incident from the free space ($Z_-(\omega) = Z_0$ =
const, $k_- = k_0 = \omega/c$) onto the surfaces of two particular, important media.
(i) For a collision-free plasma (with negligible magnetization) we may use Eq. (36) with $\mu(\omega) = \mu_0$,
to represent the impedance (28) in either of two equivalent forms:
$$Z_+ = Z_0\frac{\omega}{(\omega^2 - \omega_p^2)^{1/2}} = -iZ_0\frac{\omega}{(\omega_p^2 - \omega^2)^{1/2}}. \tag{7.70}$$
The first of these forms is more convenient in the case $\omega > \omega_p$, when the wave vector $k_+$ and the wave
impedance $Z_+$ of the plasma are real, so a part of the incident wave does propagate into it. Plugging this
expression into the latter of Eqs. (68), we see that T is real as well:
$$T = \frac{2\omega}{\omega + (\omega^2 - \omega_p^2)^{1/2}}. \tag{7.71}$$
Note that according to this formula, and somewhat counter-intuitively, T > 1 for any frequency (above
$\omega_p$), inviting the question: how can the transmitted wave be more intense than the incident one that has
induced it? To answer this question, we need to compare the powers (rather than the electric field
amplitudes) of these two waves, i.e. their average Poynting vectors (42):
$$\bar{S}_{\rm incident} = \frac{|E_\omega|^2}{2Z_0}, \qquad \bar{S}_+ = \frac{|TE_\omega|^2}{2Z_+} = \frac{|E_\omega|^2}{2Z_0}\,\frac{4\omega(\omega^2 - \omega_p^2)^{1/2}}{\left[\omega + (\omega^2 - \omega_p^2)^{1/2}\right]^2}. \tag{7.72}$$
The ratio of these two values31 is always below 1 (and tends to zero at $\omega \to \omega_p$), so only a fraction of the
incident wave power may be transmitted. Hence the result T > 1 may be interpreted as follows: an
interface between two media may be an impedance transformer: it can never transmit more power than
the incident wave provides, i.e. can only decrease the product S = EH, but since the ratio Z = E/H
changes at the interface, the amplitude of one of the fields may increase at the transmission.
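To make this impedance-transformer picture more tangible, here is a minimal Python sketch that evaluates the field coefficient (71) and the power ratio following from Eqs. (72) for a few frequencies above $\omega_p$. The function name and the sample frequency ratios are illustrative assumptions.

```python
import math

def plasma_transmission(omega, omega_p):
    """Field coefficient T of Eq. (7.71) and the transmitted power fraction
    (the ratio of the two values in Eq. (7.72)), for omega > omega_p."""
    if omega <= omega_p:
        raise ValueError("this sketch covers only omega > omega_p")
    s = math.sqrt(omega**2 - omega_p**2)
    t_field = 2 * omega / (omega + s)
    t_power = 4 * omega * s / (omega + s) ** 2
    return t_field, t_power

# Frequencies in units of omega_p (illustrative values).
for ratio in (1.01, 1.5, 3.0, 10.0):
    t_field, t_power = plasma_transmission(ratio, 1.0)
    print(f"omega/omega_p = {ratio:5.2f}:  T = {t_field:.4f},  power fraction = {t_power:.4f}")
```

In every row the field coefficient should exceed 1 while the power fraction stays below 1, and the latter should drop toward zero as the frequency approaches $\omega_p$.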
Now let us proceed to the case $\omega < \omega_p$, when the waves cannot propagate in the plasma. In this case,
the second of the expressions (70) is more convenient, because it immediately shows that $Z_+$ is purely
31 This ratio is sometimes also called the “wave transmission coefficient”, but to avoid its confusion with the T
defined by Eq. (64), it is better to call it the power transmission coefficient.
imaginary, while $Z_- = Z_0$ is purely real. This means that $(Z_+ - Z_-) = -(Z_+ + Z_-)^*$, i.e. according to the first of
Eqs. (68), $|R| = 1$, so the reflection is total, i.e. no incident power (on average) is transferred into the
plasma – as was already discussed in Sec. 2. However, the complex R has a finite argument,
$$\varphi \equiv \arg R = 2\arg(Z_+ - Z_0) + \pi = 2\tan^{-1}\frac{\omega}{(\omega_p^2 - \omega^2)^{1/2}} + \pi, \tag{7.73}$$
and hence provides a finite spatial shift (69) of the standing wave toward the plasma surface:
$$\delta_- = \frac{\varphi - \pi}{2k_0} = \frac{c}{\omega}\tan^{-1}\frac{\omega}{(\omega_p^2 - \omega^2)^{1/2}}. \tag{7.74}$$
On the other hand, we already know from Eq. (40) that the solution at z > 0 is exponential, with
the decay length $\delta$ described by Eq. (39). Calculating, from the coefficient T, the exact coefficient
before this exponent, it is straightforward to verify that the electric and magnetic fields are indeed
continuous at the interface, completing the pattern shown with red lines in Fig. 8. This wave penetration
into a fully reflecting material may be experimentally observed, for example, by thinning its sample.
Even without solving this problem exactly, it is evident that if the sample’s thickness d becomes
comparable to $\delta$, a part of the exponential “tail” of the field reaches its second interface, and induces a
propagating wave. This is a classical-electromagnetic analog of the quantum-mechanical tunneling
through a potential barrier.32
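Note that at low frequencies, both $\delta_-$ and $\delta$ tend to the same frequency-independent value,
$$\delta_-,\ \delta \to \frac{c}{\omega_p} = \left(\frac{c^2\varepsilon_0 m_e}{ne^2}\right)^{1/2} = \left(\frac{m_e}{\mu_0 ne^2}\right)^{1/2}, \quad \text{at } \frac{\omega}{\omega_p} \to 0, \tag{7.75}$$
which is just the field penetration depth (6.44) calculated for a perfect conductor model (assuming m =
$m_e$ and $\mu = \mu_0$) in the quasistatic limit. This is natural, because the condition $\omega \ll \omega_p$ may be recast as
$\lambda_0 \equiv 2\pi c/\omega \gg 2\pi c/\omega_p \equiv 2\pi\delta$, i.e. as the quasistatic approximation’s validity condition.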
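The total reflection and the low-frequency limit of the standing-wave shift may be checked numerically with the following minimal Python sketch. It rests on two assumptions that should be compared with Eqs. (69)-(70): that below $\omega_p$ the plasma impedance is $Z_+ = -iZ_0\,\omega/(\omega_p^2 - \omega^2)^{1/2}$ (for the exp{–iωt} convention), and that the shift is defined as $\delta_- = (\varphi - \pi)/2k_0$. The function name and the sample frequencies are illustrative.

```python
import cmath
import math

C = 299_792_458.0  # speed of light, m/s

def plasma_reflection_below_cutoff(omega, omega_p):
    """Return (|R|, delta_minus) for omega < omega_p, using Eq. (7.68) with Z_- = Z0.

    Assumed: Z_+/Z0 = -i*omega/(omega_p^2 - omega^2)^(1/2), delta_- = (phi - pi)/(2*k0)."""
    a = omega / math.sqrt(omega_p**2 - omega**2)   # |Z_+| / Z0
    z_ratio = -1j * a                              # Z_+ / Z0 (assumed sign)
    r = (z_ratio - 1) / (z_ratio + 1)
    phi = cmath.phase(r) % (2 * math.pi)           # arg R reduced to [0, 2*pi)
    delta_minus = (phi - math.pi) / (2 * omega / C)
    return abs(r), delta_minus

omega_p = 1e16   # rad/s, an illustrative plasma frequency
mod_r, delta_minus = plasma_reflection_below_cutoff(1e13, omega_p)
print(mod_r, delta_minus, C / omega_p)
```

For $\omega \ll \omega_p$ the printed shift should be very close to $c/\omega_p$, while the modulus of R should equal 1 within rounding errors.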
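(ii) Now let us consider electromagnetic wave’s reflection from an Ohmic, non-magnetic
conductor. In the simplest low-frequency limit, when $\omega\tau$ is much less than 1, the conductor may be
described by a frequency-independent conductivity $\sigma$.33 According to Eq. (46), in this case we can take
$$Z_+ = \left(\frac{\mu_0}{\varepsilon_{\rm opt}(\omega) + i\sigma/\omega}\right)^{1/2}. \tag{7.76}$$
With this substitution, Eqs. (68) immediately give us all the results of interest. In particular, in the most
important quasistatic limit when $\delta_s \equiv (2/\mu_0\sigma\omega)^{1/2} \ll \lambda_0 \equiv 2\pi c/\omega$, i.e. $\sigma/\omega \gg \varepsilon_0 \sim \varepsilon_{\rm opt}$, the conductor’s
impedance is low:
$$Z_+ \approx \left(\frac{\omega\mu_0}{i\sigma}\right)^{1/2} \equiv \pi\left(\frac{2}{i}\right)^{1/2}\frac{\delta_s}{\lambda_0}\,Z_0, \quad \text{i.e. } \left|\frac{Z_+}{Z_0}\right| \ll 1. \tag{7.77}$$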
This impedance is complex, and hence some fraction f of the incident wave’s power is absorbed by the
conductor. The fraction may be found as the ratio of the dissipated power (either calculated, as was done
above, from Eqs. (68), or just taken from Eq. (6.36), with the magnetic field amplitude $|H_\omega| = 2|E_\omega|/Z_0$
– see Eq. (62)) to the incident wave’s power given by the first of Eqs. (72). The result,
$$f = \frac{2\omega}{c}\,\delta_s \equiv 4\pi\frac{\delta_s}{\lambda_0} \ll 1, \tag{7.78}$$
is used for crude estimates of the energy dissipation in metallic-wall waveguides and resonators. It
shows that to keep the energy losses low, the characteristic size of such systems (which gives a scale of
the free-space wavelengths $\lambda_0$ at which they are used) should be much larger than $\delta_s$. A more detailed
theory of these structures, and of the energy loss in them, will be discussed later in this chapter.
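The estimate (78) may be compared with the direct calculation of the absorbed fraction, 1 – |R|², from Eqs. (68) and (76), for example with the following minimal Python sketch. The function name, the choice $\varepsilon_{\rm opt} = \varepsilon_0$, and the sample parameters (a copper-like conductivity of 6×10^7 S/m at 10 GHz) are illustrative assumptions.

```python
import cmath
import math

MU0 = 4e-7 * math.pi             # vacuum permeability, H/m
EPS0 = 8.8541878128e-12          # vacuum permittivity, F/m
C = 1 / math.sqrt(MU0 * EPS0)    # speed of light, m/s
Z0 = math.sqrt(MU0 / EPS0)       # free-space wave impedance, ohms

def absorbed_fraction(sigma, omega, eps_opt=EPS0):
    """Return (exact, estimate): 1 - |R|^2 from Eqs. (7.68) and (7.76),
    and the quasistatic estimate 4*pi*delta_s/lambda_0 of Eq. (7.78)."""
    z_plus = cmath.sqrt(MU0 / (eps_opt + 1j * sigma / omega))
    r = (z_plus - Z0) / (z_plus + Z0)
    delta_s = math.sqrt(2 / (MU0 * sigma * omega))
    lambda_0 = 2 * math.pi * C / omega
    return 1 - abs(r) ** 2, 4 * math.pi * delta_s / lambda_0

exact, estimate = absorbed_fraction(sigma=6e7, omega=2 * math.pi * 1e10)
print(exact, estimate)
```

In the quasistatic limit the two numbers should nearly coincide; their difference grows as $\sigma/\omega$ approaches $\varepsilon_0$, where Eq. (77) is no longer a good approximation.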
7.4. Refraction
Now let us consider the effects arising at a plane interface between two uniform media when the
wave’s incidence angle $\theta$ (Fig. 10) is arbitrary rather than equal to zero as in our previous analysis, for
the simplest case of fully transparent media, with real $\varepsilon_\pm(\omega)$ and $\mu_\pm(\omega)$. (For the sake of notation
simplicity, in most formulas below, the argument of these functions will be dropped, i.e. just implied.)
[Fig. 7.10: the incident, reflected, and refracted waves at the plane interface z = 0, with the equal x-components $k_-\sin\theta = k'_-\sin\theta' = k_+\sin r$ of their wave vectors.]
In contrast with the case of normal incidence, here the wave vectors k–, k’–, and k+ of the three
component waves (incident, reflected, and transmitted) may have different directions. (Such a change of
the transmitted wave’s direction is called refraction.) Hence let us start our analysis by writing a general
expression for a single plane, monochromatic wave for the case when its wave vector k has all three
Cartesian components, rather than one. An evident generalization of Eq. (11) for this case is
$$f(\mathbf{r}, t) = \mathrm{Re}\left[f_\omega e^{i(k_x x + k_y y + k_z z - \omega t)}\right] \equiv \mathrm{Re}\left[f_\omega e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\right]. \tag{7.79}$$
This expression enables a ready analysis of “kinematic” relations, which are independent of the
media impedances. Indeed, it is sufficient to notice that to satisfy any linear, homogeneous boundary
conditions at the interface (z = 0), all partial plane waves must have the same temporal and spatial
dependence on this plane. Hence if we select the x-z plane so that the vector k– lies in it, then (k–)y = 0,
and k+ and k’– cannot have any y-component either, i.e. all three wave vectors lie in the same plane –
that is selected as the plane of the drawing in Fig. 10. Moreover, for the same reason, their x-components
should be equal:
$$k_-\sin\theta = k'_-\sin\theta' = k_+\sin r. \tag{7.80}$$
From here we immediately get two well-known laws: of reflection
$$\theta' = \theta, \tag{7.81}$$
and of refraction:34
$$\frac{\sin r}{\sin\theta} = \frac{k_-}{k_+}. \tag{7.82}$$
In this form, the laws are valid for plane waves of any nature. In optics, the Snell law (82) is frequently
represented in the form
$$\frac{\sin r}{\sin\theta} = \frac{n_-}{n_+}, \tag{7.83}$$
where $n_\pm$ is the index of refraction (also called the “refractive index”) of the corresponding medium,
defined as its wave number normalized to that of the free space (at the particular wave’s frequency):
$$n_\pm \equiv \frac{k_\pm}{k_0} = \left(\frac{\varepsilon_\pm\mu_\pm}{\varepsilon_0\mu_0}\right)^{1/2}. \tag{7.84}$$
Perhaps the most famous corollary of the Snell law is that if a wave propagates from a medium
with a higher index of refraction to that with a lower one (i.e. if $n_- > n_+$ in Fig. 10), for example from
water to air, there is always a certain critical value $\theta_c$ of the incidence angle,
$$\theta_c = \sin^{-1}\frac{n_+}{n_-} \equiv \sin^{-1}\left(\frac{\varepsilon_+\mu_+}{\varepsilon_-\mu_-}\right)^{1/2}, \tag{7.85}$$
at which the refraction angle r (see Fig. 10 again) reaches $\pi/2$. At a larger $\theta$, i.e. within the range $\theta_c < \theta < \pi/2$,
the boundary conditions (80) cannot be satisfied by a refracted wave with a real wave vector, so
the wave experiences the so-called total internal reflection. This effect is very important for practice
because it means that dielectric surfaces may be used as optical mirrors, in particular in optical fibers –
to be discussed in more detail in Sec. 7 below. This is very fortunate for telecommunication technology
because light’s reflection from metals is rather imperfect. Indeed, according to Eq. (78), in the optical
range ($\lambda_0 \sim 0.5\ \mu$m, i.e. $\omega \sim 10^{15}$ s$^{-1}$), even the best conductors (with $\sigma \sim 6\times10^8$ S/m and hence the
normal skin depth $\delta_s \sim 1.5$ nm) provide power loss of at least a few percent at each reflection.
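The Snell law (83) and the critical angle (85) are easy to explore numerically; here is a minimal Python sketch. The function names and the value n ≈ 1.33 assumed for water are illustrative.

```python
import math

def refraction_angle(theta, n_minus, n_plus):
    """Refraction angle r from the Snell law (7.83), in radians;
    returns None when no real r exists (total internal reflection)."""
    s = n_minus / n_plus * math.sin(theta)
    if abs(s) > 1:
        return None
    return math.asin(s)

def critical_angle(n_minus, n_plus):
    """Critical angle of Eq. (7.85); it exists only for n_minus > n_plus."""
    if n_minus <= n_plus:
        raise ValueError("no critical angle unless n_minus > n_plus")
    return math.asin(n_plus / n_minus)

# Water-to-air interface, with n = 1.33 assumed for water.
theta_c = critical_angle(1.33, 1.0)
print(math.degrees(theta_c))
print(refraction_angle(math.radians(30), 1.33, 1.0))   # below theta_c: a real angle
print(refraction_angle(math.radians(60), 1.33, 1.0))   # above theta_c: None
```

Returning None above the critical angle is just a design choice of this sketch; the field at z > 0 does not vanish in that range, but is no longer a plane wave with a real wave vector.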
Note, however, that even within the range $\theta_c < \theta < \pi/2$, the field at z > 0 is not identically equal
to zero: it penetrates into the lower-n medium by a distance of the order of $\lambda_0$, exponentially decaying
inside it, just as it does at the normal incidence – see Fig. 8. However, at $\theta \neq 0$ the penetrating field still
propagates, with the wave number (80), along the interface. Such a field, exponentially dropping in one
direction but still propagating as a wave in another direction, is commonly called the evanescent wave.
34 The latter relation is traditionally called the Snell law, after a 17th-century astronomer Willebrord Snellius, but
it has been traced all the way back to a circa 984 work by Abu Saad al-Ala ibn Sahl. (Claudius Ptolemy who
performed pioneering experiments on light refraction in the 2nd century AD, was just one step from this result.)
One more remark: just as at the normal incidence, the field’s penetration into another medium
causes a phase shift of the reflected wave – see, e.g., Eq. (69) and its discussion. A new feature of this
phase shift, arising at $\theta \neq 0$, is that it also has a component parallel to the interface – the so-called Goos-
Hänchen effect. In geometric optics, this effect leads to an image shift (relative to its position in a
perfect mirror) with components both normal and parallel to the interface.
Now let us carry out an analysis of “dynamic” relations that determine amplitudes of the
refracted and reflected waves. For this, we need to write explicitly the boundary conditions at the
interface (i.e. the plane z = 0). Since now the electric and/or magnetic fields may have components
normal to the plane, in addition to the continuity of their tangential components, which were repeatedly
discussed above,
$$E_{x,y}\big|_{z=-0} = E_{x,y}\big|_{z=+0}, \qquad H_{x,y}\big|_{z=-0} = H_{x,y}\big|_{z=+0}, \tag{7.86}$$
we also need relations for the normal components. As it follows from the homogeneous macroscopic
Maxwell equations (6.99b), they are also the same as in statics, i.e. $D_n$ = const, and $B_n$ = const, for our
coordinate choice (Fig. 10) giving
$$\varepsilon_- E_z\big|_{z=-0} = \varepsilon_+ E_z\big|_{z=+0}, \qquad \mu_- H_z\big|_{z=-0} = \mu_+ H_z\big|_{z=+0}. \tag{7.87}$$
The expressions of these components via the amplitudes $E_\omega$, $RE_\omega$, and $TE_\omega$ of the incident,
reflected, and transmitted waves depend on the incident wave’s polarization. For example, for a linearly-
polarized wave with the electric field vector normal to the plane of incidence, i.e. parallel to the
interface plane, the reflected and refracted waves are similarly polarized – see Fig. 11a.
Fig. 7.11. Reflection and refraction at two different linear polarizations of the incident wave.
As a result, all $E_z$ are equal to zero (so the first of Eqs. (87) is inconsequential), while the
tangential components of the electric field are equal to their full amplitudes, just as at the normal
incidence, so we still can use Eqs. (64) expressing these components via the coefficients R and T.
However, at $\theta \neq 0$ the magnetic fields have not only tangential components
$$H_x\big|_{z=-0} = \mathrm{Re}\left[\frac{E_\omega}{Z_-}(1 - R)\cos\theta\, e^{-i\omega t}\right], \qquad H_x\big|_{z=+0} = \mathrm{Re}\left[\frac{E_\omega}{Z_+}\,T\cos r\, e^{-i\omega t}\right], \tag{7.88}$$
but also normal components (see Fig. 11a):
$$H_z\big|_{z=-0} = \mathrm{Re}\left[\frac{E_\omega}{Z_-}(1 + R)\sin\theta\, e^{-i\omega t}\right], \qquad H_z\big|_{z=+0} = \mathrm{Re}\left[\frac{E_\omega}{Z_+}\,T\sin r\, e^{-i\omega t}\right]. \tag{7.89}$$
Plugging these expressions into the boundary conditions expressed by Eqs. (86) (in this case, for
the y-components only) and the second of Eqs. (87), we get three equations for two unknown
coefficients R and T. However, two of these equations duplicate each other because of the Snell law, and
we get just two independent equations,
$$1 + R = T, \qquad \frac{1}{Z_-}(1 - R)\cos\theta = \frac{1}{Z_+}\,T\cos r, \tag{7.90}$$
which are a very natural generalization of Eqs. (67), with the replacements $Z_- \to Z_-\cos r$, $Z_+ \to Z_+\cos\theta$.
As a result, we can immediately use Eq. (68) to write the solution of the system (90):35
$$R = \frac{Z_+\cos\theta - Z_-\cos r}{Z_+\cos\theta + Z_-\cos r}, \qquad T = \frac{2Z_+\cos\theta}{Z_+\cos\theta + Z_-\cos r}. \tag{7.91a}$$
If we want to express these coefficients via the angle of incidence alone, we should use the Snell
law (82) to eliminate the angle r, getting frequently quoted bulkier expressions:
$$R = \frac{Z_+\cos\theta - Z_-\left[1 - (k_-/k_+)^2\sin^2\theta\right]^{1/2}}{Z_+\cos\theta + Z_-\left[1 - (k_-/k_+)^2\sin^2\theta\right]^{1/2}}, \qquad T = \frac{2Z_+\cos\theta}{Z_+\cos\theta + Z_-\left[1 - (k_-/k_+)^2\sin^2\theta\right]^{1/2}}. \tag{7.91b}$$
However, conceptually it is preferable to use the kinematic relation (82) and the dynamic relations (91a)
separately, because Eq. (91b) obscures the very important physical fact that the ratio of $k_\pm$, i.e. of the
wave velocities of the two media, is only involved in the Snell law, while Eqs. (91a) explicitly include
only the wave impedances – just as in the case of normal incidence.
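This separation of the kinematic and dynamic relations maps naturally onto code. In the following minimal Python sketch (the function name and the sample parameters are illustrative assumptions, and only real refraction angles are handled), the ratio $k_-/k_+$ is used solely in the Snell law (82), while the impedances enter only Eq. (91a):

```python
import math

def fresnel_e_normal(theta, z_minus, z_plus, k_ratio):
    """R and T of Eq. (7.91a), for E normal to the plane of incidence.

    k_ratio = k_-/k_+ is used only in the Snell law (7.82);
    z_minus and z_plus enter only the dynamic relations."""
    sin_r = k_ratio * math.sin(theta)                 # kinematics
    if abs(sin_r) > 1:
        raise ValueError("beyond the critical angle; this sketch handles real r only")
    cos_t, cos_r = math.cos(theta), math.sqrt(1 - sin_r**2)
    denom = z_plus * cos_t + z_minus * cos_r          # dynamics
    return (z_plus * cos_t - z_minus * cos_r) / denom, 2 * z_plus * cos_t / denom

# Illustrative case: free space onto a non-magnetic medium with n = 1.5,
# so that Z_+/Z_- = k_-/k_+ = 1/1.5 (impedances in units of Z0).
theta = math.radians(40)
r, t = fresnel_e_normal(theta, 1.0, 1 / 1.5, 1 / 1.5)

# The wave traveling in the opposite direction: swap the impedances and theta <-> r.
r_angle = math.asin(math.sin(theta) / 1.5)
r_rev, t_rev = fresnel_e_normal(r_angle, 1 / 1.5, 1.0, 1.5)
print(r, t, r_rev, t_rev)
```

With these outputs, one may verify the first of Eqs. (90), 1 + R = T, and also the Stokes relations R’ = –R and R² + TT’ = 1 for the reversed wave.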
In the opposite case of the linear polarization of the electric field within the plane of incidence
(Fig. 11b), it is the magnetic field that does not have a normal component, so it is now the second of
Eqs. (87) that does not participate in the solution. However, now the electric fields in the two media
have not only tangential components,
$$E_x\big|_{z=-0} = \mathrm{Re}\left[E_\omega(1 + R)\cos\theta\, e^{-i\omega t}\right], \qquad E_x\big|_{z=+0} = \mathrm{Re}\left[E_\omega T\cos r\, e^{-i\omega t}\right], \tag{7.92}$$
As a result, instead of Eqs. (90), the reflection and transmission coefficients are related as
$$(1 + R)\cos\theta = T\cos r, \qquad \frac{1}{Z_-}(1 - R) = \frac{1}{Z_+}\,T. \tag{7.94}$$
Again, the solution of this system may be immediately written using the analogy with Eq. (67):
35 Note that we may calculate the reflection and transmission coefficients R’ and T’ for the wave traveling in the
opposite direction just by making the following parameter swaps: $Z_+ \leftrightarrow Z_-$ and $\theta \leftrightarrow r$, and that the resulting
coefficients satisfy the following Stokes relations: R’ = –R, and $R^2 + TT' = 1$, for any $Z_\pm$.
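$$R = \frac{Z_+\left[1 - (k_-/k_+)^2\sin^2\theta\right]^{1/2} - Z_-\cos\theta}{Z_+\left[1 - (k_-/k_+)^2\sin^2\theta\right]^{1/2} + Z_-\cos\theta}, \qquad T = \frac{2Z_+\cos\theta}{Z_+\left[1 - (k_-/k_+)^2\sin^2\theta\right]^{1/2} + Z_-\cos\theta}. \tag{7.95b}$$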
For the particular case $\mu_+ = \mu_- = \mu_0$, when $Z_+/Z_- = (\varepsilon_-/\varepsilon_+)^{1/2} = k_-/k_+ = n_-/n_+$ (which is
approximately correct for traditional optical media), Eqs. (91b) and (95b) are called the Fresnel
formulas.36 Most textbooks are quick to point out that there is a major difference between them: while
for the electric field polarization within the plane of incidence (Fig. 11b), the reflected wave’s amplitude
(proportional to the coefficient R) turns to zero37 at a special value of $\theta$ (called the Brewster angle):38
$$\theta_B = \tan^{-1}\frac{n_+}{n_-}, \tag{7.96}$$
while there is no such angle in the opposite case (shown in Fig. 11a). However, note that this statement,
as well as Eq. (96), is true only for the case $\mu_+ = \mu_-$. In the general case of different $\varepsilon$ and $\mu$, Eqs. (91)
and (95) show that the reflected wave vanishes at $\theta = \theta_B$ with
$$\tan^2\theta_B = \frac{\varepsilon_-\mu_+ - \varepsilon_+\mu_-}{\varepsilon_+\mu_+ - \varepsilon_-\mu_-} \times \begin{cases} \mu_+/\mu_-, & \text{for } \mathbf{E} \perp \mathbf{n}_z \text{ (Fig. 11a)}, \\ (-\varepsilon_+/\varepsilon_-), & \text{for } \mathbf{H} \perp \mathbf{n}_z \text{ (Fig. 11b)}. \end{cases} \tag{7.97}$$
Note the natural $\varepsilon \leftrightarrow \mu$ symmetry of these relations, resulting from the $\mathbf{E} \leftrightarrow \mathbf{H}$ symmetry for
these two polarization cases (Fig. 11). These formulas also show that for any set of parameters of the
two media (with $\varepsilon_\pm, \mu_\pm > 0$), $\tan^2\theta_B$ is positive (and hence a real Brewster angle $\theta_B$ exists) only for one of
these two polarizations. In particular, if the interface is due to the change of $\mu$ alone (i.e. if $\varepsilon_+ = \varepsilon_-$), the
first of Eqs. (97) is reduced to the simple form (96) again, while for the polarization shown in Fig. 11b,
there is no Brewster angle, i.e. the reflected wave has a non-zero amplitude for any $\theta$.
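These statements may be checked numerically. The following minimal Python sketch assumes Eq. (97) in the form $\tan^2\theta_B = [(\varepsilon_-\mu_+ - \varepsilon_+\mu_-)/(\varepsilon_+\mu_+ - \varepsilon_-\mu_-)] \times \{\mu_+/\mu_-$ for Fig. 11a, $-\varepsilon_+/\varepsilon_-$ for Fig. 11b$\}$, and takes the reflection coefficient from Eq. (91a) for the first polarization, and from the solution of the system (94) for the second one. The function names and the sample parameters (positive, in arbitrary units, with different wave numbers in the two media) are illustrative assumptions.

```python
import math

def brewster_angle(eps_m, mu_m, eps_p, mu_p, e_normal):
    """Brewster angle from the assumed form of Eq. (7.97), or None if tan^2 is not positive.

    e_normal=True:  E normal to the plane of incidence (Fig. 11a);
    e_normal=False: E within the plane of incidence (Fig. 11b)."""
    common = (eps_m * mu_p - eps_p * mu_m) / (eps_p * mu_p - eps_m * mu_m)
    tan2 = common * (mu_p / mu_m if e_normal else -eps_p / eps_m)
    return math.atan(math.sqrt(tan2)) if tan2 > 0 else None

def reflection(theta, eps_m, mu_m, eps_p, mu_p, e_normal):
    """R from Eq. (7.91a), or from solving the system (7.94), with the Snell law (7.82).
    Handles only angles with a real refraction angle r."""
    z_m, z_p = math.sqrt(mu_m / eps_m), math.sqrt(mu_p / eps_p)
    sin_r = math.sqrt(eps_m * mu_m / (eps_p * mu_p)) * math.sin(theta)
    cos_t, cos_r = math.cos(theta), math.sqrt(1 - sin_r**2)
    if e_normal:
        return (z_p * cos_t - z_m * cos_r) / (z_p * cos_t + z_m * cos_r)
    return (z_p * cos_r - z_m * cos_t) / (z_p * cos_r + z_m * cos_t)

# Change of epsilon alone: the Brewster angle exists only for E within the plane of incidence.
theta_b = brewster_angle(1.0, 1.0, 2.25, 1.0, e_normal=False)
print(math.degrees(theta_b), reflection(theta_b, 1.0, 1.0, 2.25, 1.0, e_normal=False))
print(brewster_angle(1.0, 1.0, 2.25, 1.0, e_normal=True))    # None

# Change of mu alone: the roles of the two polarizations are swapped.
print(brewster_angle(1.0, 1.0, 1.0, 2.25, e_normal=True),
      brewster_angle(1.0, 1.0, 1.0, 2.25, e_normal=False))
```

For each parameter set, exactly one of the two polarizations should give a real angle, at which the corresponding R vanishes within rounding errors.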
Such an account of both media parameters, $\varepsilon$ and $\mu$, on an equal footing is necessary to describe
several interesting effects. The first of them is the so-called negative refraction.39 As was shown in Sec.
36 Named after Augustin-Jean Fresnel (1788-1827), one of the wave optics pioneers, who is credited, among
many other contributions (see, in particular, discussions in Ch. 8), for the concept of light as a purely transverse
wave.
37 This effect is used in practice to obtain linearly polarized light, with the electric field vector perpendicular to
the plane of incidence, from the natural light with its random polarization. An even more widespread application
of the effect is a partial reduction of undesirable glare from wet pavement (for the water/air interface, $n_+/n_- \approx 1.33$,
giving $\theta_B \approx 50^\circ$) by covering glasses and car headlights with thin vertically-polarizing layers.
38 A very simple interpretation of Eq. (96) is based on the fact that, together with the Snell law (82), it gives
$r + \theta = \pi/2$. As a result, the vector E+ is parallel to the vector k’–, and hence the oscillating electric dipoles of the
medium do not have the Cartesian component that could induce the transverse electric field E’– of the potential
reflected wave.
39 Despite some important background theoretical work by A. Schuster (1904), L. Mandelstam (1945), D.
Sivikhin (1957), and especially V. Veselago (1966-67), the negative refractivity effects became a subject of
intensive scientific research and engineering development only in the 2000s.
2, in a medium with electric-field-driven resonances, the function $\varepsilon(\omega)$ may be almost real and negative,
at least within limited frequency intervals – see, in particular, Eq. (34) and Fig. 5. As has already been
discussed, if, at these frequencies, the function $\mu(\omega)$ is real and positive, then $k^2(\omega) = \omega^2\varepsilon(\omega)\mu(\omega) < 0$,
and k may be represented as $i/\delta$ with a real $\delta$, meaning the exponential field decay into the medium.
However, let us consider the case when both $\varepsilon(\omega) < 0$ and $\mu(\omega) < 0$ at a certain frequency. (This is
possible in a medium with both E-driven and H-driven resonances, at a proper choice of their resonance
frequencies.) Since in this case $k^2(\omega) = \omega^2\varepsilon(\omega)\mu(\omega) > 0$, the wave vector is real, Eq. (79) describes a
traveling wave, and one could think that there is nothing new in this case. Not so!
First of all, for a sinusoidal plane wave (79), the operator $\nabla$ is equivalent to the multiplication by
$i\mathbf{k}$. As the Maxwell equations (2a) show, this means that at a fixed direction of vectors E and k, the
simultaneous reversal of signs of $\varepsilon$ and $\mu$ means the reversal of the direction of the vector H. Namely, if
both $\varepsilon$ and $\mu$ are positive, these equations are satisfied with mutually orthogonal vectors {E, H, k}
forming the usual, right-hand system (see Fig. 1 and Fig. 12a), the name stemming from the popular
“right-hand rule” used to determine the vector product’s direction. However, if both $\varepsilon$ and $\mu$ are
negative, the vectors form a left-hand system – see Fig. 12b. (Due to this fact, the media with $\varepsilon < 0$ and
$\mu < 0$ are frequently called the left-handed materials, LHM for short.) According to the basic relation
(6.114), which does not involve media parameters, this means that for a plane wave in a left-hand
material, the Poynting vector $\mathbf{S} = \mathbf{E}\times\mathbf{H}$, i.e. the energy flow, is directed opposite to the wave vector k.
Fig. 7.12. Directions of the main vectors of a plane wave inside a medium with (a) positive and (b) negative values of $\varepsilon$ and $\mu$.
This fact may look strange but is in no contradiction with any fundamental principle. Let me
remind you that, according to the definition of the vector k, its direction shows the direction of the phase
velocity $v_{\rm ph} = \omega/k$ of a sinusoidal (and hence infinitely long) wave, which cannot be used, for example,
for signaling. Such signaling (by sending wave packets – see Fig. 13) is possible only with the group
velocity $v_{\rm gr} = d\omega/dk$. This velocity in left-hand materials is always directed (as in the right-hand
materials) along the vector S, i.e. along the wave’s energy flow.
Perhaps the most fascinating effect possible with left-hand materials is the wave refraction at
their interfaces with the usual, right-handed materials – first predicted by V. Veselago in 1960. Consider
the example shown in Fig. 14a. In the incident wave, arriving from a usual material, the directions of
the vectors k– and S– coincide, and so they are in the reflected wave with vectors k’– and S’–. This
means that the electric and magnetic fields in the interface plane (z = 0) are, at our choice of the
coordinate axes, proportional to $\exp\{ik_x x\}$, with a positive component $k_x = k_-\sin\theta$ (see Eq. (80)). To satisfy any linear
boundary conditions, the refracted wave, propagating into the left-handed material, has to match that
dependence, i.e. have a positive x-component of its wave vector k+. But in this medium, this vector has
to be antiparallel to the vector S, which in turn should be directed out of the interface, because it
represents the power flow from the interface into the material’s bulk. These conditions cannot be
reconciled by the refracted wave propagating along the usual Snell-law direction (shown with the
dashed line in Fig. 14a), but are all satisfied at refraction in the direction given by Snell’s angle with the
opposite sign. (Hence the term “negative refraction”).40
Fig. 7.14. Negative refraction: (a) waves at the interface between media with positive and negative values
of $\varepsilon$ and $\mu$, and (b) the hypothetical perfect lens: a parallel plate made of a material with $\varepsilon = -\varepsilon_0$ and $\mu = -\mu_0$.
In order to understand how unusual the results of the negative refraction may be, let us consider
a parallel slab of thickness d, made of a hypothetical left-handed material with exactly selected values
$\varepsilon = -\varepsilon_0$, and $\mu = -\mu_0$ (see Fig. 14b). For such a material, placed in free space, the refraction angle $r = -\theta$,
so the rays from a point source, located in free space at a distance a < d from the slab’s surface,
propagate as shown on that panel, i.e. all meet again at the distance a beyond the surface, and then
continue to propagate to the second surface of the slab. Repeating our discussion for this surface, we see
that a point’s image is also formed beyond the slab, at distance 2a + 2b = 2a + 2(d – a) = 2d from the
object.
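This ray picture is simple enough to be verified by a few lines of code. Here is a minimal Python sketch of ray tracing through such a slab, assuming $r = -\theta$ at both of its surfaces; the function name, the coordinate choice (z counted from the first surface, the source at z = –a on the axis), and the sample numbers are illustrative assumptions.

```python
import math

def trace_ray(theta, a, d):
    """Axis crossings of a ray leaving an on-axis point source at distance a < d
    in front of a slab of thickness d, with r = -theta at both surfaces.

    Returns (z_inside, z_behind), counted from the first surface; theta must be nonzero."""
    t = math.tan(theta)
    x1 = a * t                  # transverse position at the first surface (z = 0)
    z_inside = x1 / t           # inside the slab the ray's slope is -tan(theta)
    x2 = x1 - d * t             # transverse position at the second surface (z = d)
    z_behind = d - x2 / t       # behind the slab the slope is +tan(theta) again
    return z_inside, z_behind

a, d = 0.3, 1.0                 # illustrative values, in arbitrary units
for deg in (5, 20, 45, 70):
    z_inside, z_behind = trace_ray(math.radians(deg), a, d)
    print(deg, z_inside, z_behind + a)   # the last number is the object-to-image distance
```

For every launch angle, the first crossing should be at the distance a behind the first surface, and the second one at the distance 2d from the object, so all rays from the point source meet again at the same two points.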
Superficially, this system looks like a usual lens, but the well-known lens formula, which relates
a and b with the focal length f, is not satisfied. (In particular, a parallel beam is not focused into a point
at any finite distance.) As an additional difference from the usual lens, the system shown in Fig. 14b
does not reflect any part of the incident light. Indeed, it is straightforward to check that for all the above
formulas for R and T to be valid, the sign of the wave impedance Z in left-handed materials has to be
kept positive. Thus, for our particular choice of parameters ($\varepsilon = -\varepsilon_0$, $\mu = -\mu_0$), Eqs. (91a) and (95a) are
40 In some publications inspired by this fact, the left-hand materials are prescribed a negative index of refraction
n. However, this prescription should be treated with care. For example, it complies with the first form of Eq. (84),
but not its second form, and the sign of n, in contrast to that of the wave vector k, is a matter of convention.
valid with $Z_+ = Z_- = Z_0$ and $\cos r = \cos\theta$, giving R = 0 for any linear polarization, and hence for any
other wave polarization – circular, elliptic, natural, etc.
The perfect lens suggestion has triggered a wave of efforts to implement left-hand materials
experimentally. (Attempts to find such materials in nature have failed so far.) Most progress in this
direction has been achieved using the so-called metamaterials, which are essentially quasi-periodic
arrays of specially designed electromagnetic resonators, ideally with high density $n \gg \lambda^{-3}$. For example,
Fig. 15 shows the metamaterial that was used for the first demonstration of negative refractivity in the
microwave region – for ~10-GHz waves.41 It combines straight strips of a metallic film, working as
lumped resonators with a large electric dipole moment (and hence strongly coupled to the wave’s
electric field E), and several almost-closed film loops (so-called split rings), working as lumped
resonators with large magnetic dipole moments, strongly coupled to the field H. The negative
refractivity is achieved by designing the resonance frequencies close to each other. More recently,
metamaterials with negative refractivity were demonstrated in the optical range as well,42 although to
the best of my knowledge, their relatively large absorption still prevents practical applications.
This progress has stimulated the development of other potential uses of metamaterials (not
necessarily the left-handed ones), in particular, designs of nonuniform systems with handcrafted
distributions $\varepsilon(\mathbf{r}, \omega)$ and $\mu(\mathbf{r}, \omega)$ that may provide electromagnetic wave propagation along the desired
paths, e.g., around a certain region of space, making it virtually invisible for an external observer – so
far, within very limited frequency ranges.43
As was mentioned in Sec. 5.5, another way to reach negative values of $\mu(\omega)$ is to place a
ferromagnetic material into such an external dc magnetic field that the frequency $\omega_r$ of the ferromagnetic
resonance is somewhat lower than $\omega$. If thin layers of such a material (e.g., nickel) are interleaved with
layers of a non-magnetic good conductor (such as copper), the average value of $\mu(\omega)$ of the resulting
metamaterial may be positive but substantially below $\mu_0$. According to Eq. (6.33), the skin-depth $\delta_s$ of
such a material may be larger than that of the good conductor alone, enforcing a more uniform
distribution of the ac current flowing along the layers, and hence making the energy losses lower than in
the good conductor alone. This effect may be useful, in particular, for electronic circuit interconnects.44
41 R. Shelby et al., Science 292, 77 (2001); J. Wilson and Z. Schwartz, Appl. Phys. Lett. 86, 021113 (2005).
42 See, e.g., J. Valentine et al., Nature 455, 376 (2008).
43 For a review of such “invisibility cloaks”, see, e.g., B. Wood, Comptes Rendus Physique 10, 379 (2009).
44 See, for example, N. Sato et al., J. Appl. Phys. 111, 07A501 (2012), and references therein.
$$\mathbf{E}(\mathbf{r}, t) = \mathrm{Re}\left[\mathbf{E}_\omega(x, y)\, e^{i(k_z z - \omega t)}\right], \qquad \mathbf{H}(\mathbf{r}, t) = \mathrm{Re}\left[\mathbf{H}_\omega(x, y)\, e^{i(k_z z - \omega t)}\right], \tag{7.98}$$
with real kz, where the z-axis is directed along the transmission line – see Fig. 16. Note that this form
allows a substantial coordinate dependence of the electric and magnetic field within the plane [x, y] of
the transmission line’s cross-section, as well as nonvanishing longitudinal components Ez and/or Hz of
the fields, so the solution (98) is substantially more general than the plane waves discussed above. We
will see in a minute that as a result, the parameter $k_z$ may be very much different from its plane-wave
value (13), $k \equiv \omega(\varepsilon\mu)^{1/2}$, in the same material, at the same frequency.
In order to describe these effects quantitatively, let us decompose the complex amplitudes of the
wave’s fields into their longitudinal and transverse components (Fig. 16):46
$$\mathbf{E}_\omega = E_z\mathbf{n}_z + \mathbf{E}_t, \qquad \mathbf{H}_\omega = H_z\mathbf{n}_z + \mathbf{H}_t. \tag{7.99}$$
45 Another popular term is the waveguide, but it is typically reserved for the transmission lines with singly-
connected cross-sections, to be analyzed in the next section. The first structure for guiding waves was proposed
by J. J. Thomson in 1893, and experimentally tested by O. Lodge in 1894.
46 For the notation simplicity, I am dropping index $\omega$ in the complex amplitudes of the field components, and also
have dropped the argument $\omega$ in $k_z$ and Z, even though these parameters may depend on the wave’s frequency
rather substantially – see below.
Plugging Eqs. (98)-(99) into the source-free Maxwell equations (2), and requiring the longitudinal and
transverse components to be balanced separately, we get
$$\begin{aligned}
ik_z\mathbf{n}_z\times\mathbf{E}_t - i\omega\mu\mathbf{H}_t &= -\nabla_t\times(E_z\mathbf{n}_z), & ik_z\mathbf{n}_z\times\mathbf{H}_t + i\omega\varepsilon\mathbf{E}_t &= -\nabla_t\times(H_z\mathbf{n}_z),\\
\nabla_t\times\mathbf{E}_t &= i\omega\mu H_z\mathbf{n}_z, & \nabla_t\times\mathbf{H}_t &= -i\omega\varepsilon E_z\mathbf{n}_z,\\
\nabla_t\cdot\mathbf{E}_t &= -ik_zE_z, & \nabla_t\cdot\mathbf{H}_t &= -ik_zH_z,
\end{aligned} \tag{7.100}$$
where $\nabla_t$ is the 2D del operator acting in the transverse plane [x, y] only, i.e. the usual $\nabla$, but with $\partial/\partial z = 0$.
The system (100) looks even bulkier than the original equations (2), but it is much simpler for
analysis. Indeed, by eliminating the transverse components from these equations (or, even simpler, just
by plugging Eq. (99) into Eqs. (3) and keeping only their z-components), we get a pair of self-consistent
2D Helmholtz equations for the longitudinal components $E_z$ and $H_z$ of the fields,47
$$\left(\nabla_t^2 + k_t^2\right)E_z = 0, \qquad \left(\nabla_t^2 + k_t^2\right)H_z = 0, \tag{7.101}$$
After the distributions Ez(x,y) and Hz(x,y) have been found from these equations, they provide right-hand
sides for the rather simple, closed system of equations (100) for the transverse components of field
vectors. Moreover, as we will see below, each of the following three types of solutions:
(i) with $E_z = 0$ and $H_z = 0$ (called the transverse electromagnetic, or TEM waves),
(ii) with $E_z = 0$, but $H_z \neq 0$ (called either the TE waves or, more frequently, H-modes), and
(iii) with $E_z \neq 0$, but $H_z = 0$ (the so-called TM waves or E-modes),
has its own dispersion law and hence its own wave propagation velocity; as a result, these modes (i.e.
the field distribution patterns) may be considered separately.
In the balance of this section, we will focus on the simplest, TEM waves (i), with no longitudinal
components of either field. For them, the top two equations of the system (100) immediately give Eqs.
(6) and (13), and $k_z = k$. In plain English, this means that E = Et and H = Ht are proportional to each
other and are mutually perpendicular (just as in the plane wave) at each point of the cross-section and
that the TEM wave’s impedance $Z \equiv E/H$ and dispersion law $\omega(k)$, and hence the propagation speed, are
the same as in a plane wave in the same material. In particular, if $\varepsilon$ and $\mu$ are frequency-independent
within a certain frequency range, the dispersion law within this range is linear, $\omega = k/(\varepsilon\mu)^{1/2}$, and the
wave’s speed does not depend on its frequency. For practical applications to telecommunications, this is
a very important advantage of the TEM waves over their TM and TE counterparts – to be discussed in
the next sections.
Unfortunately for practice, such waves cannot propagate in every transmission line. To show
this, let us have a look at the two last lines of Eqs. (100). For the TEM waves (Ez = 0, Hz = 0, kz = k),
they are reduced to merely
47 The wave equation represented in the form (101), even with the 3D Laplace operator, is called the Helmholtz
equation, named after Hermann von Helmholtz (1821-1894) – the mentor of H. Hertz and M. Planck, among
many others.
$$\begin{aligned}
\nabla_t\times\mathbf{E}_t &= 0, & \nabla_t\times\mathbf{H}_t &= 0,\\
\nabla_t\cdot\mathbf{E}_t &= 0, & \nabla_t\cdot\mathbf{H}_t &= 0.
\end{aligned} \tag{7.103}$$
Within the coarse-grain description of the conducting walls of the line (i.e., neglecting not only the
screening depth but also the skin depth in comparison with the cross-section dimensions), we have to
require that inside them, E = H = 0. Close to a wall but outside it, the normal component En of the
electric field may be different from zero, because surface charges may sustain its jump – see Sec. 2.1, in
particular Eq. (2.3). Similarly, the tangential component Hτ of the magnetic field may have a finite jump
at the surface due to skin currents – see Sec. 6.3, in particular Eq. (6.38). However, the tangential
component of the electric field and the normal component of the magnetic field cannot experience such
jumps, and to have them equal to zero inside the walls they have to equal zero just outside the walls as
well:
$$E_\tau = 0, \quad H_n = 0. \tag{7.104}$$
But the left columns of Eqs. (103)-(104) coincide with the formulation of the 2D boundary
problem of electrostatics for the electric field induced by electric charges of the conducting walls, with
the only difference that in our current case, the value of ε actually means ε(ω). Similarly, the right
columns of those relations coincide with the formulation of the 2D boundary problem of magnetostatics
for the magnetic field induced by currents in the walls, with μ → μ(ω), and with the only difference that in
our current coarse-grain approximation, the magnetic fields cannot penetrate into the conductors.
Now we immediately see that in waveguides with a singly-connected wall, for example, a hollow
conducting tube (see, e.g., Fig. 16), the TEM waves are impossible, because there is no way to create a
non-zero electrostatic field inside a conductor with such cross-section. However, such fields (and hence
the TEM waves) are possible in structures with cross-sections consisting of two or more disconnected
(galvanically-insulated) parts – see, e.g., Fig. 17.
Fig. 7.17. An example of the cross-section of a transmission line that may support the TEM wave propagation. (Labels in the figure: the transverse fields Et and Ht, and the contour C.)
In order to derive “global” relations for such a transmission line, let us consider the contour C
drawn very close to the surface of one of its conductors – see, e.g., the red dashed line in Fig. 17. We
can consider it, on one hand, as the cross-section of a cylindrically-shaped Gaussian volume of a certain
elementary length dz << 2π/k. Using the generalized Gauss law (3.34), we get
$$\oint_C (E_t)_n \, dr = \frac{\lambda_\omega}{\varepsilon}, \tag{7.105}$$
where λω (not to be confused with the wavelength λ!) is the complex amplitude of the linear density of
the electric charge of the conductor. On the other hand, the same contour C may be used in the
generalized Ampère law (5.116) to write
$$\oint_C (H_t)_\tau \, dr = I_\omega, \tag{7.106}$$
where Iω is the total current flowing along the conductor (or rather its complex amplitude). But, as was
mentioned above, in the TEM wave the ratio Et/Ht of the field components participating in these two
integrals is constant and equal to Z = (μ/ε)^{1/2}, so Eqs. (105)-(106) give the following simple relation
between the “global” variables of the conductor:
$$I_\omega = \frac{\lambda_\omega/\varepsilon}{Z} = \frac{\lambda_\omega}{(\varepsilon\mu)^{1/2}} \equiv \frac{\omega}{k}\lambda_\omega. \tag{7.107}$$
This important relation may be also obtained in a different way; let me describe it as well,
because (as we will see below) it has an independent heuristic value. Let us consider a small segment dz
<< λ = 2π/k of the line’s conductor, and apply the electric charge conservation law (4.1) to the instant
values of the linear charge density λ and current I. The cancellation of dz in both parts yields
$$\frac{\partial \lambda(z,t)}{\partial t} = -\frac{\partial I(z,t)}{\partial z}. \tag{7.108}$$
If we accept the sinusoidal waveform, exp{i(kz – ωt)}, for both these variables, we immediately recover
Eq. (107) for their complex amplitudes, showing that this relation expresses just the charge continuity
law.
The global equation (108) may be made more specific in the case when the frequency
dependence of ε and μ is negligible, and the transmission line consists of just two isolated conductors –
see, e.g., Fig. 17. In this case, to have the wave localized in the space near the two conductors, we need
a sufficiently fast decrease of its electric field at large distances. For that, their linear charge densities for
each value of z should be equal and opposite, and we can simply relate them to the potential difference V
between the conductors:
$$\frac{\lambda(z,t)}{V(z,t)} = C_0, \tag{7.109}$$
where C0 is the mutual capacitance of the conductors (per unit length) – which was repeatedly discussed
in Chapter 2. Then Eq. (108) takes the following form:
$$C_0 \frac{\partial V(z,t)}{\partial t} = -\frac{\partial I(z,t)}{\partial z}. \tag{7.110}$$
Next, let us consider the contour shown with the red dashed line in Fig. 18 (which shows a
different cross-section of the transmission line – by a plane containing the wave propagation axis z), and
apply to it the Faraday induction law (6.3).
Fig. 7.18. Electric current, magnetic flux, and voltage in a two-conductor transmission line. (Labels in the figure: I(z,t) and I + (∂I/∂z)dz; V(z,t) and V + (∂V/∂z)dz; the flux dΦ = L0 I(z,t) dz through a contour of length dz.)
Since, in the coarse-grain approximation, the electric field inside the conductors (in Fig. 18, on
the horizontal segments of the contour) vanishes, the total e.m.f. equals the difference of the voltages V
at the ends of the segment dz. Since the only sources of the magnetic flux through the area limited by the
contour are the (equal and opposite) currents I in the conductors, we can use Eq. (5.70) to express the
flux. As a result, by canceling dz in both parts of the equation, we get
$$L_0 \frac{\partial I(z,t)}{\partial t} = -\frac{\partial V(z,t)}{\partial z}, \tag{7.111}$$
where L0 is the mutual inductance of the conductors per unit length. The only difference between this L0
and the dc mutual inductances discussed in Chapter 5 is that at the high frequencies we are analyzing
now, L0 should be calculated neglecting the magnetic field penetration into the conductors. (In the dc
case, we had the same situation for superconductor electrodes within their coarse-grain, ideal-diamagnet
description.)
The system of Eqs. (110) and (111) is frequently called the telegrapher’s equations. Combined,
they give for any “global” variable f (either V, or I, or λ) the usual 1D wave equation,
$$\frac{\partial^2 f}{\partial z^2} - L_0 C_0 \frac{\partial^2 f}{\partial t^2} = 0, \tag{7.112}$$
which describes dispersion-free TEM wave’s propagation. Again, this equation is only valid within the
frequency range where the frequency dependence of both ε and μ is negligible. If this is not so, the
global approach may still be used for sinusoidal waves f = Re[fω exp{i(kz – ωt)}]. Repeating the above
arguments, instead of Eqs. (110)-(111) we get a more general system of two algebraic equations
$$\omega C_0 V_\omega = k I_\omega, \quad \omega L_0 I_\omega = k V_\omega, \tag{7.113}$$
in which L0 and C0 may now depend on frequency. These equations are consistent only if
$$L_0 C_0 = \frac{k^2}{\omega^2} \equiv \frac{1}{v^2} \equiv \varepsilon\mu. \tag{7.114}$$
Besides the fact we have already known (that the TEM wave’s speed is the same as that of the plane
wave), Eq. (114) gives us the result that I confess was not emphasized enough in Chapter 5: the product
L0C0 does not depend on the shape or size of the line’s cross-section (provided that the magnetic field’s
penetration into the conductors is negligible). Hence, if we have calculated the mutual capacitance C0 of
a system of two cylindrical conductors, the result immediately gives us their mutual inductance: L0 =
εμ/C0. This relationship stems from the fact that both the electric and magnetic fields may be expressed
via the solution of the same 2D Laplace equation for the system’s cross-section.
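As a quick numerical illustration of the telegrapher’s equations (110)-(111), the sketch below checks by finite differences that a pulse of arbitrary shape traveling with the speed v = 1/(L0C0)^{1/2}, with the current taken as V multiplied by (C0/L0)^{1/2}, satisfies both equations. The values of L0 and C0, the Gaussian pulse shape, and all function names are assumptions made only for this example.

```python
import math

# Illustrative per-unit-length parameters (assumed values, not taken from the text).
L0 = 2.5e-7   # inductance per unit length, H/m
C0 = 1.0e-10  # capacitance per unit length, F/m

v = 1.0 / math.sqrt(L0 * C0)   # wave speed implied by the wave equation (112)
Z_W = math.sqrt(L0 / C0)       # V/I ratio for a wave traveling in the +z direction


def pulse(u, width=0.5):
    """Arbitrary smooth waveform f(u); a Gaussian is used for convenience."""
    return math.exp(-(u / width) ** 2)


def V(z, t):
    """Voltage of a pulse traveling in the +z direction."""
    return pulse(z - v * t)


def I(z, t):
    """Current accompanying the traveling voltage pulse."""
    return V(z, t) / Z_W


def d_dz(f, z, t, h=1e-4):
    """Central finite difference in z."""
    return (f(z + h, t) - f(z - h, t)) / (2 * h)


def d_dt(f, z, t, h=1e-12):
    """Central finite difference in t."""
    return (f(z, t + h) - f(z, t - h)) / (2 * h)


def residuals(z, t):
    """Left-hand side minus right-hand side of Eqs. (110) and (111)."""
    r110 = C0 * d_dt(V, z, t) + d_dz(I, z, t)
    r111 = L0 * d_dt(I, z, t) + d_dz(V, z, t)
    return r110, r111


r110, r111 = residuals(0.3, 1e-9)
print(f"v = {v:.3e} m/s, sqrt(L0/C0) = {Z_W:.1f} ohm")
print(f"residuals of Eqs. (110), (111): {r110:.2e}, {r111:.2e}")
```

Both residuals should be small in comparison with the individual terms of the equations, up to the truncation error of the finite differences.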
With Eq. (114) satisfied, any of Eqs. (113) gives the same result for the following ratio:
$$Z_W \equiv \frac{V_\omega}{I_\omega} = \left(\frac{L_0}{C_0}\right)^{1/2}, \tag{7.115}$$
which is called the transmission line’s impedance. This parameter has the same dimensionality (in SI
units – ohms, denoted Ω) as the wave impedance (7),
$$Z \equiv \frac{E_\omega}{H_\omega} = \left(\frac{\mu}{\varepsilon}\right)^{1/2}, \tag{7.116}$$
but these parameters should not be confused, because ZW depends on the cross-section’s geometry, while
Z does not. In particular, ZW is the only important parameter of a transmission line for its matching with
a lumped load circuit (Fig. 19) in the important case when both the cable cross-section’s size and the
load’s linear dimensions are much smaller than the wavelength.48
Indeed, in this case, we may consider the load in the quasistatic limit and write
$$V_\omega(z_0) = Z_L(\omega) I_\omega(z_0), \tag{7.117}$$
where ZL(ω) is the (generally complex) impedance of the load. Taking V(z,t) and I(z,t) in the form
similar to Eqs. (61) and (62), and writing the two Kirchhoff’s circuit laws for the point z = z0, we get for
the reflection coefficient a result similar to Eq. (68):
$$R = \frac{Z_L(\omega) - Z_W}{Z_L(\omega) + Z_W}. \tag{7.118}$$
This formula shows that for the perfect matching (i.e. the total wave absorption in the load), the load’s
impedance ZL(ω) should be real and equal to ZW – but not necessarily to Z.
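The matching condition is easy to explore numerically. The following minimal sketch evaluates Eq. (118) for a few loads; the line impedance of 50 Ω and the load values are illustrative assumptions, and the function name is hypothetical.

```python
def reflection_coefficient(Z_L: complex, Z_W: float) -> complex:
    """Eq. (118): R = (Z_L - Z_W) / (Z_L + Z_W)."""
    return (Z_L - Z_W) / (Z_L + Z_W)


Z_W = 50.0  # ohms; an illustrative value of the line's impedance

# Illustrative loads (assumed values): real and complex impedances in ohms.
examples = {
    "matched load (Z_L = Z_W)": 50.0,
    "short circuit (Z_L = 0)": 0.0,
    "75-ohm resistor": 75.0,
    "50 ohm in series with a +25j ohm reactance": complex(50.0, 25.0),
}

for label, Z_L in examples.items():
    R = reflection_coefficient(Z_L, Z_W)
    # |R|^2 is the reflected fraction of the incident wave's power.
    print(f"{label}: R = {R:.3f}, |R|^2 = {abs(R) ** 2:.3f}")
```

Only the first load gives R = 0; any reactive part of ZL, or any real part different from ZW, leads to a partial reflection.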
As an example, let us consider one of the simplest (and most practically important) transmission
lines: the coaxial cable (Fig. 20).49
Fig. 7.20. The cross-section of a coaxial cable with (possibly, dispersive) dielectric filling.
For this geometry, we already know the expressions for both C0 and L0,50 though they have to be
modified for the account of arbitrary dielectric and magnetic constants, and the magnetic field’s non-
penetration into the conductors. As a result of this (elementary) modification, we get the formulas,
48 The ability of TEM lines to have such a small cross-section is another important practical advantage.
49 It was invented by the same O. Heaviside in 1880.
50 See, respectively, Eqs. (2.49) and (5.79).
$$C_0 = \frac{2\pi\varepsilon}{\ln(b/a)}, \quad L_0 = \frac{\mu}{2\pi}\ln(b/a), \tag{7.119}$$
illustrating that the universal relationship (114) is indeed valid. For the cable’s impedance (115), Eqs.
(119) yield a geometry-dependent value
$$Z_W = \left(\frac{\mu}{\varepsilon}\right)^{1/2} \frac{\ln(b/a)}{2\pi} \equiv Z\,\frac{\ln(b/a)}{2\pi} \neq Z. \tag{7.120}$$
For the standard TV antenna cables (such as RG-6/U, with b/a ~ 3, ε/ε0 ≈ 2.2), ZW = 75 Ω, while for
most computer component connections, coaxial cables with ZW = 50 Ω (such as RG-58/U) are
prescribed by electronic engineering standards. Such cables are broadly used for the transmission of
electromagnetic waves with frequencies up to 1 GHz over distances of a few km, and up to ~20 GHz on
the tabletop scale (a few meters), limited by wave attenuation – see Sec. 9 below.
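The coaxial-cable formulas (119)-(120) may be checked with a few lines of code. In the sketch below, the function name and the sample parameters (b/a = 3.5, ε/ε0 = 2.25) are hypothetical illustrations, not the specifications of any particular cable type.

```python
import math

EPS0 = 8.8541878128e-12   # electric constant, F/m
MU0 = 4e-7 * math.pi      # magnetic constant, H/m (to the accuracy needed here)


def coax_parameters(b_over_a, eps_r=1.0, mu_r=1.0):
    """Return (C0, L0, Z_W) of a coaxial cable per Eqs. (119) and (120)."""
    eps = eps_r * EPS0
    mu = mu_r * MU0
    log_ratio = math.log(b_over_a)
    C0 = 2 * math.pi * eps / log_ratio      # capacitance per unit length, F/m
    L0 = mu * log_ratio / (2 * math.pi)     # inductance per unit length, H/m
    Z = math.sqrt(mu / eps)                 # wave impedance of the filling, Eq. (116)
    Z_W = Z * log_ratio / (2 * math.pi)     # cable's impedance, Eq. (120)
    return C0, L0, Z_W


# Hypothetical sample cable: b/a = 3.5, dielectric constant 2.25.
C0, L0, Z_W = coax_parameters(3.5, eps_r=2.25)
print(f"C0 = {C0 * 1e12:.1f} pF/m, L0 = {L0 * 1e9:.1f} nH/m, Z_W = {Z_W:.1f} ohm")

# Consistency checks: Eq. (114), L0*C0 = eps*mu, and Eq. (115), Z_W = sqrt(L0/C0).
print(f"L0*C0 / (eps*mu) = {L0 * C0 / (2.25 * EPS0 * MU0):.6f}")
print(f"sqrt(L0/C0) = {math.sqrt(L0 / C0):.1f} ohm")
```

Since ZW depends on the geometry only through ln(b/a), even a substantial change of the radii ratio changes the impedance rather weakly.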
Moreover, the following two facts enable a wide application, in electrical engineering and
physical experiment, of coaxial-cable-like systems. First, as Eq. (5.78) shows, in a cable with a << b,
most energy of the wave is localized near the internal conductor. Second, the theory to be discussed in
the next section shows that excitation of other (H- and E-) waves in the cable is impossible until the
wavelength λ becomes smaller than ~π(a + b). As a result, the TEM mode propagation in a cable with a
<< b < λ/π is not much affected even if the internal conductor is not straight, but bent – for example,
into a helix – see, e.g., Fig. 21.
In such a system, called the traveling-wave tube (TWT), a quasi-TEM wave propagates with
velocity v ~ c along the helix’s length, so the velocity’s component along the cable’s axis may be made
close to the velocity u << c of the electron beam moving ballistically along the tube’s axis, enabling
their effective interaction, and as a result, a length-accumulating amplification of the wave.51
Another important example of a TEM transmission line is a set of two parallel wires. In the form
of twisted pairs,52 they allow communications, in particular long-range telephone and DSL Internet
51 Despite the current prevalence of semiconductor devices in electronics, TWTs are still used in satellite
TV and radio systems, because they may work at very high microwave power – e.g., up to 200 W at 20
GHz and pulsed 50 W at 200 GHz. Very unfortunately, in this course, I will not have time/space to discuss
even the (rather elegant) basic theory of such devices. The reader interested in this field may be referred, for
example, to the detailed monograph by J. Whitaker, Power Vacuum Tubes Handbook, 3rd ed., CRC Press, 2017.
52 Such twisting, around the line’s direction axis, reduces the crosstalk between adjacent lines, and the parasitic
radiation at their bends.
connections, at frequencies up to a few hundred kHz, as well as relatively short, multi-line Ethernet and
TV cables at frequencies up to ~ 1 GHz, limited mostly by the mutual interference (“crosstalk”) between
the individual lines of the same cable, and the unintentional radiation of the wave into the environment.
53 For the derivation of Eqs. (121), one of these two linear equations should be first vector-multiplied by nz. Note
also that this approach could not be used to analyze the TEM waves, because for them kt = 0, Ez = 0, Hz = 0, and
Eqs. (121) yield uncertainty.
54 An interesting twist in the ideas of electromagnetic metamaterials (mentioned in Sec. 5 above) is the so-called
ε-near-zero materials, designed to have the effective product εμ much lower than ε0μ0 within certain frequency
ranges. Since at these frequencies, the speed v (4) becomes much lower than c, the cutoff frequency (123)
values of ωc present special practical interest, because the choice of the signal frequency between the
two lowest values of the cutoff frequency (123) guarantees that the waves propagate in the form of only
one mode, with the lowest kt. Such a choice enables engineers to simplify the excitation of the desired
mode by wave generators and to avoid the unintentional transfer of electromagnetic wave energy to
undesirable modes by (virtually unavoidable) small inhomogeneities of the system.
The boundary conditions for the Helmholtz equations (101) depend on the propagating wave
type. For the E-modes, with Hz = 0 but Ez ≠ 0, the condition Eτ = 0 immediately gives
$$E_z\big|_C = 0, \tag{7.124}$$
where C is the inner contour limiting the conducting wall’s cross-section. For the H-modes, with Ez = 0
but Hz ≠ 0, the boundary condition is slightly less obvious and may be obtained using, for example, the
second equation of the system (100), vector-multiplied by nz. Indeed, for the component normal to the
conductor surface, the result of such multiplication is
$$i k_z (\mathbf{H}_t)_n - i\frac{k}{Z}\left(\mathbf{n}_z \times \mathbf{E}_t\right)_n = \frac{\partial H_z}{\partial n}. \tag{7.125}$$
But the first term on the left-hand side of this relation must be zero on the wall surface, because of the
second of Eqs. (104), while according to the first of Eqs. (104), the vector Et in the second term cannot
have a component tangential to the wall. As a result, the vector product in that term cannot have a
normal component, so the term should equal zero as well, and Eq. (125) is reduced to
$$\frac{\partial H_z}{\partial n}\bigg|_C = 0. \tag{7.126}$$
Let us see how all this machinery works for a simple but practically important case of a metallic-
wall waveguide with a rectangular cross-section – see Fig. 22.
Fig. 7.22. A rectangular waveguide (with the sides a and b along the axes x and y), and the transverse field distribution in its fundamental mode H10 (schematically).
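In the natural Cartesian coordinates shown in this figure, both Eqs. (101) take the simple form
$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + k_t^2\right) f = 0, \quad \text{where } f = \begin{cases} E_z, & \text{for E-modes}, \\ H_z, & \text{for H-modes}. \end{cases} \tag{7.127}$$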
From Chapter 2, we know that the most effective way of solving such equations in a rectangular region
is the variable separation, in which the general solution is represented as a sum of partial solutions of the
type
virtually vanishes. As a result, the waves may “tunnel” through very narrow sections of metallic waveguides filled
with such materials – see, e.g., M. Silveirinha and N. Engheta, Phys. Rev. Lett. 97, 157403 (2006).
$$f = X(x)Y(y). \tag{7.128}$$
Plugging this expression into Eq. (127), and dividing each term by XY, we get the equation,
$$\frac{1}{X}\frac{d^2 X}{dx^2} + \frac{1}{Y}\frac{d^2 Y}{dy^2} + k_t^2 = 0, \tag{7.129}$$
which should be satisfied for all values of x and y within the waveguide’s interior. This is only possible
if each term of the sum equals a constant. Taking the X-term and Y-term constants in the form (–kx²) and
(–ky²), respectively, and solving the corresponding ordinary differential equations,55 for the
eigenfunction (128) we get
$$X = c_x \cos k_x x + s_x \sin k_x x, \quad Y = c_y \cos k_y y + s_y \sin k_y y, \quad \text{with } k_x^2 + k_y^2 = k_t^2, \tag{7.130}$$
where the constants c and s should be found from the boundary conditions. Here the difference between
the H-modes and E-modes kicks in.
For the H-modes, Eq. (130) is valid for Hz, and we should use the boundary condition (126) on
all metallic walls of the waveguide, i.e. at x = 0 and a; and y = 0 and b – see Fig. 22. As a result, we get
very simple expressions for eigenfunctions and eigenvalues:
$$(H_z)_{nm} = H_l \cos\frac{\pi n x}{a} \cos\frac{\pi m y}{b}, \tag{7.131}$$
$$k_x = \frac{\pi n}{a}, \quad k_y = \frac{\pi m}{b}, \quad \text{so that } (k_t)_{nm} = \left(k_x^2 + k_y^2\right)^{1/2} = \pi\left[\left(\frac{n}{a}\right)^2 + \left(\frac{m}{b}\right)^2\right]^{1/2}, \tag{7.132}$$
where Hl is the longitudinal field’s amplitude, and n and m are two integer numbers – each of them
arbitrary besides that they cannot be equal to zero simultaneously.56 Assuming, just for certainty, that a
≥ b (as shown in Fig. 22), we see that the lowest eigenvalue of kt and hence the lowest cutoff frequency
(123) are achieved for the so-called H10 mode with n = 1 and m = 0, and hence with
$$(k_t)_{10} = \frac{\pi}{a}, \tag{7.133}$$
thus confirming our prior estimate of kt.
Depending on the a/b ratio, the second-lowest kt (and hence ωc) belongs to either the H11 mode
with n = 1 and m = 1:
$$(k_t)_{11} = \pi\left(\frac{1}{a^2} + \frac{1}{b^2}\right)^{1/2} = \left[1 + \left(\frac{a}{b}\right)^2\right]^{1/2} (k_t)_{10}, \tag{7.134}$$
or to the H20 mode with n = 2 and m = 0:
55 Let me hope that the solution of equations of the type d2X/dx2 + kx2X = 0 does not present any problem for the
reader, at least due to their prior experience with problems such as standing waves on a guitar string,
wavefunctions in a flat 1D quantum well, or (with the replacement x → t) a classical harmonic oscillator.
56 Otherwise, the function Hz(x,y) would be constant, so, according to Eq. (121), the transverse components of the
electric and magnetic field would equal zero. As a result, as the last two lines of Eqs. (100) show, the whole field
would be zero for any kz ≠ 0.
$$(k_t)_{20} = \frac{2\pi}{a} = 2(k_t)_{10}. \tag{7.135}$$
These values become equal at a/b = √3 ≈ 1.7; in practical waveguides, the a/b ratio is not too far from
this value. For example, in the standard X-band (~10-GHz) waveguide WR90, a ≈ 2.3 cm (fc ≡ ωc/2π ≈
6.5 GHz), and b ≈ 1.0 cm.
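The eigenvalue spectrum (132) is simple enough for a direct numerical look at the lowest cutoff frequencies. The sketch below assumes an air-filled waveguide (so that v ≈ c) and uses the rounded WR90 dimensions quoted above; the function names are hypothetical.

```python
import math

C_LIGHT = 2.99792458e8  # m/s; the waveguide is assumed to be air-filled (v ≈ c)


def cutoff_frequency(n, m, a, b, v=C_LIGHT):
    """Cutoff frequency f_c = v (k_t)_nm / 2π, with (k_t)_nm taken from Eq. (132)."""
    k_t = math.pi * math.hypot(n / a, m / b)
    return v * k_t / (2 * math.pi)


def lowest_h_modes(a, b, count=4, n_max=5):
    """Return the `count` lowest H_nm cutoffs as tuples (f_c, n, m), sorted by f_c.

    For the H-modes, n or m (but not both) may be equal to zero.
    """
    modes = [(cutoff_frequency(n, m, a, b), n, m)
             for n in range(n_max + 1) for m in range(n_max + 1)
             if (n, m) != (0, 0)]
    return sorted(modes)[:count]


a, b = 0.023, 0.010  # m; the approximate WR90 dimensions quoted in the text
for f_c, n, m in lowest_h_modes(a, b):
    print(f"H{n}{m}: f_c = {f_c / 1e9:.2f} GHz")
```

For these dimensions, the interval between the two lowest cutoff frequencies in the printed list is the single-mode range of the waveguide.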
Now let us have a look at the alternative E-modes. For them, we still should use the general
solution (130) with f = Ez, but now with the boundary condition (124). This gives us the eigenfunctions
$$(E_z)_{nm} = E_l \sin\frac{\pi n x}{a} \sin\frac{\pi m y}{b}, \tag{7.136}$$
and the same eigenvalue spectrum (132) as for the H modes. However, now neither n nor m can be equal
to zero; otherwise, Eq. (136) would give the trivial solution Ez(x,y) = 0. Hence the lowest cutoff
frequency of TM waves is achieved at the so-called E11 mode with n =1, m = 1, and with the eigenvalue
given by Eq. (134), always higher than (kt)10.
Thus the fundamental H10 mode is certainly the most important wave in rectangular waveguides;
let us have a better look at this field distribution. Plugging the corresponding solution (131) with n = 1
and m = 0 into the general relation (121), we easily get
$$(H_x)_{10} = -i\frac{k_z a}{\pi} H_l \sin\frac{\pi x}{a}, \quad (H_y)_{10} = 0, \tag{7.137}$$
$$(E_x)_{10} = 0, \quad (E_y)_{10} = i\frac{k a}{\pi} Z H_l \sin\frac{\pi x}{a}. \tag{7.138}$$
This field distribution is (schematically) shown in Fig. 22. Neither of the fields depends on the
coordinate y – the feature very convenient, in particular, for microwave experiments with small samples.
The electric field has only one (in Fig. 22, vertical) component that vanishes at the side walls and
reaches its maximum at the waveguide’s center; its field lines are straight, starting and ending on wall
surface charges (whose distribution propagates along the waveguide together with the wave). In
contrast, the magnetic field has two non-zero components (Hx and Hz), and its field lines are shaped as
horizontal loops wrapped around the electric field maxima.
An important question is whether the H10 wave may be usefully characterized by a unique
impedance introduced similarly to ZW of the TEM modes – see Eq. (115). The answer is no, because the
main value of ZW is a convenient description of the impedance matching of a transmission line with a
lumped load – see Fig. 19 and Eq. (118). As was discussed above, such a simple description is possible
(i.e., does not depend on the exact geometry of the connection) only if both dimensions of the line’s
cross-section are much less than λ. But for the H10 wave (and more generally, any non-TEM mode) this
is impossible – see, e.g., Eq. (129): its lowest frequency corresponds to the TEM wavelength λmax =
2π/(kt)min = 2π/(kt)10 = 2a. (The reader is challenged to find a simple interpretation of this equality.)
Now let us consider metallic-wall waveguides with a round cross-section (Fig. 23a). In this
singly-connected geometry, the TEM waves are impossible again, while for the analysis of H-modes and
E-modes, the polar coordinates {ρ, φ} are most natural. In these coordinates, the 2D Helmholtz equation
(101) takes the following form:
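$$\left[\frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2}{\partial\varphi^2} + k_t^2\right] f = 0, \quad \text{where } f = \begin{cases} E_z, & \text{for E-modes}, \\ H_z, & \text{for H-modes}. \end{cases} \tag{7.139}$$
Separating the variables as f = R(ρ)F(φ), we get
$$\frac{1}{\rho R}\frac{d}{d\rho}\left(\rho\frac{dR}{d\rho}\right) + \frac{1}{\rho^2 F}\frac{d^2 F}{d\varphi^2} + k_t^2 = 0. \tag{7.140}$$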
But this is exactly Eq. (2.127) that was studied in Sec. 2.7 in the context of electrostatics, just with a
replacement of notation: γ → kt. So we already know that to have 2π-periodic functions F(φ) and finite
values R(0) (which are evidently necessary for our current case – see Fig. 23a), the general solution
must have the form given by Eq. (2.136), i.e. the eigenfunctions are expressed via integer-order Bessel
functions of the first kind:
$$f_{nm} = J_n(k_{nm}\rho)\left(c_n \cos n\varphi + s_n \sin n\varphi\right) \equiv \text{const} \times J_n(k_{nm}\rho) \cos n(\varphi - \varphi_0), \tag{7.141}$$
with the eigenvalues knm of the transverse wave number kt to be determined from appropriate boundary
conditions, and an arbitrary constant φ0.
Fig. 7.23. (a) Metallic and (b) dielectric waveguides with circular cross-sections.
As for the rectangular waveguide, let us start with the H-modes (f = Hz). Then the boundary
condition on the wall surface (ρ = R) is given by Eq. (126), which, for the solution (141), takes the form
$$\frac{d}{d\xi} J_n(\xi) = 0, \quad \text{where } \xi \equiv kR. \tag{7.142}$$
This means that the eigenvalues of Eq. (139) are
$$k_t = k_{nm} = \frac{\xi'_{nm}}{R}, \tag{7.143}$$
where ξ’nm is the mth zero of the function dJn(ξ)/dξ. Approximate values of these zeros for several
lowest n and m may be read out from Fig. 2.18; their more accurate values are given in Table 1 below.
Table 7.1. Zeros ξ’nm of the function dJn(ξ)/dξ for a few lowest
values of the Bessel function’s index n and the root’s number m.
         m = 1      m = 2      m = 3
n = 0    3.83171    7.015587   10.1735
n = 1    1.84118    5.33144    8.53632
n = 2    3.05424    6.70613    9.96947
n = 3    4.20119    8.01524    11.34592
The table shows, in particular, that the lowest of the zeros is ξ’11 ≈ 1.84.57 Thus, perhaps a bit
counter-intuitively, the fundamental mode, providing the lowest cutoff frequency ωc = vknm, is H11,
corresponding to n = 1 rather than n = 0:
$$H_z = H_l J_1\!\left(\xi'_{11}\frac{\rho}{R}\right) \cos(\varphi - \varphi_0). \tag{7.144}$$
Its transverse wave number is kt = k11 = ξ’11/R ≈ 1.84/R, and hence the cutoff frequency
corresponds to the TEM wavelength λmax = 2π/k11 ≈ 3.41 R. Thus the ratio of λmax to the waveguide’s
diameter 2R is about 1.7, i.e. is close to the ratio λmax/a = 2 for the rectangular waveguide. The origin of
this proximity is clear from Fig. 24, which shows the transverse field distribution in the H11 mode. (It
may be readily calculated from Eqs. (121) with Ez = 0 and Hz given by Eq. (144).)
One can see that the field structure is actually very similar to that of the fundamental mode in the
rectangular waveguide, shown in Fig. 22, despite the different nomenclature (which is due to the
different coordinate system used for the solution). However, note the arbitrary constant angle φ0,
indicating that in circular waveguides, the transverse field’s polarization is arbitrary. For some practical
applications, such degeneracy of these “quasi-linearly-polarized” waves creates problems; some of them
may be avoided by using waves with circular polarization.
As Table 1 shows, the next lowest H-mode is H21, for which kt = k21 = ξ’21/R ≈ 3.05/R, almost
twice larger than that of the fundamental mode, and only then comes the first mode with no angular
dependence of any field, H01, with kt = k01 = ξ’01/R ≈ 3.83/R,58 followed by several angle-dependent
modes: H31, H12, etc.
For the E modes, we may still use Eq. (141) (with f = Ez), but with the boundary condition (124)
at ρ = R. This gives the following equation for the problem eigenvalues:
$$J_n(k_{nm}R) = 0, \quad \text{i.e. } k_{nm} = \frac{\xi_{nm}}{R}, \tag{7.145}$$
where ξnm is the mth zero of function Jn(ξ) – see Table 2.1. That table shows that the lowest kt is equal to
ξ01/R ≈ 2.405/R. Hence the corresponding mode (E01), with no angular dependence of its fields, e.g.
57 Mathematically, the lowest root of Eq. (142) with n = 0 equals 0. However, it would yield k = 0 and hence a
constant field Hz, which, according to the first of Eqs. (121), would give a vanishing electric field.
58 The electric field lines in the H01 mode (as well as all higher H0m modes) are directed straight from the
symmetry axis to the walls, resembling those of the TEM waves in the coaxial cable. Due to this property, these
modes provide, at ω >> ωc, much lower energy losses (see Sec. 9 below) than the fundamental H11 mode, and are
sometimes used in practice, despite the inconvenience of working in the multimode frequency range.
$$E_z = E_l J_0\!\left(\xi_{01}\frac{\rho}{R}\right), \tag{7.146}$$
has the second-lowest cutoff frequency, ~30% higher than that of the fundamental mode H11.
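The zeros quoted above may be reproduced without any special-function library. The following sketch sums the power series for Jn(ξ), uses the standard recurrence relations for the derivative, and finds the first positive roots by bisection. It is only an illustration under these assumptions (a truncated series, adequate for the moderate arguments used here); all function names are hypothetical.

```python
import math


def bessel_j(n, x, terms=40):
    """J_n(x) for integer n >= 0 from its power series (adequate for moderate x)."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.factorial(n + k))
                  * (x / 2) ** (2 * k + n))
    return total


def bessel_j_prime(n, x):
    """dJ_n/dx from the standard recurrence relations."""
    if n == 0:
        return -bessel_j(1, x)
    return 0.5 * (bessel_j(n - 1, x) - bessel_j(n + 1, x))


def first_zero(func, start=0.1, step=0.05, tol=1e-10):
    """First root of func(x) at x > start: scan for a sign change, then bisect."""
    lo, hi = start, start + step
    while func(lo) * func(hi) > 0:
        lo, hi = hi, hi + step
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(lo) * func(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


xi_11_prime = first_zero(lambda x: bessel_j_prime(1, x))  # H11 mode: zero of dJ_1/dξ
xi_01 = first_zero(lambda x: bessel_j(0, x))              # E01 mode: zero of J_0

print(f"xi'_11 = {xi_11_prime:.5f}, xi_01 = {xi_01:.5f}")
print(f"cutoff frequency ratio E01/H11 = {xi_01 / xi_11_prime:.3f}")
print(f"lambda_max / R for the H11 mode = {2 * math.pi / xi_11_prime:.3f}")
```

The scan starts at a small positive argument in order to skip the trivial root ξ = 0 of dJ0/dξ, which does not correspond to a wave.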
Finally, let us discuss one more topic of general importance – the number N of electromagnetic
modes that may propagate in a waveguide within a certain range of relatively large frequencies ω >> ωc.
It is easy to calculate for a rectangular waveguide, with its simple expressions (132) for the eigenvalues
of {kx, ky}. Indeed, these expressions describe a rectangular mesh on the [kx, ky] plane, so each point
corresponds to the plane area ΔAk = (π/a)(π/b), and the number of modes in a large k-plane area Ak >>
ΔAk is N = Ak/ΔAk = abAk/π² = AAk/π², where A is the waveguide’s cross-section area.59 However, it is
frequently more convenient to discuss transverse wave vectors kt of arbitrary direction, i.e. with an
arbitrary sign of their components kx and ky. Taking into account that the opposite values of each
component actually give the same wave, the actual number of different modes of each type (E- or H-) is
a factor of 2² ≡ 4 lower than was calculated above. This means that the number of modes of both types is
$$N = 2\frac{A_k A}{(2\pi)^2}. \tag{7.147}$$
Let me leave it for the reader to find hand-waving (but convincing :-) arguments that this mode
counting rule is valid for waveguides with cross-sections of any shape, and any boundary conditions on
the walls, provided that N >> 1.
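For the rectangular waveguide, the mode counting rule may be compared with a direct enumeration of the eigenvalues (132). In the sketch below, the k-plane area is taken as that of a circle, Ak = πkmax², and the waveguide dimensions are arbitrary illustrative numbers; the function names are hypothetical.

```python
import math


def count_modes(a, b, k_max):
    """Count rectangular-waveguide modes with (k_t)_nm < k_max, using Eq. (132)."""
    n_lim = int(k_max * a / math.pi) + 1
    m_lim = int(k_max * b / math.pi) + 1
    h_modes = e_modes = 0
    for n in range(n_lim + 1):
        for m in range(m_lim + 1):
            k_t = math.pi * math.hypot(n / a, m / b)
            if k_t >= k_max:
                continue
            if (n, m) != (0, 0):
                h_modes += 1   # H-modes: n and m cannot both be zero
            if n >= 1 and m >= 1:
                e_modes += 1   # E-modes: neither n nor m can be zero
    return h_modes + e_modes


def estimate_modes(a, b, k_max):
    """Eq. (147) with A = ab and A_k = π k_max² (a circle of radius k_max)."""
    return 2 * (math.pi * k_max ** 2) * (a * b) / (2 * math.pi) ** 2


a, b = 1.0, 0.7   # arbitrary illustrative dimensions (in any length unit)
for k_max in (20.0, 60.0, 200.0):
    exact = count_modes(a, b, k_max)
    approx = estimate_modes(a, b, k_max)
    print(f"k_max = {k_max:5.0f}: exact N = {exact}, Eq. (147) gives {approx:.0f}")
```

The relative difference between the two numbers should decrease as N grows, in agreement with the condition N >> 1.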
at the same frequency. (In most cases the difference is achieved due to that in the electric permittivity, ε+
< ε–, while magnetically both materials are virtually passive: μ– ≈ μ+ ≈ μ0, so their refraction indices n±,
defined by Eq. (84), are very close to (ε±/ε0)^{1/2}; I will limit my discussion to this approximation.)
The basic idea of the waveguide’s operation may be readily understood in the limit when the
wavelength λ is much smaller than the characteristic size R of the core’s cross-section. In this
“geometric-optics” limit, at distances of the order of λ from the core-cladding interface, which provides
the wave reflection, we can neglect the interface’s curvature and approximate its geometry with a plane.
As we know from Sec. 4, if the angle θ of the wave’s incidence on such a plane interface is larger than
the critical value θc specified by Eq. (85), the wave is totally reflected. As a result, the waves launched
into the fiber core at such “grazing” angles, propagate inside the core, being repeatedly reflected from
the cladding – see Fig. 25.
59 This formula ignores the fact that, according to the above analysis, some modes (with n = 0 and m = 0 for the H
modes, and n = 0 or m = 0 for the E modes) are forbidden. However, for N >> 1, the associated corrections of Eq.
(147) are negligible.
Fig. 7.25. Wave propagation in a thick optical fiber at θ > θc. (Labels in the figure: the “core” and the “cladding”.)
The most important type of dielectric waveguides is optical fibers.60 Due to a heroic
technological effort over three decades starting from the mid-1960s, the attenuation of such fibers has
been decreased from values of the order of 20 dB/km (typical for a window glass) to the fantastically
low values of ~0.2 dB/km (meaning virtually perfect transparency of 10-km-long fiber segments!),
combined with the extremely low plane-wave (“chromatic”) dispersion below 10 ps/km·nm.61 In
conjunction with the development of inexpensive erbium-based quantum amplifiers, this breakthrough
has enabled inter-city and inter-continental (undersea), broadband62 optical cables, which are the
backbone of all modern telecommunication infrastructure.
The only bad news is that these breakthroughs were achieved for just one kind of materials
(silica-based glasses)63 within a very narrow range of their chemical composition. As a result, the
dielectric constants ε±/ε0 of the cladding and core of practical optical fibers are both close to 2.2
(giving n± ≈ 1.5) and hence very close to each other, so the relative difference of the refraction indices,
$$\Delta \equiv \frac{n_- - n_+}{n_-} = \frac{\varepsilon_-^{1/2} - \varepsilon_+^{1/2}}{\varepsilon_-^{1/2}} \approx \frac{\varepsilon_- - \varepsilon_+}{2\varepsilon_-}, \tag{7.149}$$
is typically below 0.5%. This factor limits the fiber bandwidth. Indeed, let us use the geometric-optics
picture to calculate the number of quasi-plane-wave modes that may propagate in the fiber. For the
complementary angle (Fig. 25)
$$\vartheta \equiv \frac{\pi}{2} - \theta, \quad \text{so that } \sin\theta = \cos\vartheta, \tag{7.150}$$
Eq. (85) gives the following propagation condition:
$$\cos\vartheta > \frac{n_+}{n_-} = 1 - \Delta. \tag{7.151}$$
60 For a comprehensive discussion of this vital technology see, e.g., A. Yariv and P. Yeh, Photonics, 6th ed.,
Oxford U. Press, 2007.
61 Both these parameters have their best values not in the visible light range (with wavelengths from 380 to 740
nm), but in the near-infrared, with the attenuation lowest between approximately 1,500 and 1,630 nm. As a result,
most modern communication systems use two spectral windows – the so-called C-band (1,530-1,565 nm) and L-
band (1,570-1,610 nm) within that range.
62 Each of the spectral bands mentioned above, at a typical signal-to-noise ratio S/N > 10^5, corresponds to the
Shannon bandwidth Δf log2(S/N) exceeding 10^14 bits per second, some five orders of magnitude (!) higher than
that of a modern Ethernet cable. The practically usable bandwidth of each fiber is somewhat lower, but a typical
optical cable, with many fibers in parallel, has a proportionately higher aggregate bandwidth. A relatively recent
(circa 2017) example is the C-band transatlantic (6,600-km-long) cable Marea, with eight fiber pairs and an
aggregate useable bandwidth of 160 terabits per second.
63 The silica-based fibers were developed in 1966 by an industrial research group led by Charles Kao (who shared
the 2009 Nobel Prize in physics), but the very idea of using optical fibers for long-range communications may be
traced back at least to the 1963 work by Jun-ichi Nishizawa – who also invented semiconductor lasers.
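In the limit Δ << 1, when the incidence angles θ > θc of all propagating waves are very close to π/2, and
hence the complementary angles ϑ are small, we may keep only two first terms in the Taylor expansion
of the left-hand side of Eq. (151) and get
$$\vartheta_{\max}^2 \approx 2\Delta. \tag{7.152}$$
(Even for the higher-end value Δ = 0.005, this critical angle is only ~0.1 radian, i.e. is close to 5°.) Due
to this smallness, we may approximate the maximum transverse component of the wave vector as
$$(k_t)_{\max} = k(\sin\vartheta)_{\max} \approx k\vartheta_{\max} \approx (2\Delta)^{1/2} k, \tag{7.153}$$
and use Eq. (147) to calculate the number N of propagating modes:
$$N \approx 2\frac{(\pi R^2)(\pi k^2 \vartheta_{\max}^2)}{(2\pi)^2} = (kR)^2 \Delta. \tag{7.154}$$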
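For typical values k = 0.73×10^7 m^-1 (corresponding to the free-space wavelength λ0 = nλ = 2πn/k ≈ 1.3
$$\Delta t \approx \Delta\left(\frac{l}{v_z}\right) = \Delta\left(\frac{k_z l}{\omega}\right) = \frac{l}{\omega}\Delta k_z = \frac{l}{v}\left(1 - \sin\theta_c\right) = \frac{l}{v}\left(1 - \frac{n_+}{n_-}\right) = \frac{l}{v}\Delta. \tag{7.155}$$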
For the example considered above, the TEM wave's speed in the glass is $v = c/n \approx 2\times10^8$ m/s, and the geometric dispersion $\Delta t/l$ is close to 25 ps/m, i.e. 25,000 ps/km. (This means, for example, that a 1-ns pulse, being distributed between the modes, would spread to a ~25-ns pulse after passing through just a 1-km fiber segment.) This result should be compared with the chromatic dispersion mentioned above, below 10 ps/km·nm, which gives $dt/l$ of the order of only 1,000 ps/km in the whole communication band $d\lambda$ ~ 100 nm. Due to this high geometric dispersion, such relatively thick ($2R$ ~ 50 μm) multi-mode fibers are used for the transfer of signals over only short distances, below ~100 m. (As compensation, they may carry relatively large power, beyond 10 mW, without being damaged by the field.)
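The numbers quoted above are easy to reproduce. The following minimal Python sketch assumes the geometric-optics estimates $\vartheta_{max} \approx (2\Delta)^{1/2}$, $N \approx (kR)^2\Delta$, and $\Delta t/l \approx \Delta/v$, and takes the core radius R = 25 μm (i.e. 2R ~ 50 μm) as an illustrative value; the function names are hypothetical and chosen only for this example.

```python
import math


def max_complementary_angle(delta):
    """Largest complementary angle of a guided ray, from theta_max^2 ~ 2*Delta (radians)."""
    return math.sqrt(2.0 * delta)


def mode_count(k, R, delta):
    """Geometric-optics estimate N ~ (k*R)^2 * Delta of the number of propagating modes."""
    return (k * R) ** 2 * delta


def geometric_dispersion(v, delta):
    """Delay spread per unit length between the fastest and slowest modes, in s/m."""
    return delta / v


if __name__ == "__main__":
    k = 0.73e7      # wave number in the glass, 1/m (value quoted in the text)
    delta = 0.005   # relative index step (higher-end value quoted in the text)
    v = 2.0e8       # wave speed in the glass, m/s
    print("max complementary angle, rad:", max_complementary_angle(delta))
    print("modes for 2R = 50 um:", mode_count(k, 25e-6, delta))
    print("modes for 2R = 5 um:", mode_count(k, 2.5e-6, delta))
    print("dispersion, ps/m:", geometric_dispersion(v, delta) * 1e12)
```

For the thin core (2R ~ 5 μm) the same estimate gives a number of the order of one, which is the regime where the geometric-optics picture stops being quantitatively reliable.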
Long-range telecommunications are based on single-mode fibers, with thin cores (typically with diameters $2R$ ~ 5 μm, i.e. of the order of $\lambda/\Delta^{1/2}$). For such structures, Eq. (154) yields $N$ ~ 1, but in this case, the geometric optics approximation is not quantitatively valid, and for the fiber analysis, we should get back to the Maxwell equations. In particular, this analysis should take into explicit account the evanescent wave in the cladding, because its penetration depth may be comparable with $R$.64
64 The following quantitative analysis of the single-mode fibers is very valuable – both for practice and as a very
good example of Maxwell equations’ solution. However, I have to confess that its results will not be used in the
following parts of the course. So, if the reader is not interested in this topic, they may safely jump to the text
following Eq. (181). (I believe that the discussion of the angular momentum of electromagnetic radiation, starting
at that point, is compulsory for every professional physicist.)
Since the cross-section of an optical fiber lacks metallic walls, the Maxwell equations describing it cannot be exactly satisfied with either TEM-wave, or H-mode, or E-mode solutions. Instead, the fibers can carry the so-called HE and EH modes, with both vectors H and E having longitudinal components simultaneously. In such modes, both $E_z$ and $H_z$ inside the core ($\rho \le R$) have a form similar to Eq. (141):
$$f = f_l J_n(k_t\rho)\cos n(\varphi - \varphi_0), \quad \text{where } k_t^2 \equiv k_+^2 - k_z^2 > 0, \text{ and } k_+^2 \equiv \omega^2\varepsilon_+\mu_+, \tag{7.156}$$
where the constant angles $\varphi_0$ may be different for each field. On the other hand, for the evanescent wave in the cladding, we may rewrite Eqs. (101) as
$$\left(\nabla^2 - \kappa_t^2\right)f = 0, \quad \text{where } \kappa_t^2 \equiv k_z^2 - k_-^2 > 0, \text{ and } k_-^2 \equiv \omega^2\varepsilon_-\mu_-. \tag{7.157}$$
Figure 26 illustrates these relations between $k_t$, $\kappa_t$, $k_z$, and $k_\pm$; note that the following sum (the universal relation between $k_t$ and $\kappa_t$),
$$k_t^2 + \kappa_t^2 = \omega^2(\varepsilon_+ - \varepsilon_-)\mu_0 = 2\Delta k^2, \tag{7.158}$$
is fixed (at a given frequency) and, for typical fibers, is very small ($\ll k^2$). In particular, Fig. 26 shows that neither $k_t$ nor $\kappa_t$ can be larger than $\omega[(\varepsilon_+ - \varepsilon_-)\mu_0]^{1/2} = (2\Delta)^{1/2}k$. This means that the depth $\delta = 1/\kappa_t$ of the wave penetration into the cladding is at least $1/k(2\Delta)^{1/2} = \lambda/2\pi(2\Delta)^{1/2} \gg \lambda/2\pi$. This is why the cladding layers in practical optical fibers are made as thick as ~50 μm, so only a negligibly small tail of this evanescent wave field reaches their outer surfaces.
Fig. 7.26. The relation between the transverse exponents $k_t$ and $\kappa_t$ for waves in optical fibers.
Now we have to reconcile Eqs. (156) and (160), using the boundary conditions at $\rho = R$ for both longitudinal and transverse components of both fields, with the latter components first calculated using Eqs. (121). Such a conceptually simple, but a bit bulky calculation (which I am leaving for the reader's exercise), yields a system of two homogeneous linear equations for the complex amplitudes $E_l$ and $H_l$, which are compatible if
Fig. 7.27. Two sides of the characteristic equation (162), plotted as functions of $k_tR$, for two values of its dimensionless parameter: $V = 8$ (blue line) and $V = 3$ (red line). Note that according to Eq. (158), the argument of the functions $K_0$ and $K_1$ is $\kappa_tR = [V^2 - (k_tR)^2]^{1/2} \equiv (V^2 - \xi^2)^{1/2}$.
The right-hand side of Eq. (162) depends not only on $\xi$, but also on the dimensionless parameter $V$ defined as the normalized right-hand side of Eq. (158):
$$V^2 \equiv \omega^2(\varepsilon_+ - \varepsilon_-)\mu_0R^2 \approx 2\Delta k^2R^2. \tag{7.163}$$
(According to Eq. (154), if $V \gg 1$, it gives twice the number $N$ of the fiber modes – the conclusion confirmed by Fig. 27, taking into account that it describes only the H-modes.) Since the ratio $K_1/K_0$ is positive for all values of the functions' argument (see, e.g., the right panel of Fig. 2.22), the right-hand side of Eq. (162) is always negative, so the equation may have solutions only in the intervals where the ratio $J_1/J_0$ is negative, i.e. at
$$\xi_{01} < k_tR < \xi_{11}, \quad \xi_{02} < k_tR < \xi_{12}, \ldots, \tag{7.164}$$
where $\xi_{nm}$ is the m-th zero of the function $J_n(\xi)$ – see Table 2.1. The right-hand side of the characteristic equation (162) diverges at $\kappa_tR \to 0$, i.e. at $k_tR \to V$, so no solutions are possible if $V$ is below the critical value $V_c = \xi_{01} \approx 2.405$. At this cutoff point, Eq. (163) yields $k \approx \xi_{01}/R(2\Delta)^{1/2}$. Hence, the cutoff frequency of the lowest H mode corresponds to the TEM wavelength
$$\lambda_{max} = \frac{2\pi R}{\xi_{01}}(2\Delta)^{1/2} \approx 3.7\,R\Delta^{1/2}. \tag{7.165}$$
For typical parameters $\Delta = 0.005$ and $R = 2.5$ μm, this result yields $\lambda_{max}$ ~ 0.65 μm, corresponding to the free-space wavelength $\lambda_0$ ~ 1 μm. A similar analysis of the first parentheses on the left-hand side of Eq. (161) shows that at $\Delta \to 0$, the cutoff frequency for the E modes is similar.
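As a quick arithmetic check of the cutoff estimate, here is a minimal sketch that assumes the cutoff condition $V = k R (2\Delta)^{1/2} = \xi_{01}$ with the value $\xi_{01} \approx 2.405$ quoted above; the function names are hypothetical.

```python
import math

XI_01 = 2.405  # first zero of J0, as quoted in the text


def v_parameter(k, R, delta):
    """Dimensionless parameter V = k * R * sqrt(2 * Delta)."""
    return k * R * math.sqrt(2.0 * delta)


def cutoff_wavelength(R, delta, xi01=XI_01):
    """Wavelength (in the fiber material) at which V equals xi01."""
    return 2.0 * math.pi * R * math.sqrt(2.0 * delta) / xi01


if __name__ == "__main__":
    R, delta = 2.5e-6, 0.005
    lam = cutoff_wavelength(R, delta)
    print("numerical coefficient in front of R*sqrt(Delta):", cutoff_wavelength(1.0, 1.0))
    print("cutoff wavelength in the glass, um:", lam * 1e6)
    print("V at this wavelength:", v_parameter(2.0 * math.pi / lam, R, delta))
```

Multiplying the result by a refractive index of about 1.5 converts it to the corresponding free-space wavelength.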
This situation may look exactly like that in metallic-wall waveguides, with no waves possible at frequencies below $\omega_c$, but this is not so. The basic reason for the difference is that in the metallic waveguides, the approach to $\omega_c$ results in the divergence of the longitudinal wavelength $\lambda_z \equiv 2\pi/k_z$. On the other hand, in dielectric waveguides, the approach leaves $\lambda_z$ finite ($k_z \to k_+$). Due to this difference, a certain linear superposition of HE and EH waves with n = 1 can propagate at frequencies well below the cutoff frequency for n = 0, which we have just calculated.65 This mode, in the limit $\varepsilon_+ \to \varepsilon_-$ (i.e. $\Delta \ll 1$) allows a very interesting and simple description using the Cartesian (rather than polar) components of the fields, but still expressed as functions of the polar coordinates $\rho$ and $\varphi$. The reason is that this mode is very close to a linearly polarized TEM wave. (For this reason, this mode is referred to as LP01.)
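Let us select the x-axis parallel to the transverse component of the magnetic field vector at $\rho = 0$, so $E_x|_{\rho=0} = 0$, but $E_y|_{\rho=0} \ne 0$, and $H_x|_{\rho=0} \ne 0$, but $H_y|_{\rho=0} = 0$. The only suitable solutions of the 2D Helmholtz equation (that should be obeyed not only by the z-components of the fields but also their x- and y-components) are proportional to $J_0(k_t\rho)$, with zero coefficients for $E_x$ and $H_y$:
$$E_x = 0, \quad E_y = E_0J_0(k_t\rho), \quad H_x = H_0J_0(k_t\rho), \quad H_y = 0, \quad \text{for } \rho \le R. \tag{7.166}$$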
Now we can use the last two equations of Eqs. (100) to calculate the longitudinal components of the fields:
$$E_z = -\frac{1}{ik_z}\frac{\partial E_y}{\partial y} = -i\frac{k_t}{k_z}E_0J_1(k_t\rho)\sin\varphi, \quad H_z = -\frac{1}{ik_z}\frac{\partial H_x}{\partial x} = -i\frac{k_t}{k_z}H_0J_1(k_t\rho)\cos\varphi, \tag{7.167}$$
where I have used the following mathematical identities: $J_0' = -J_1$, $\partial\rho/\partial x = x/\rho = \cos\varphi$, and $\partial\rho/\partial y = y/\rho = \sin\varphi$. As a sanity check, we see that the longitudinal component of each field is a (legitimate!) eigenfunction of the type (141), with n = 1. Note also that if $k_t \ll k_z$ (this relation is always true if $\Delta \ll 1$ – see either Eq. (158) or Fig. 26), the longitudinal components of the fields are much smaller than their transverse counterparts, so the wave is indeed very close to the TEM one. Because of that, the ratio of the electric and magnetic field amplitudes is also close to that in the TEM wave: $E_0/H_0 \approx Z_- \approx Z_+$.
Now to satisfy the boundary conditions at the core-to-cladding interface ($\rho = R$), we need to have a similar angular dependence of these components at $\rho \ge R$. The longitudinal components of the fields
65 This fact becomes less surprising if we recall that in the circular metallic waveguide, discussed in Sec. 6, the fundamental mode (H11, see Fig. 23) also corresponded to n = 1 rather than n = 0.
are tangential to the interface and thus should be continuous. Using the solutions similar to Eq. (160) with n = 1, we get
$$E_z = -i\frac{k_t}{k_z}\frac{J_1(k_tR)}{K_1(\kappa_tR)}E_0K_1(\kappa_t\rho)\sin\varphi, \quad H_z = -i\frac{k_t}{k_z}\frac{J_1(k_tR)}{K_1(\kappa_tR)}H_0K_1(\kappa_t\rho)\cos\varphi, \quad \text{for } \rho \ge R. \tag{7.168}$$
For the transverse components, we should require the continuity of the normal magnetic field $\mu H_n$, for our simple field structure equal to just $\mu H_x\cos\varphi$, of the tangential electric field $E_\tau = E_y\cos\varphi$, and of the normal component $D_n = \varepsilon E_n = \varepsilon E_y\sin\varphi$. Assuming that $\mu_- = \mu_+ = \mu_0$, and $\varepsilon_+ \approx \varepsilon_-$,66 we can satisfy these conditions with the following solutions:
$$E_x = 0, \quad E_y = \frac{J_0(k_tR)}{K_0(\kappa_tR)}E_0K_0(\kappa_t\rho), \quad H_x = \frac{J_0(k_tR)}{K_0(\kappa_tR)}H_0K_0(\kappa_t\rho), \quad H_y = 0, \quad \text{for } \rho \ge R. \tag{7.169}$$
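From here, we can calculate the components $E_z$ and $H_z$, using the same approach as for $\rho \le R$:
$$E_z = -\frac{1}{ik_z}\frac{\partial E_y}{\partial y} = -i\frac{\kappa_t}{k_z}\frac{J_0(k_tR)}{K_0(\kappa_tR)}E_0K_1(\kappa_t\rho)\sin\varphi,$$
$$H_z = -\frac{1}{ik_z}\frac{\partial H_x}{\partial x} = -i\frac{\kappa_t}{k_z}\frac{J_0(k_tR)}{K_0(\kappa_tR)}H_0K_1(\kappa_t\rho)\cos\varphi, \quad \text{for } \rho \ge R. \tag{7.170}$$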
These relations provide the same functional dependence of the fields as Eqs. (167), i.e. the internal and external fields are compatible, but their amplitudes at the interface coincide only if the following characteristic equation of the LP01 mode is satisfied:
$$k_t\frac{J_1(k_tR)}{J_0(k_tR)} = \kappa_t\frac{K_1(\kappa_tR)}{K_0(\kappa_tR)}. \tag{7.171}$$
This characteristic equation (which may be also derived from Eq. (161) with n = 1 in the limit $\Delta \to 0$) looks close to Eq. (162), but functionally is much different from it – see Fig. 28. Indeed, its right-hand side is always positive, and the left-hand side tends to zero at $k_tR \to 0$. As a result, Eq. (171) may have a solution for arbitrarily small values of the parameter $V$ defined by Eq. (163), i.e. for arbitrarily low frequencies (large wavelengths). This is why this mode is used in practical single-mode fibers: there are no other modes with wavelengths larger than the $\lambda_{max}$ given by Eq. (165), so they cannot be unintentionally excited on small inhomogeneities of the fiber.
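The existence of an LP01 solution at small V may be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the characteristic equation in the dimensionless form $\xi J_1(\xi)/J_0(\xi) = \eta K_1(\eta)/K_0(\eta)$, with $\xi \equiv k_tR$ and $\eta \equiv \kappa_tR = (V^2 - \xi^2)^{1/2}$, evaluates the Bessel functions from their standard integral representations by the trapezoidal rule (to stay within the Python standard library), and finds the root by bisection. All function names are hypothetical, and the bracketing used here works only for V below 2.405 and not too close to zero.

```python
import math

XI_01 = 2.405  # first zero of J0, as quoted in the text


def bessel_j(n, x, steps=400):
    """J_n(x) from Bessel's integral (1/pi) * int_0^pi cos(n*t - x*sin(t)) dt."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # endpoint values at t = 0 and t = pi
    for i in range(1, steps):
        t = i * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def bessel_k(n, x, steps=4000):
    """K_n(x), x > 0, from int_0^inf exp(-x*cosh(t)) * cosh(n*t) dt (truncated)."""
    t_max = math.acosh(1.0 + 40.0 / x)  # beyond t_max the integrand is negligible
    h = t_max / steps
    total = 0.5 * math.exp(-x)          # t = 0 endpoint; the far endpoint is dropped
    for i in range(1, steps):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(n * t)
    return total * h


def lp01_mismatch(eta, V):
    """Left-hand side minus right-hand side of the dimensionless characteristic equation."""
    xi = math.sqrt(V * V - eta * eta)
    lhs = xi * bessel_j(1, xi) / bessel_j(0, xi)
    rhs = eta * bessel_k(1, eta) / bessel_k(0, eta)
    return lhs - rhs


def solve_lp01(V, tol=1e-10):
    """Return (xi, eta) = (k_t R, kappa_t R) of the LP01 mode for a given V < 2.405."""
    if not 0.0 < V < XI_01:
        raise ValueError("this sketch brackets the root only for 0 < V < 2.405")
    lo, hi = 1e-9, V * (1.0 - 1e-9)
    if lp01_mismatch(lo, V) <= 0.0 or lp01_mismatch(hi, V) >= 0.0:
        raise ValueError("root not bracketed; V is probably too small for this sketch")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lp01_mismatch(mid, V) > 0.0:   # mismatch decreases with eta
            lo = mid
        else:
            hi = mid
    eta = 0.5 * (lo + hi)
    return math.sqrt(V * V - eta * eta), eta


if __name__ == "__main__":
    for V in (1.0, 2.0):
        xi, eta = solve_lp01(V)
        print("V =", V, " k_t R =", xi, " kappa_t R =", eta)
```

The bisection is run in the variable $\eta$ rather than $\xi$ because, for V < 2.405, the mismatch is a monotonic function of $\eta$ on the whole interval (0, V), so the root is unique there.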
It is easy to use the Bessel function approximations by the first terms of the Taylor expansions (2.132) and (2.157) to show that in the limit $V \to 0$, $\kappa_tR$ tends to zero much faster than $k_tR \approx V$: $\kappa_tR \to 2\exp\{-1/V\} \ll V$. This means that the scale $\rho_c \equiv 1/\kappa_t$ of the radial distribution of the LP01 wave's fields in the cladding becomes very large. In this limit, this mode may be interpreted as a virtually TEM wave propagating in the cladding, just slightly deformed (and guided) by the fiber's core. The drawback of this feature is that it requires very thick cladding, to avoid energy losses in its outer ("buffer" and "jacket") layers that protect the silica layers from the elements and mechanical damage, but lack their
66 This is the core assumption of this approximate theory, which accounts only for the most important effect of the small difference of the dielectric constants $\varepsilon_+$ and $\varepsilon_-$: the difference between $(k_+^2 - k_z^2) = k_t^2 > 0$ and $(k_-^2 - k_z^2) = -\kappa_t^2 < 0$. For more discussion of the accuracy of this approximation and some exact results, the interested reader may be referred either to the monograph by A. Snyder and J. Love, Optical Waveguide Theory, Chapman and Hall, 1983, or to Chapter 3 and Appendix B in the monograph by Yariv and Yeh, that was cited above.
low optical absorption. For this reason, the core radius is usually selected so that the parameter $V$ is just slightly less than the critical value $V_c = \xi_{01} \approx 2.4$ for higher modes, thus ensuring the single-mode operation.
Fig. 7.28. Two sides of the characteristic equation (171) for the LP01 mode, plotted as a function of $k_tR$, for two values of the dimensionless parameter: $V = 8$ (blue line) and $V = 1$ (red line).
In order to reduce the field spread into the cladding, the step-index fibers discussed above may be replaced with graded-index fibers whose dielectric constant $\varepsilon$ is gradually and slowly decreased from the center to the periphery.67 Keeping only the main two terms in the Taylor expansion of the function $\varepsilon(\rho)$ at $\rho = 0$, we may approximate such reduction as68
$$\varepsilon(\rho) \approx \varepsilon(0)\left(1 - \zeta\rho^2\right), \tag{7.172}$$
where $\zeta$ is a positive constant.
Surprisingly for such an axially-symmetric problem, because of its special dependence on the radius, this equation may be most readily solved in the Cartesian coordinates. Indeed, rewriting it as
$$\left[\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + k_t^2(0) - k^2(0)\zeta\left(x^2 + y^2\right)\right]f = 0, \tag{7.174}$$
and separating the variables as $f = X(x)Y(y)$, we get
67 Due to the difficulty of fabrication of graded-index fibers with wave attenuation below a few dB/km, they are not used as broadly as the step-index ones.
68 For an axially-symmetric smooth function $\varepsilon(\rho)$, the first derivative $d\varepsilon/d\rho$ always vanishes at $\rho = 0$, so Eq. (172) [...] differentiation sign, and the exact Helmholtz-type equations for fields have additional terms containing $\nabla\varepsilon$.
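$$\frac{1}{X}\frac{d^2X}{dx^2} + \frac{1}{Y}\frac{d^2Y}{dy^2} + k_t^2(0) - k^2(0)\zeta\left(x^2 + y^2\right) = 0, \tag{7.175}$$
so the functions X and Y obey similar differential equations, for example
$$\frac{d^2X}{dx^2} + \left[k_x^2 - k^2(0)\zeta x^2\right]X = 0, \tag{7.176}$$
with the separation constants obeying the condition
$$k_x^2 + k_y^2 = k_t^2(0) \equiv k^2(0) - k_z^2. \tag{7.177}$$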
The ordinary differential equation (176) is well known in quantum mechanics, because the stationary Schrödinger equation for one of the most important basic quantum systems, a 1D harmonic oscillator, may be rewritten in this form. Its eigenvalues are very simple:
$$(k_x^2)_n = k(0)\zeta^{1/2}(2n + 1), \quad (k_y^2)_m = k(0)\zeta^{1/2}(2m + 1), \quad \text{with } n, m = 0, 1, 2, \ldots, \tag{7.178}$$
but the corresponding eigenfunctions $X_n(x)$ and $Y_m(y)$ are expressed via not quite elementary functions – the Hermite polynomials.70 For most practical purposes, however, the lowest eigenfunctions $X_0(x)$ and $Y_0(y)$ are sufficient, because they correspond to the lowest $k_{x,y}$, and hence the lowest
$$\left[k_t^2(0)\right]_{min} = (k_x^2)_0 + (k_y^2)_0 = 2k(0)\zeta^{1/2}, \tag{7.179}$$
and the lowest cutoff frequency. As may be readily verified by the substitution to Eq. (176), the eigenfunctions corresponding to this fundamental mode are also simple:
$$X_0(x) = \text{const}\times\exp\left\{-\frac{k(0)\zeta^{1/2}x^2}{2}\right\}, \tag{7.180}$$
and similarly for $Y_0(y)$, so the field distribution follows the Gaussian function
$$f_0(\rho) = f_0(0)\exp\left\{-\frac{k(0)\zeta^{1/2}\rho^2}{2}\right\} \equiv f_0(0)\exp\left\{-\frac{\rho^2}{2a^2}\right\}, \quad \text{with } a \equiv \frac{1}{k^{1/2}(0)\zeta^{1/4}}, \tag{7.181}$$
where $a \gg 1/k(0)$ has the sense of the effective width of the field's extension in the radial direction, normal to the wave's propagation axis z. This is the so-called Gaussian beam, very convenient for some applications.
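The substitution mentioned above may also be checked numerically. In the following minimal sketch, the gradient constant of Eq. (172) is called `zeta` (a name chosen here only for illustration), the equation is taken in the form $X'' + [k_x^2 - k^2(0)\zeta x^2]X = 0$ with the lowest eigenvalue $k_x^2 = k(0)\zeta^{1/2}$, and the second derivative of the trial function $\exp\{-k(0)\zeta^{1/2}x^2/2\}$ is estimated by a central finite difference. Dimensionless units with k(0) = 1 are assumed.

```python
import math


def x0(x, k0, zeta):
    """Trial fundamental eigenfunction exp(-k0 * sqrt(zeta) * x^2 / 2)."""
    return math.exp(-k0 * math.sqrt(zeta) * x * x / 2.0)


def residual(x, k0, zeta, h=1e-2):
    """Finite-difference residual of X'' + (kx^2 - k0^2 * zeta * x^2) X at point x."""
    kx2 = k0 * math.sqrt(zeta)  # lowest eigenvalue, assumed as described above
    second = (x0(x + h, k0, zeta) - 2.0 * x0(x, k0, zeta) + x0(x - h, k0, zeta)) / (h * h)
    return second + (kx2 - k0 * k0 * zeta * x * x) * x0(x, k0, zeta)


def beam_width(k0, zeta):
    """Effective width a = 1 / (k0^(1/2) * zeta^(1/4)) of the Gaussian profile."""
    return 1.0 / (math.sqrt(k0) * zeta ** 0.25)


if __name__ == "__main__":
    k0, zeta = 1.0, 1e-4          # weak gradient: zeta << k0^2
    print("beam width a (in units of 1/k0):", beam_width(k0, zeta))
    for x in (0.0, 5.0, 15.0):
        print("x =", x, " residual =", residual(x, k0, zeta))
```

The residuals are at the level of the finite-difference error, while each of the two terms they are built from is of the order of 0.01 for these parameters.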
The Gaussian beam (181) is just one example of the so-called paraxial beams, which may be represented as a result of modulation of a plane wave with a wave number k, by an axially-symmetric envelope function $f(\rho)$, where $\boldsymbol{\rho} \equiv \{x, y\}$, with a relatively large effective radius $a \gg 1/k$.71 Such beams give me a convenient opportunity to deliver on the promise made in Sec. 1: calculate the angular momentum L of a circularly polarized wave propagating in free space, and prove its fundamental relation to the wave's energy U. Let us start from the calculation of U for a paraxial beam (with an
arbitrary, but spatially localized envelope f) of a circularly polarized wave, with the transverse electric field components given by Eq. (19):
$$E_x = E_0f(\rho)\cos\psi, \quad E_y = \mp E_0f(\rho)\sin\psi, \tag{7.182a}$$
where $E_0$ is the real amplitude of the wave's electric field at the propagation axis, $\psi \equiv kz - \omega t + \varphi$ is its total phase, and the two signs correspond to two possible directions of the circular polarization.72 According to Eq. (6), the corresponding transverse components of the magnetic field are
$$H_x = \pm\frac{E_0}{Z_0}f(\rho)\sin\psi, \quad H_y = \frac{E_0}{Z_0}f(\rho)\cos\psi. \tag{7.182b}$$
These expressions are sufficient to calculate the energy density (6.113) of the wave,73
$$u = \frac{\varepsilon_0\left(E_x^2 + E_y^2\right)}{2} + \frac{\mu_0\left(H_x^2 + H_y^2\right)}{2} = \frac{\varepsilon_0E_0^2f^2}{2} + \frac{\mu_0E_0^2f^2}{2Z_0^2} \equiv \varepsilon_0E_0^2f^2, \tag{7.183}$$
and hence the full energy (per unit length in the direction z of the wave's propagation) of the beam:
$$U = \int u\,d^2r = 2\pi\int_0^\infty u\rho\,d\rho = 2\pi\varepsilon_0E_0^2\int_0^\infty f^2\rho\,d\rho. \tag{7.184}$$
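However, the transverse fields (182) are insufficient to calculate a non-zero average of L. Indeed, following the angular momentum's definition in mechanics,74 $\mathbf{L} \equiv \mathbf{r}\times\mathbf{p}$, where p is the particle's (linear) momentum, we may use Eq. (6.115) for the electromagnetic field momentum's density g in free space, to define the field's angular momentum's density as
$$\mathbf{l} \equiv \mathbf{r}\times\mathbf{g} = \frac{1}{c^2}\mathbf{r}\times\mathbf{S} \equiv \frac{1}{c^2}\mathbf{r}\times(\mathbf{E}\times\mathbf{H}). \tag{7.185}$$
Let us use the familiar bac minus cab rule of the vector algebra75 to transform this expression to
$$\mathbf{l} = \frac{1}{c^2}\left[\mathbf{E}(\mathbf{r}\cdot\mathbf{H}) - \mathbf{H}(\mathbf{r}\cdot\mathbf{E})\right] = \frac{1}{c^2}\left\{\mathbf{n}_z\left[E_z(\mathbf{r}\cdot\mathbf{H}) - H_z(\mathbf{r}\cdot\mathbf{E})\right] + \left[\mathbf{E}_t(\mathbf{r}\cdot\mathbf{H}) - \mathbf{H}_t(\mathbf{r}\cdot\mathbf{E})\right]\right\}. \tag{7.186}$$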
If the field is purely transverse ($E_z = H_z = 0$), as it is in a strictly plane wave, the first square brackets in the last expression vanish, while the second bracket gives an azimuthal component of l, which oscillates in time and vanishes upon time averaging. (This is exactly the reason why I have not tried to calculate L during our first discussion of the circularly polarized waves in Sec. 1.)
72 For our task of calculating two quadratic forms of the fields (L and U), their real representation (182) is more convenient than the complex-exponent one. However, for linear manipulations, the latter representation of the circularly polarized waves, $\mathbf{E}_t = E_0f(\rho)\,\mathrm{Re}[(\mathbf{n}_x \pm i\mathbf{n}_y)\exp\{i\psi\}]$, $\mathbf{H}_t = (E_0/Z_0)f(\rho)\,\mathrm{Re}[(\mp i\mathbf{n}_x + \mathbf{n}_y)\exp\{i\psi\}]$, is usually more convenient, and is broadly used.
73 Note that, in contrast to a linearly-polarized wave (16), the energy density of a circularly-polarized wave does not depend on the full phase $\psi$ – in particular, on t at fixed z, or vice versa. This is natural because its field vectors rotate (keeping their magnitude) rather than oscillate – see Fig. 3b.
74 See, e.g., CM Eq. (1.31).
75 See, e.g., MA Eq. (7.5).
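Fortunately, our discussion of optical fibers, in particular, the derivation of Eqs. (167), (168), and (170) gives us a clear clue on how to resolve this paradox. If the envelope function $f(\rho)$ differs from a constant, the transverse wave components (182) alone do not satisfy the Maxwell equations (2b), which necessitates longitudinal components $E_z$ and $H_z$ of the fields, with76
$$\frac{\partial E_z}{\partial z} = -\left(\frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y}\right), \quad \frac{\partial H_z}{\partial z} = -\left(\frac{\partial H_x}{\partial x} + \frac{\partial H_y}{\partial y}\right). \tag{7.187}$$
However, as these expressions show, if the envelope function f changes very slowly in the sense $df/d\rho \sim f/a \ll kf$, the longitudinal components are very small and do not have a back effect on the transverse components. Hence, the above calculation of U is still valid (asymptotically, at $1/ka \to 0$), and we may still use Eqs. (182) on the right-hand side of Eqs. (187),
$$\frac{\partial E_z}{\partial z} = -E_0\left(\frac{\partial f}{\partial x}\cos\psi \mp \frac{\partial f}{\partial y}\sin\psi\right), \quad \frac{\partial H_z}{\partial z} = -\frac{E_0}{Z_0}\left(\pm\frac{\partial f}{\partial x}\sin\psi + \frac{\partial f}{\partial y}\cos\psi\right), \tag{7.188}$$
and integrate them over z as
$$E_z = -E_0\int\left(\frac{\partial f}{\partial x}\cos\psi \mp \frac{\partial f}{\partial y}\sin\psi\right)dz = -\frac{E_0}{k}\left(\frac{\partial f}{\partial x}\int\cos\psi\,d\psi \mp \frac{\partial f}{\partial y}\int\sin\psi\,d\psi\right) = -\frac{E_0}{k}\left(\frac{\partial f}{\partial x}\sin\psi \pm \frac{\partial f}{\partial y}\cos\psi\right). \tag{7.189a}$$
Here the integration constant is taken to be zero because no wave field component may have a time-independent part. Integrating, absolutely similarly, the second of Eqs. (188), we get
$$H_z = \frac{E_0}{kZ_0}\left(\pm\frac{\partial f}{\partial x}\cos\psi - \frac{\partial f}{\partial y}\sin\psi\right). \tag{7.189b}$$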
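With the same approximation, we may calculate the longitudinal (z-) component of l, given by the first term of Eq. (186), keeping only the dominating, transverse fields (182) in the scalar products:
$$l_z = \frac{1}{c^2}\left[E_z(\mathbf{r}\cdot\mathbf{H}_t) - H_z(\mathbf{r}\cdot\mathbf{E}_t)\right] \equiv \frac{1}{c^2}\left[E_z\left(xH_x + yH_y\right) - H_z\left(xE_x + yE_y\right)\right]. \tag{7.190}$$
Plugging in Eqs. (182) and (189), and taking into account that in free space, $k = \omega/c$, and hence $1/Z_0c^2k = \varepsilon_0/\omega$, we get:
$$l_z = \mp\frac{\varepsilon_0E_0^2}{\omega}f\left(x\frac{\partial f}{\partial x} + y\frac{\partial f}{\partial y}\right) = \mp\frac{\varepsilon_0E_0^2}{2\omega}\left(x\frac{\partial (f^2)}{\partial x} + y\frac{\partial (f^2)}{\partial y}\right) = \mp\frac{\varepsilon_0E_0^2}{2\omega}\boldsymbol{\rho}\cdot\nabla\left(f^2\right) = \mp\frac{\varepsilon_0E_0^2}{2\omega}\rho\frac{d(f^2)}{d\rho}. \tag{7.191}$$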
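Hence the total angular momentum of the beam (per unit length), is
$$L_z = \int l_z\,d^2r = 2\pi\int_0^\infty l_z\rho\,d\rho = \mp\frac{\pi\varepsilon_0E_0^2}{\omega}\int_0^\infty\rho^2\frac{d(f^2)}{d\rho}d\rho \equiv \mp\frac{\pi\varepsilon_0E_0^2}{\omega}\int_{\rho=0}^{\rho=\infty}\rho^2\,d(f^2). \tag{7.192}$$
Taking this integral by parts, with the assumption that $\rho f \to 0$ at $\rho \to 0$ and $\rho \to \infty$ (as it is true for the Gaussian beam (181) and all realistic paraxial beams), we finally get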
76 The complex-exponential versions of these equalities are given by the bottom line of Eq. (100).
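$$L_z = \pm\frac{\pi\varepsilon_0E_0^2}{\omega}\int_0^\infty f^2\,d(\rho^2) = \pm\frac{2\pi\varepsilon_0E_0^2}{\omega}\int_0^\infty f^2\rho\,d\rho. \tag{7.193}$$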
Now comparing this expression with Eq. (184), we see that remarkably, the ratio $L_z/U$ does not depend on the shape and the width of the beam (and of course on the wave's amplitude $E_0$), so these parameters are very simply and universally related (for a circularly polarized wave):
$$L_z = \pm\frac{U}{\omega}. \tag{7.194}$$
Since this relation is valid in the plane-wave limit $a \to \infty$, it may be attributed to plane waves as well, with the understanding that in reality, they always have some width ("aperture") restriction.
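The independence of the ratio $L_z/U$ of the beam's shape may be illustrated numerically. The sketch below assumes the two radial integrals in the forms $U = 2\pi\varepsilon_0E_0^2\int f^2\rho\,d\rho$ and $L_z = \mp(\pi\varepsilon_0E_0^2/\omega)\int\rho^2[d(f^2)/d\rho]\,d\rho$ (i.e. before the integration by parts), evaluates them by the trapezoidal rule for two different envelopes, and compares $\omega L_z$ with $U$. The envelopes, the cutoff radii, and all names are illustrative choices, not taken from the text.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def trapezoid(g, upper, steps=20000):
    """Trapezoidal estimate of the integral of g(rho) over [0, upper]."""
    h = upper / steps
    total = 0.5 * (g(0.0) + g(upper))
    for i in range(1, steps):
        total += g(i * h)
    return total * h


def energy_per_length(f, E0, rho_max):
    """U = 2*pi*eps0*E0^2 * integral of f^2 * rho over rho."""
    return 2.0 * math.pi * EPS0 * E0 ** 2 * trapezoid(lambda r: f(r) ** 2 * r, rho_max)


def angular_momentum_per_length(f, E0, omega, rho_max, sign=+1):
    """L_z = -sign*(pi*eps0*E0^2/omega) * integral of rho^2 * d(f^2)/d(rho) over rho."""
    d = 1e-6 * rho_max  # step of the central difference for d(f^2)/d(rho)

    def integrand(r):
        return r * r * (f(r + d) ** 2 - f(r - d) ** 2) / (2.0 * d)

    return -sign * math.pi * EPS0 * E0 ** 2 / omega * trapezoid(integrand, rho_max)


if __name__ == "__main__":
    a = 1e-3                       # beam width, m (illustrative)
    omega = 2.0 * math.pi * 3e14   # angular frequency, rad/s (illustrative)
    envelopes = {
        "Gaussian": (lambda r: math.exp(-r * r / (2.0 * a * a)), 10.0 * a),
        "rational": (lambda r: 1.0 / (1.0 + (r / a) ** 2) ** 2, 50.0 * a),
    }
    for name, (f, rho_max) in envelopes.items():
        U = energy_per_length(f, 1.0, rho_max)
        L = angular_momentum_per_length(f, 1.0, omega, rho_max)
        print(name, " omega * L_z / U =", omega * L / U)
```

Both envelopes satisfy the condition $\rho f \to 0$ at large radii, which is what allows the upper integration limit to be replaced by a finite cutoff.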
As the reader certainly knows, in quantum mechanics the energy excitations of any harmonic oscillator of frequency $\omega$ are quantized in the units of $\hbar\omega$, while the internal angular momentum of a particle is quantized in the units of $s\hbar$, where s is its spin. In this context, the classical relation (194) is used in quantum electrodynamics as the basis for treating the electromagnetic field excitation quanta (photons) as some sort of quantum particles with spin s = 1. (Such integer spin also fits the Bose-Einstein statistics of the electromagnetic radiation.)
Unfortunately, I do not have time/space for a further discussion of the (very interesting) physics of paraxial beams but cannot help noticing, at least in passing, the very curious effect of helical waves – the beams carrying not only the "spin" momentum (194), but also an additional "orbital" angular momentum. The distribution of their energy in space is not monotonic, as it is in the Gaussian beam (181), but resembles several threads twisted around the propagation axis – hence the term "helical".77 Mathematically, their field structure is described by the associated Laguerre polynomials – the same special functions that are used for the quantum-mechanical description of hydrogen-like atoms.78 Presently, there are efforts to use such beams for the so-called orbital angular momentum (OAM) multiplexing for high-rate information transmission.79
77 Noticing such solutions of the Maxwell equations may be traced back to at least a 1943 theoretical work by J.
Humblet; however, this issue had not been discussed in literature too much until experiments carried out in 1992 –
see, e.g. L. Allen et al., Optical Angular Momentum; IOP, 2003.
78 See, e.g., QM Sec. 3.7.
79 See, e.g., J. Wang et al., Nature Photonics 6, 488 (2012).
Conceptually the simplest resonant cavity is the Fabry-Pérot interferometer80 that may be obtained by placing two well-conducting planes parallel to each other.81 Indeed, in Sec. 3 we have seen that if a plane wave is normally incident on such a "perfect mirror", located at z = 0, its reflection, at negligible skin depth, results in a standing wave described by Eq. (61b):
$$E(z, t) = \mathrm{Re}\left(2E_\omega e^{-i\omega t + i\pi/2}\right)\sin kz. \tag{7.195}$$
This wave would not change if we place the second mirror (isolating the segment of length l from the external wave source) at any position z = l with sin kl = 0, i.e. with
$$kl = p\pi, \quad \text{where } p = 1, 2, \ldots. \tag{7.196}$$
This condition, which determines the spectrum of own (or resonance, or eigen-) frequencies of the resonator of fixed length l,
$$\omega_p = vk_p = \frac{\pi v}{l}p, \quad \text{with } v = \frac{1}{(\varepsilon\mu)^{1/2}}, \tag{7.197}$$
has a simple physical sense: the resonator's length l equals exactly p half-waves of the frequency $\omega_p$. Though this is all very simple, please note a considerable change of philosophy from what we have been doing in the previous sections: the main task of the resonator's analysis is finding its own frequencies $\omega_p$, which are now determined by the system's geometry rather than by an external wave source.
Before we move to cavities of more complex shapes, let us use Eq. (62) to represent the magnetic field in the Fabry-Pérot interferometer:
$$H(z, t) = \mathrm{Re}\left(2\frac{E_\omega}{Z}e^{-i\omega t}\right)\cos kz. \tag{7.198}$$
Expressions (195) and (198) show that in contrast to traveling waves, each field of the standing wave changes simultaneously (proportionately) at all points of the Fabry-Pérot resonator, turning to zero everywhere twice a period. At these instants, the energy of the corresponding field vanishes, but the total energy of the two fields stays constant because the counterpart field oscillates with the phase shift $\pi/2$. Such behavior is typical for all electromagnetic resonators.
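This energy exchange between the two fields is easy to illustrate numerically. The sketch below assumes a free-space resonator and real standing-wave fields of the form $E = 2E_0\sin\omega t\,\sin kz$ and $H = 2(E_0/Z)\cos\omega t\,\cos kz$ with $kl = p\pi$ (the overall signs of the fields do not affect the energies); it integrates the energy densities $\varepsilon_0E^2/2$ and $\mu_0H^2/2$ over the resonator's length by the midpoint rule. The function names and parameter values are illustrative.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
MU0 = 1.25663706212e-6   # vacuum permeability, H/m


def oscillation_period(l, p):
    """Period T = 2*pi/omega_p of the p-th mode of a free-space resonator of length l."""
    v = 1.0 / math.sqrt(EPS0 * MU0)
    return 2.0 * l / (p * v)


def field_energies(t, E0, l, p, points=2000):
    """Electric and magnetic energies per unit mirror area, stored in 0 < z < l at time t."""
    v = 1.0 / math.sqrt(EPS0 * MU0)
    Z = math.sqrt(MU0 / EPS0)
    k = p * math.pi / l
    omega = v * k
    dz = l / points
    w_e = 0.0
    w_m = 0.0
    for i in range(points):
        z = (i + 0.5) * dz  # midpoint of the i-th slice
        E = 2.0 * E0 * math.sin(omega * t) * math.sin(k * z)
        H = 2.0 * (E0 / Z) * math.cos(omega * t) * math.cos(k * z)
        w_e += 0.5 * EPS0 * E * E * dz
        w_m += 0.5 * MU0 * H * H * dz
    return w_e, w_m


if __name__ == "__main__":
    E0, l, p = 1.0, 0.1, 3
    T = oscillation_period(l, p)
    for fraction in (0.0, 0.1, 0.25, 0.4):
        w_e, w_m = field_energies(fraction * T, E0, l, p)
        print("t/T =", fraction, " electric:", w_e, " magnetic:", w_m, " total:", w_e + w_m)
```

With these assumptions, the total should stay equal to $\varepsilon_0E_0^2l$ at any time, while each of the two parts vanishes twice a period.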
A more technical remark is that we can readily get the same results (195)-(198) by solving the Maxwell equations from scratch. For example, we already know that in the absence of dispersion, losses, and sources, they are reduced to wave equations (3) for any field components. For the Fabry-Pérot resonator's analysis, we can use the 1D form of these equations, say, for the transverse component of the electric field:
$$\left(\frac{\partial^2}{\partial z^2} - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\right)E = 0, \tag{7.199}$$
and solve it as a part of an eigenvalue problem with the corresponding boundary conditions. Indeed, by separating time and space variables as $E(z, t) = Z(z)T(t)$, we obtain
80 This device, named after its inventors, Charles Fabry and Alfred Pérot, is also called the Fabry-Pérot etalon (meaning “gauge”), because of its initial usage for light wavelength measurements.
81 The resonators formed by well-conducting (usually, metallic) walls are frequently called resonant cavities.
$$\frac{1}{Z}\frac{d^2Z}{dz^2} - \frac{1}{v^2}\frac{1}{T}\frac{d^2T}{dt^2} = 0. \tag{7.200}$$
Calling the separation constant $-k^2$, we get two similar ordinary differential equations,
$$\frac{d^2Z}{dz^2} + k^2Z = 0, \quad \frac{d^2T}{dt^2} + k^2v^2T = 0, \tag{7.201}$$
both with sinusoidal solutions, so the product Z(z)T(t) is a standing wave with the wave vector k and frequency $\omega = kv$. (In this form, the equations are valid even in the presence of dispersion, but with a frequency-dependent wave speed: $v^2 = 1/\varepsilon(\omega)\mu(\omega)$.) Now using the boundary conditions $E(0, t) = E(l, t) = 0$,82 we get the eigenvalue spectrum for $k_p$ and hence for $\omega_p = vk_p$, given by Eqs. (196) and (197).
Lessons from this simple case study may be readily generalized to any cavity formed as a
transmission line’s section:83 there are two approaches to finding the resonant frequency spectrum:
(i) We may look at a traveling wave solution and find where reflecting mirrors may be inserted
without affecting the wave’s structure.
(ii) We may solve the general 3D wave equations,
$$\left(\nabla^2 - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\right)f(\mathbf{r}, t) = 0, \tag{7.202}$$
for field components, as an eigenvalue problem with appropriate boundary conditions. If the system's parameters (and hence the coefficient v) do not change in time, the spatial and temporal variables of Eq. (202) may be always separated by taking
$$f(\mathbf{r}, t) = \sum_k f_k(\mathbf{r})T_k(t), \tag{7.203}$$
where each function $T_k(t)$ always obeys the same equation as in Eq. (201), having the sinusoidal solution of frequency $\omega_k = vk$. Plugging this solution back into Eqs. (202), for the spatial distribution of the field, we get the 3D Helmholtz equation,
$$\left(\nabla^2 + k^2\right)f_k(\mathbf{r}) = 0, \tag{7.204}$$
whose eigenfunctions $f_k(\mathbf{r})$ may be much more involved, especially for non-symmetric geometries.
Let us use these approaches to find the resonant frequency spectrum of a few simple, but
practically important cavities. First of all, the first method is completely sufficient for the analysis of any
resonator formed as a fragment of a uniform TEM transmission line (e.g., a coaxial cable), confined
with two conducting lids normal to the line's direction. Indeed, since in such lines $k_z = k = \omega/v$, and the electric field is perpendicular to the propagation axis, i.e. parallel to the lid surface, the boundary
conditions are exactly the same as in the Fabry-Pérot resonator, and we again arrive at the
eigenfrequency spectrum (197).
82 This is of course the expression of the first of the general boundary conditions (104). The second of these
conditions (for the magnetic field) is satisfied automatically for the transverse waves we are considering.
83 The resonators may have different geometries as well, and in many cases, only the second approach may be
used.
Now let us analyze a slightly more complex system: a rectangular metallic-wall cavity of volume $a\times b\times l$ – see Fig. 29. To use the first approach outlined above, let us consider the resonator as a finite-length ($\Delta z = l$) section of the rectangular waveguide extended along the z-axis, which was analyzed in detail in Sec. 6. As a reminder, at b < a, in the fundamental H10 traveling wave mode, both vectors E and H do not depend on y, with E having only a y-component. In contrast, H has two components, $H_x$ and $H_z$, with the phase shift $\pi/2$ between them, and with $H_x$ having the same phase as $E_y$ – see Eqs. (131), (137), and (138). Hence, if a plane perpendicular to the z-axis is placed so that the electric field vanishes on it, $H_x$ also vanishes, so both boundary conditions (104), pertinent to a perfect metallic wall, are fulfilled simultaneously.
As a result, the H10 wave would not be perturbed by two metallic walls separated by an integer number of half-wavelengths $\lambda_z/2$ corresponding to the wave number given by the combination of Eqs. (102) and (133):
$$k_z = \left(k^2 - k_t^2\right)^{1/2} = \left(\frac{\omega^2}{v^2} - \frac{\pi^2}{a^2}\right)^{1/2}. \tag{7.205}$$
Using this expression, we see that the smallest of these distances, $l = \lambda_z/2 = \pi/k_z$, gives the resonance frequency84
$$\omega_{101} = v\left[\left(\frac{\pi}{a}\right)^2 + \left(\frac{\pi}{l}\right)^2\right]^{1/2}, \tag{7.206}$$
where the indices of $\omega$ show the numbers of half-waves along each dimension of the system, in the order [a, b, l]. This is the lowest ("fundamental") frequency of the resonator (if b < a, l).
The field distribution in this mode is close to that in the corresponding waveguide mode H10 (Fig. 22), with the important difference that the magnetic and electric fields are now shifted by phase $\pi/2$ both in space and time, just as in the Fabry-Pérot resonator – see Eqs. (195) and (198). Such a time shift allows for a very simple interpretation of the H101 mode, which is especially adequate for very flat resonators, with $b \ll a, l$. At the instant when the electric field reaches its maximum (Fig. 30a), i.e. when the magnetic field vanishes in the whole volume, the surface electric charge of the broadest (in Fig. 30, horizontal) walls of the resonator is largest, being localized mostly near the centers of the walls. Immediately after that, the walls start to recharge via surface currents, whose density J is largest in the side walls, and reaches its maximal value in a quarter of the oscillation period $T = 2\pi/\omega_{101}$ – see Fig. 30b. The currents generate the vortex magnetic field, with looped field lines in the plane of the
84 In most electrical engineering handbooks, the index corresponding to the shortest side of the resonator is listed last, so the fundamental mode is denoted as H110 and its eigenfrequency as $\omega_{110}$.
broadest face of the resonator. The surface currents continue to flow in this direction until (in one more
quarter period) the broader walls of the resonator are fully recharged in the polarity opposite to that
shown in Fig. 30a. After that, the surface currents start to flow in the direction opposite to that shown in
Fig. 30b. This process, which repeats again and again, is conceptually similar to the well-known
oscillations in a lumped LC circuit, with the role of (now, distributed) capacitance played mostly by the
broadest walls of the resonator, and that of (now, distributed) inductance, by its narrower walls.
In order to generalize Eq. (206) to higher oscillation modes, the second of the approaches
discussed above is more prudent. Separating the variables in the Helmholtz equation (204) as R(r) =
X(x)Y(y)Z(z), we see that X, Y, and Z have to be either sinusoidal or cosinusoidal functions of their
arguments, with the wave vector components satisfying the characteristic equation
2
k x2 k y2 k z2 k 2 . (7.207)
v2
In contrast to the wave propagation problem, now we are dealing with standing waves along all three
dimensions, and have to satisfy the macroscopic boundary conditions (104) on all sets of parallel walls.
It is straightforward to check that these conditions (E = 0, Hn = 0) are fulfilled at the following field
component distribution:
$$\begin{aligned}
E_x &= E_1 \cos k_x x \,\sin k_y y \,\sin k_z z, & H_x &= H_1 \sin k_x x \,\cos k_y y \,\cos k_z z,\\
E_y &= E_2 \sin k_x x \,\cos k_y y \,\sin k_z z, & H_y &= H_2 \cos k_x x \,\sin k_y y \,\cos k_z z,\\
E_z &= E_3 \sin k_x x \,\sin k_y y \,\cos k_z z, & H_z &= H_3 \cos k_x x \,\cos k_y y \,\sin k_z z,
\end{aligned} \qquad (7.208)$$
with each of the wave vector components having an equidistant spectrum similar to Eq. (196):
$$k_x = \frac{\pi n}{a}, \quad k_y = \frac{\pi m}{b}, \quad k_z = \frac{\pi p}{l}, \qquad (7.209)$$
so the full spectrum of resonance frequencies is given by the following formula:
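$$\omega_{nmp} \equiv vk = \pi v \left[ \left(\frac{n}{a}\right)^2 + \left(\frac{m}{b}\right)^2 + \left(\frac{p}{l}\right)^2 \right]^{1/2}, \qquad (7.210)$$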
which is a natural generalization of Eq. (206). Note, however, that of the three integers m, n, and p, at
least two have to be different from zero to keep the fields (208) from vanishing at all points.
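As a quick numerical illustration of Eq. (210), the following Python sketch lists the lowest eigenfrequencies of a rectangular cavity, skipping the index sets with fewer than two nonzero integers. The function name `cavity_modes`, the cutoff `n_max`, and the sample dimensions (a vacuum-filled cavity with a = l = 23 mm, b = 10 mm) are illustrative assumptions, not taken from the text.

```python
import math
from itertools import product

C = 299_792_458.0  # speed of light in free space, m/s

def cavity_modes(a, b, l, v=C, n_max=3):
    """Return a sorted list of (f_nmp in Hz, (n, m, p)) following Eq. (7.210).

    Index sets with fewer than two nonzero integers are skipped,
    because the fields (7.208) then vanish at all points.
    """
    modes = []
    for n, m, p in product(range(n_max + 1), repeat=3):
        if sum(1 for i in (n, m, p) if i != 0) < 2:
            continue
        omega = math.pi * v * math.sqrt((n / a) ** 2 + (m / b) ** 2 + (p / l) ** 2)
        modes.append((omega / (2 * math.pi), (n, m, p)))
    return sorted(modes)

if __name__ == "__main__":
    # Hypothetical cavity dimensions, in meters.
    for f, idx in cavity_modes(0.023, 0.010, 0.023)[:4]:
        print(idx, f"{f / 1e9:.2f} GHz")
```

For these sample dimensions, the lowest mode has the indices (1, 0, 1), i.e. the zero index corresponds to the shortest side, in agreement with the discussion of the fundamental mode above.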
We may use Eq. (210), in particular, to evaluate the number of different modes in a relatively
small range d³k << k³ of the wave vector space volume that is, on the other hand, much larger than the
reciprocal volume, 1/V = 1/abl, of the cavity. Taking into account that each eigenfrequency (210), with
nmp ≠ 0, corresponds to two field modes with different polarizations,85 an argument entirely
similar to the one used for the 2D case at the end of Sec. 7 yields
Oscillation mode density:
$$dN = 2V \frac{d^3k}{(2\pi)^3}. \qquad (7.211)$$
This property, valid for resonators of arbitrary shape, is broadly used in classical and quantum statistical
physics,86 in the following form. If some electromagnetic mode functional f(k) is a smooth function of
the wave vector k, and the volume V is large enough, then Eq. (211) may be used to approximate the
sum of the functional’s values over the modes by an integral:
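$$\sum_{\mathbf{k}} f(\mathbf{k}) \to \int_N f(\mathbf{k})\,dN \equiv \int_{\mathbf{k}} f(\mathbf{k})\,\frac{dN}{d^3k}\,d^3k = 2\,\frac{V}{(2\pi)^3}\int_{\mathbf{k}} f(\mathbf{k})\,d^3k. \qquad (7.212)$$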
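Eq. (211) may be checked numerically by a direct count of the modes of a rectangular cavity. The sketch below is only an illustration: the function names and the sample dimensions are hypothetical, and it relies on the standard result (not derived in the text above) that an index set with exactly one zero integer supports a single mode rather than two.

```python
import math

def count_modes(a, b, l, k_max):
    """Count the field modes of a rectangular cavity with |k| <= k_max.

    Eq. (7.209) gives k = pi * (n/a, m/b, p/l). A triple with all indices
    nonzero carries two polarizations; a triple with exactly one zero
    index is assumed to carry one mode; the remaining triples carry none.
    """
    total = 0
    n_top = int(k_max * a / math.pi) + 1
    m_top = int(k_max * b / math.pi) + 1
    p_top = int(k_max * l / math.pi) + 1
    for n in range(n_top + 1):
        for m in range(m_top + 1):
            for p in range(p_top + 1):
                k2 = (math.pi * n / a) ** 2 + (math.pi * m / b) ** 2 + (math.pi * p / l) ** 2
                if k2 > k_max ** 2:
                    continue
                zeros = (n == 0) + (m == 0) + (p == 0)
                if zeros == 0:
                    total += 2
                elif zeros == 1:
                    total += 1
    return total

def modes_from_density(a, b, l, k_max):
    """Integrate dN = 2V d^3k / (2 pi)^3 over the ball |k| <= k_max."""
    volume = a * b * l
    return 2 * volume * (4 * math.pi * k_max ** 3 / 3) / (2 * math.pi) ** 3

if __name__ == "__main__":
    a, b, l, k_max = 1.0, 1.3, 0.8, 60.0   # arbitrary units, chosen so that k_max * a >> 1
    exact = count_modes(a, b, l, k_max)
    estimate = modes_from_density(a, b, l, k_max)
    print(exact, round(estimate), exact / estimate)
```

The ratio of the two numbers should approach 1 as the product of k_max and the cavity size grows, which is the regime in which the replacement of the sum by the integral is justified.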
Leaving similar analyses of resonant cavities of some other simple shapes for the reader’s
exercises, let me finish this section by noting that low-loss resonators may be also formed by finite-
length sections of not only metallic-wall waveguides of various cross-sections but also of dielectric
waveguides. Moreover, even a simple slab of a dielectric material with a μ/ε ratio substantially different
from that of its environment (say, of the free space) may be used as a high-Q Fabry-Pérot interferometer
(Fig. 31), due to an effective wave reflection from its surfaces at the normal and especially an inclined
incidence – see, respectively, Eqs. (68), and Eqs. (91) and (95).
[Fig. 7.31: a dielectric Fabry-Pérot interferometer, a slab of thickness d; figure not reproduced.]
Actually, such dielectric Fabry-Pérot interferometers are frequently more convenient for
practical purposes than metallic-wall resonators, not only due to possibly lower losses (especially in the
optical range) but also due to a natural coupling to the environment, which offers a ready way of wave
insertion and extraction – see Fig. 31 again. The flip side of the coin is that this coupling to the
environment provides an additional mechanism of power losses, limiting the resonance’s quality factor –
see the next section.
85 This fact becomes evident from plugging Eqs. (208) into the Maxwell equation ∇·E = 0. The resulting
equation, kxE1 + kyE2 + kzE3 =0, with the discrete, equidistant spectrum (209) for each wave vector component,
may be always satisfied by two linearly independent sets of the constants E1,2,3.
86 See, e.g., QM Sec. 1.1 and SM Sec. 2.6.
attenuation of the wave, i.e. to a decrease of its amplitude, and hence its power P, with the growing
distance z from the source. In linear materials, the power losses are proportional to the power P carried
by the wave, so the energy balance on a small segment dz takes the form
$$dP_{\rm loss} \equiv -dP = -\frac{dP}{dz}\,dz \equiv \alpha P\,dz. \qquad (7.213)$$
The coefficient α participating in the last form of Eq. (213), and hence defined as
$$\alpha \equiv -\frac{dP/dz}{P}, \qquad (7.214)$$
is called the attenuation constant.87 Comparing the solution of Eq. (213),
$$P(z) = P(0)\,e^{-\alpha z}, \qquad (7.215)$$
with Eq. (29), where k is replaced with kz, we see that α may be expressed as
$$\alpha = 2\,\mathrm{Im}\,k_z, \qquad (7.216)$$
where kz is the component of the wave vector along the transmission line. In the most important limit
when the losses are low in the sense α << |kz| ≈ Re kz, its effects on the field distribution along the
line’s cross-section are negligible, making the calculation of α rather straightforward. In particular, in
this limit, the contributions to attenuation from two major sources, the energy losses in the filling
dielectric and the skin-effect losses in conducting walls, are independent and additive.
The dielectric losses are especially simple to describe. Indeed, a review of our calculations in
Secs. 5-7 shows that all of them remain valid if either ε(ω), or μ(ω), or both, and hence k(ω), have small
imaginary parts:
$$k'' \equiv \omega\,\mathrm{Im}\left[\varepsilon^{1/2}(\omega)\,\mu^{1/2}(\omega)\right] \ll k'. \qquad (7.217)$$
For dielectric waveguides, in particular optical fibers, these losses are the main attenuation mechanism.
As was discussed in Sec. 7, in practical optical fibers tR >> 1, i.e. most of the field propagates (as an
evanescent wave) in the cladding, with a field distribution very close to the TEM wave. This is why Eq.
(218) is approximately valid if it is applied to the cladding material alone. In waveguides with non-
TEM waves, we can use the relations between kz and k, derived in the previous sections, to re-calculate
k” into Im kz. (Note that at this recalculation, the values of kt have to be kept real, because they are just
the eigenvalues of the Helmholtz equation (101), which does not include the filling media parameters.).
87 In engineering, wave attenuation is most frequently measured in decibels per meter, abbreviated as dB/m (the
term not to be confused with dBm, standing for decibel-milliwatt):
$$\alpha\big|_{\rm dB/m} \equiv 10\log_{10}\frac{P(z)}{P(z + 1\,{\rm m})} = 10\log_{10} e^{\alpha[1/{\rm m}]} = \frac{10}{\ln 10}\,\alpha[{\rm m}^{-1}] \approx 4.343\,\alpha[{\rm m}^{-1}].$$
Alternatively, it is sometimes measured in nepers per meter (Np/m), defined as α|Np/m ≡ α[m⁻¹]/2, so α|dB/m ≈ 8.686 α|Np/m.
In transmission lines and waveguides with metallic walls, higher energy losses may come
from the skin effect. If the wavelength λ is much larger than δs, as it usually is,88 the losses may be
readily evaluated using Eq. (6.36):
$$\frac{dP_{\rm loss}}{dA} \equiv \frac{dP}{dA} = \frac{\mu\omega\delta_{\rm s}}{4}\,|H_{\rm wall}|^2, \qquad (7.219)$$
where Hwall is the real amplitude of the tangential component of the magnetic field at the wall’s surface.
The total power loss dPloss/dz per unit length of a waveguide, i.e. the right-hand side of Eq. (213), now
may be calculated by the integration of this expression along the contour(s) limiting the cross-section of
all conducting walls. Since our calculation is only valid for low losses, we may ignore their effect on the
field distribution, so the unperturbed distributions may be used both in Eq. (219), i.e. in the numerator of
Eq. (214), and also for the calculation of the average propagating power, i.e. the denominator of Eq.
(214) – as the integral of the Poynting vector over the cross-section of the waveguide.
Let us see how this approach works for the TEM mode in one of the simplest transmission lines,
the coaxial cable (Fig. 20). As we already know from Sec. 5, in the coarse-grain approximation,
implying negligible power loss, the TEM mode field distributions between the two conductors are the
same as in statics, namely:
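$$H_z = 0, \quad H_\rho = 0, \quad H_\varphi(\rho) = H_0\,\frac{a}{\rho}, \qquad (7.220)$$
where H0 is the field’s amplitude on the surface of the inner conductor, and
$$E_z = 0, \quad E_\rho(\rho) = Z H_\varphi(\rho) = Z H_0\,\frac{a}{\rho}, \quad E_\varphi = 0, \quad \text{where } Z \equiv \left(\frac{\mu}{\varepsilon}\right)^{1/2}. \qquad (7.221)$$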
Neglecting the power losses for a minute, we may plug these expressions into Eq. (42) to calculate the
time-averaged Poynting vector:
$$\bar{S} = \frac{Z\,|H_\varphi(\rho)|^2}{2} = \frac{Z\,|H_0|^2 a^2}{2\rho^2}, \qquad (7.222)$$
and from it, the total wave power flow through the cross-section:
$$P = \int_A \bar{S}\,d^2r = 2\pi\,\frac{Z\,|H_0|^2 a^2}{2}\int_a^b \frac{\rho\,d\rho}{\rho^2} = \pi Z\,|H_0|^2 a^2 \ln\frac{b}{a}. \qquad (7.223)$$
Next, for this particular system (Fig. 20), the contours limiting the wall cross-section are circles
of radii ρ = a (where the surface field amplitude Hwall equals, in our notation, H0), and ρ = b (where,
according to Eq. (220), the field is a factor of b/a lower). As a result, for the power loss per unit length,
Eq. (219) yields
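$$\frac{dP_{\rm loss}}{dz} = \oint_{C_a + C_b} \frac{dP_{\rm loss}}{dA}\,dl = \left(2\pi a\,|H_0|^2 + 2\pi b\left|H_0\frac{a}{b}\right|^2\right)\frac{\mu\omega\delta_{\rm s}}{4} = \frac{\pi}{2}\,a\left(1 + \frac{a}{b}\right)\mu\omega\delta_{\rm s}\,|H_0|^2. \qquad (7.224)$$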
88 As follows from Eq. (78), which may be used for crude estimates even in cases of arbitrary wave incidence, this
condition is necessary for low attenuation: α << k only if f << 1.
Note that at a << b, the losses in the inner conductor dominate, despite its smaller surface, because of
the higher surface field.
Now we may plug Eqs. (223) and (224) into the definition (214) of α, to calculate the skin-effect
contribution to the attenuation constant:
$$\alpha_{\rm skin} \equiv \frac{dP_{\rm loss}/dz}{P} = \frac{1}{2\ln(b/a)}\left(\frac{1}{a} + \frac{1}{b}\right)\frac{\mu\omega\delta_{\rm s}}{Z} \equiv \frac{k\delta_{\rm s}}{2\ln(b/a)}\left(\frac{1}{a} + \frac{1}{b}\right). \qquad (7.225)$$
This result shows that the relative (dimensionless) attenuation, α/k, scales approximately as the ratio
δs/min[a, b], in a semi-quantitative agreement with the plane-wave result (78).
Let us use this result to evaluate α for the standard TV cable RG-6/U, with copper conductors of
diameters 2a = 1 mm, 2b = 4.7 mm, and ε ≈ 2.2ε0 and μ ≈ μ0. According to Eq. (6.33), for f = 100 MHz
(i.e. ω ≈ 6.3×10⁸ s⁻¹) the skin depth of pure copper at room temperature (with σ ≈ 6.0×10⁷ S/m) is close
to 6.5×10⁻⁶ m, while k = ω(εμ)1/2 = (ε/ε0)1/2(ω/c) ≈ 3.1 m⁻¹. As a result, the attenuation is rather low: αskin ≈
0.016 m⁻¹, so the attenuation length scale ld ≡ 1/α is about 60 m. Hence the attenuation in a cable
connecting a roof TV antenna or a cable distribution box to a TV set is not a big problem, though using
a worse conductor, e.g., steel, would make the losses rather noticeable. (Hence the current worldwide
shortage of copper.) However, the use of such cable in the X-band (f ~ 10 GHz) is more problematic.
Indeed, though the skin depth δs ∝ ω⁻¹ᐟ² decreases with frequency, the wavelength drops, i.e. k
increases, even faster (k ∝ ω), so the attenuation αskin ∝ ω¹ᐟ² becomes close to 0.16 m⁻¹, i.e. ld to ~6 m.
This is why at such frequencies, it may be necessary to use rectangular waveguides, with their larger
internal dimensions a, b ~ 1/k, and hence lower attenuation. Let me leave the calculation of this
attenuation, using Eq. (219) and the results derived in Sec. 7, for the reader’s exercise.
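The numerical estimates above are easy to reproduce. The following Python sketch evaluates Eq. (225) for the cable parameters quoted in the text; the function names are illustrative, and the skin depth is taken in the standard form δs = (2/μσω)1/2, with μ = μ0 assumed for both the conductors and the filling dielectric.

```python
import math

MU0 = 4e-7 * math.pi       # magnetic constant, H/m
C = 299_792_458.0          # speed of light in free space, m/s

def skin_depth(omega, sigma, mu=MU0):
    """Skin depth (2 / (mu * sigma * omega))**0.5, cf. Eq. (6.33)."""
    return math.sqrt(2.0 / (mu * sigma * omega))

def coax_alpha_skin(f, a, b, eps_r, sigma):
    """Skin-effect attenuation constant of a coaxial cable, Eq. (7.225), in 1/m."""
    omega = 2 * math.pi * f
    k = math.sqrt(eps_r) * omega / C          # assumes mu = mu0 in the dielectric
    delta = skin_depth(omega, sigma)
    return k * delta / (2 * math.log(b / a)) * (1 / a + 1 / b)

if __name__ == "__main__":
    for f in (100e6, 10e9):
        alpha = coax_alpha_skin(f, a=0.5e-3, b=2.35e-3, eps_r=2.2, sigma=6.0e7)
        print(f"f = {f:.0e} Hz: alpha = {alpha:.3f} 1/m, "
              f"{4.343 * alpha:.3f} dB/m, l_d = {1 / alpha:.0f} m")
```

The factor 4.343 converts the attenuation constant to decibels per meter, as explained in the footnote above; the tenfold growth of the attenuation between 100 MHz and 10 GHz reflects the scaling αskin ∝ ω1/2.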
The main effect of dissipation on free oscillations in resonators is different: here it leads to a
gradual decay of the oscillating fields’ energy U in time. A useful dimensionless measure of this decay,
called the Q factor, is commonly defined by writing the following temporal analog of Eq. (213):89
$$-dU \equiv P_{\rm loss}\,dt = \frac{\omega}{Q}\,U\,dt, \qquad (7.226)$$
where ω is the resonance frequency in the loss-free limit, and
Q-factor:
$$\frac{\omega}{Q} \equiv \frac{P_{\rm loss}}{U}. \qquad (7.227)$$
The solution of Eq. (226),
$$U(t) = U(0)\,e^{-t/\tau}, \quad \text{with } \tau \equiv \frac{Q}{\omega} = \frac{Q/2\pi}{\omega/2\pi} = \frac{QT}{2\pi}, \qquad (7.228)$$
which is the temporal analog of Eq. (215), shows the physical meaning of the Q-factor: the characteristic
time τ of the oscillation energy’s decay is (Q/2π) times longer than the oscillation period T = 2π/ω.
(Another useful interpretation of Q comes from the universal relation90
89 As losses grow, the oscillation waveform deviates from the sinusoidal one, and the very notion of “oscillation
frequency” becomes vague. As a result, the parameter Q is well-defined only if it is much higher than 1.
90 See, e.g., CM Sec. 5.1.
$$Q = \frac{\omega}{\Delta\omega}, \qquad (7.229)$$
where Δω is the so-called FWHM91 bandwidth of the resonance, namely the difference between the two
values of the external signal frequency, one above and one below ω, at which the energy of the
oscillations induced in the resonator by an input signal is half of its resonance value.)
In the important particular case of a resonant cavity formed by the insertion of metallic walls into
a TEM transmission line of a small cross-section (with the linear size scale a much less than the
wavelength λ), there is no need to calculate the Q-factor directly, provided that the line attenuation
coefficient α is already known. In fact, as was discussed in Sec. 8 above, the standing waves in such a
resonator, of the length given by Eq. (196): l = p(λ/2) with p = 1, 2,…, may be understood as an overlap
of two TEM waves propagating in opposite directions, or in other words, a traveling wave plus its
reflection from one of the ends, the whole roundtrip taking time Δt = 2l/v = pλ/v = 2πp/ω = pT.
According to Eq. (215), at this distance, the wave’s power drops by a factor of exp{-2αl} = exp{-αpλ}.
On the other hand, the same decay may be viewed as taking place in time, and according to Eq. (228),
results in the drop by a factor of exp{-Δt/τ} = exp{-(pT)/(Q/ω)} = exp{-2πp/Q}. Comparing these two
exponents, we get
Q vs. α:
$$Q = \frac{2\pi}{\alpha\lambda} \equiv \frac{k}{\alpha}. \qquad (7.230)$$
This simple relation neglects the losses at the wave reflection from the walls limiting the
resonator length. This approximation is indeed legitimate at l >> ; if this relation is violated, or if we
are dealing with more complex resonator modes (such as those based on the reflection of E or H waves),
the Q-factor may be different from that given by Eq. (230), and needs to be calculated directly from Eq.
(227). A substantial relief for such a direct calculation is that, just as in the calculation of small attenuation
in waveguides, in the low-loss limit (Q >> 1), both the numerator and denominator of the right-hand
side of that formula may be calculated neglecting the effects of the energy loss on the field distribution
in the resonator. I am leaving such a calculation, for a few simple resonant cavities, including the
rectangular and the circular ones, for the reader’s exercise.
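For a feeling of the numbers, here is a minimal Python sketch applying Eqs. (228) and (230) to a hypothetical resonator made of a section of the coaxial cable discussed above, at f = 100 MHz. The function names and the choice p = 3 are illustrative assumptions; the values k ≈ 3.1 m⁻¹ and α ≈ 0.016 m⁻¹ are the estimates quoted earlier in this section.

```python
import math

def q_from_attenuation(k, alpha):
    """Q-factor of a TEM-line resonator from Eq. (7.230): Q = k / alpha."""
    return k / alpha

def decay_time(q, omega):
    """Energy decay time tau = Q / omega, Eq. (7.228)."""
    return q / omega

if __name__ == "__main__":
    k, alpha = 3.1, 0.016            # 1/m, estimates quoted above for f = 100 MHz
    omega = 2 * math.pi * 100e6      # 1/s
    q = q_from_attenuation(k, alpha)
    print(f"Q = {q:.0f}, tau = {decay_time(q, omega) * 1e9:.0f} ns")

    # Consistency check of the round-trip argument for p half-waves:
    p = 3
    lam = 2 * math.pi / k
    spatial = math.exp(-alpha * p * lam)        # exp{-2 alpha l}, with l = p * lambda / 2
    temporal = math.exp(-2 * math.pi * p / q)   # exp{-Delta t / tau}, with Delta t = p * T
    print(spatial, temporal)
```

The two printed decay factors coincide, which is just the statement that the spatial and temporal descriptions of the same losses are equivalent. As noted above, this estimate ignores the losses at the wave reflection from the end walls.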
To conclude this chapter, the last remark: in some distributed resonators (including certain
dielectric resonators92 and metallic cavities with holes in their walls), additional energy losses due to the
wave radiation into the environment are also possible. In some simple cases (say, the Fabry-Pérot
interferometer shown in Fig. 31), the calculation of these radiative losses is straightforward, but
sometimes it requires more elaborate approaches that will be discussed in the next chapter.
7.2. The electric polarization of some material responds to an electric field step94 in the
following way:
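$$P(t) = \varepsilon_1 E_0\left(1 - e^{-t/\tau}\right), \quad \text{if } E(t) = E_0 \times \begin{cases} 0, & \text{for } t < 0,\\ 1, & \text{for } 0 < t, \end{cases}$$
where τ > 0 and ε1 are some constants. Calculate the complex permittivity ε(ω) of this material, and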
discuss a possible simple physical model giving such dielectric response.
7.3. Calculate the complex permittivity ε(ω) of a material whose dielectric-response Green’s
function, defined by Eq. (23), is
$$G(\theta) = G_0\left(1 - e^{-\theta/\tau}\right),$$
with some positive constants G0 and τ. What is the difference between this dielectric response and the
apparently similar one considered in the previous problem?
7.4. Use the oscillator model of an atom, given by Eq. (30), to calculate its average potential
energy in a uniform, sinusoidal ac electric field, and use the result to calculate the potential profile
created for the atom by a standing electromagnetic wave with the electric field amplitude E(r).
7.5. The solution of the previous problem shows that a standing electromagnetic wave may exert
a time-averaged force on an otherwise free non-relativistic charged particle. Reveal the physics of this
force by writing and solving the equations of motion of such a particle in:
(i) a linearly-polarized monochromatic plane traveling wave, and
(ii) a similar but standing wave.
7.6. Use the first of Eqs. (54) to relate the integral " d to the plasma frequency for the
0
7.7. Prove that Eq. (6.42) cannot be correct for all frequencies, and suggest its correction making
the result compatible with both the causality principle and the physical model (6.39).
7.8. Calculate, sketch, and discuss the dispersion relation for electromagnetic waves propagating
in a medium described by the Lorentz oscillator model (32), for the case of negligible damping.
7.9. As was briefly discussed in Sec. 2,95 a wave pulse of a finite but relatively large spatial
extension Δz >> 2π/k may be formed as a wave packet – a sum of sinusoidal waves with wave
vectors k within a relatively narrow interval. Consider an electromagnetic plane wave packet of this
type, with the electric field distribution
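$$\mathbf{E}(\mathbf{r}, t) = \mathrm{Re}\int \mathbf{E}_k\,e^{i(kz - \omega_k t)}\,dk, \quad \text{with } \omega_k\left[\varepsilon(\omega_k)\,\mu(\omega_k)\right]^{1/2} \equiv k,$$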
propagating along the z-axis in an isotropic, linear, and dissipation-free (but not necessarily dispersion-
free) medium. Express the full energy of the packet (per unit area of the wave’s front) via the complex
amplitudes Ek, and discuss its dependence on time.
7.10. Prove the Lorentz reciprocity relation (6.121) for a linear isotropic medium.
7.11.* A plane wave of frequency ω is normally incident, from free space, on a plane surface of a
collision-free plasma with the electron density growing linearly and slowly with the distance from the
surface: n = z for z 0, where > 0 is a small constant. Calculate the functional form of the resulting
standing wave’s “tail” inside the plasma.
7.12.* Analyze the effect of a time-independent uniform magnetic field B0, parallel to the
direction n of an electromagnetic wave propagation, on the wave’s dispersion in plasma, within the
same simple model that was used in Sec. 2 for the derivation of Eq. (38). (Limit your analysis to
relatively weak waves, whose magnetic field is much smaller than B0.)
Hint: You may like to represent the incident wave as a linear superposition of two circularly
polarized waves, with opposite polarization directions.
7.13. A monochromatic plane electromagnetic wave is normally incident, from free space, on a
uniform slab with electric permittivity ε and magnetic permeability μ, with the slab’s thickness d
comparable with the wavelength.
(i) Calculate the power transmission coefficient T, i.e. the fraction of the incident wave’s power,
that is transmitted through the slab.
(ii) Assuming that ε and μ are frequency-independent and positive, analyze in detail the
frequency dependence of T. In particular, how does the function T(ω) depend on the slab’s thickness d
and the wave impedance Z ≡ (μ/ε)1/2 of its material?
7.14. A plane electromagnetic wave with a free-space wave number k0 is normally incident on a
planar conducting film of thickness d ~ δs << 1/k0. Calculate the power transmission coefficient of the
system and analyze the result in the limits of small and large values of the ratio d/δs.
7.15. One of the results of the previous problem’s solution was the following expression for the
coefficient of power transmission of a plane electromagnetic wave through a thin conducting film of
thickness d << δs, λ, at normal incidence:
$$T = \frac{1}{\left(1 + Z_0/2R_\square\right)^2},$$
where R□ ≡ 1/σd is the sheet resistance (“resistance per square”) of the film. Derive this formula in a
simpler way, utilizing the smallness of d from the very beginning. Also, calculate the power reflection
coefficient R, compare it with T, and comment.
7.16. A plane wave of frequency ω is normally incident, from
free space, on a plane surface of a material with real electric
permittivity ε’ and magnetic permeability μ’. To minimize the wave’s
reflection from the surface, it may be covered with a layer, of thickness
d, of another transparent material – see the figure on the right.
Calculate the optimal values of ε, μ, and d.
7.17. A monochromatic plane wave is incident from inside a medium with εμ > ε0μ0 on its planar
surface, at an incidence angle θ larger than the critical angle θc = sin–1(ε0μ0/εμ)1/2. Calculate the depth
of the evanescent wave penetration into the free space, and analyze its dependence on θ. Does the result
depend on the wave’s polarization?
7.18. Calculate the critical angle θc for a wave of frequency ω, incident from free space upon a
planar surface of a plasma with electron density n, and discuss the implications of the result for
ultraviolet and X-ray optics.
7.19. Analyze the possibility of propagation of surface electromagnetic waves along a planar
boundary between plasma and free space. In particular, calculate and analyze the dispersion relation of
the waves.
Hint: Assume that the magnetic field of the wave is parallel to the boundary and perpendicular to
the wave’s propagation direction. (After solving the problem, justify this mode choice.)
7.20. Light from a very distant source reaches an observer through a
plane layer of nonuniform medium with a certain 1D gradient of its refraction
index n(z), at angle θ0 – see the figure on the right. What is the genuine direction
θi to the source, if n(z) → 1 at z → ∞? (This problem is evidently important for
high-precision astronomical measurements at the Earth’s surface.)
7.21. Calculate the TEM impedance ZW of uniform TEM transmission lines with well-conducting
electrodes and the cross-sections shown in the figure below:
7.22. Modify the solution of Task (ii) of the previous problem for a superconductor microstrip
line, taking into account the magnetic field’s penetration into both the strip and the ground plane.
7.23.* What lumped ac circuit would be equivalent to the TEM-line system shown in Fig. 19,
with an incident wave’s power Pi? Assume that the wave reflected from the lumped load circuit does
not return to it.
7.25. Represent the fundamental H10 wave in a rectangular waveguide (Fig. 22) with a sum of
two plane waves, and discuss the physics behind such a representation.
7.26.* For the coaxial cable (see, e.g., Fig. 20), find the lowest non-TEM mode and calculate its
cutoff frequency.
7.29. Prove that TEM-like waves may propagate, in the radial direction, in the
free space between two coaxial, round, well-conducting cones – see the figure on the
right. Can this system be characterized by a certain transmission line impedance ZW, as
defined by Eq. (115)?
7.30. Use the recipe outlined in Sec. 7 to prove the characteristic equation (161) for the HE and
EH waves in step-index optical fibers with a round cross-section.
7.31. Derive an approximate equation describing spatial variations of the complex amplitude of a
general monochromatic paraxial beam propagating in a uniform medium, for the case when these
variations are sufficiently slow. Is the Gaussian beam described by Eq. (181) one of the possible
solutions of this equation? Give your interpretation of the last result.
7.32. Calculate the lowest resonance frequencies and the corresponding
field distributions of standing electromagnetic waves inside a round cylindrical
cavity with well-conducting walls (see the figure on the right), neglecting the
skin depth δs in comparison with l and R.
7.33. Analyze electromagnetic waves that may propagate inside a relatively narrow gap between
two well-conducting concentric spherical shells of radii R and R + d, in the limit d << R.
(i) Within the coarse-grain approximation, derive the 2D equation describing such waves with
relatively large wavelengths λ ~ R >> d.
(ii) Calculate the lowest resonance frequencies of the system.
7.35. A plane monochromatic wave propagates through a medium with an Ohmic conductivity σ
and negligible electric and magnetic polarization effects. Calculate the wave’s attenuation and relate the
result to a certain calculation carried out in Chapter 6.
7.36. Generalize the telegrapher’s equations (110)-(111) by accounting for small energy losses:
(i) in the transmission line’s conductors, and
(ii) in the medium separating the conductors,
using their simplest (Ohmic) models. Formulate the conditions of validity of the resulting equations.
7.37. Calculate the skin-effect contribution to the attenuation constant of a TEM wave in the
microstrip line discussed in Problem 21 (ii).
7.38. Calculate the skin-effect contribution to the attenuation coefficient α defined by Eq. (214),
for the fundamental (H10) mode propagating in a waveguide with well-conducting walls, of a rectangular
cross-section – see Fig. 22. Use the results to evaluate the wave decay length ld ≡ 1/α of a 10 GHz wave
in the standard X-band waveguide WR-90 (with copper walls, a = 23 mm, b = 10 mm, and no dielectric
filling), at room temperature. Compare the result with that (obtained in Sec. 9) for the standard TV
coaxial cable, at the same frequency.
7.40. For a rectangular cavity of dimensions a×b×l, with b ≤ a, l, calculate the Q-factor of the
fundamental oscillation mode, due to the skin-effect losses in its conducting walls. Evaluate the factor
for a 23×23×10 mm3 cavity with copper walls, at room temperature.
7.41.* Calculate the lowest eigenfrequency and the Q-factor (due to
the skin-effect losses) of the axially symmetric toroidal cavity with well-conducting
walls and the interior’s cross-section shown in the figure on the
right, for the case d << r, R.96
7.42. Express the contribution to the damping coefficient (the reciprocal Q-factor) of a resonant
cavity, by small energy losses in the dielectric that fills it, via the complex functions ε(ω) and μ(ω) of
the material.
7.43. For the dielectric Fabry-Pérot resonator (Fig. 31) with the normal wave incidence, calculate
the Q-factor due to radiation losses, in the limit of a strong impedance mismatch (Z >> Z0), by using two
approaches:
(i) from the energy balance, using Eq. (227), and
(ii) from the frequency dependence of the power transmission coefficient, using Eq. (229).
Compare the results.
96 Such resonators are broadly used in particle accelerators and also in vacuum electron devices for high-power
microwave amplification and generation (e.g., the so-called klystrons), where the electric field has to be
concentrated in the region of charged particle passage – typically, along the symmetry axis (in the figure above,
the z-axis), through a pair of small holes in the cavity’s walls, which do not affect the field distribution
substantially.
1 When necessary (e.g., at the discussion of the Cherenkov radiation in Sec. 10.5), it will be not too hard to
generalize these results to a dispersive medium.
As we know, these expressions may be derived by, first, calculating the potential of a point source, and
then using the linear superposition principle for a system of such sources.
Let us do the same for the time-dependent case, starting from the field induced by a time-
dependent point charge at the origin:2
$$\rho(\mathbf{r}, t) = q(t)\,\delta(\mathbf{r}). \qquad (8.5)$$
In this case, Eq. (3a) is homogeneous everywhere but the origin:
$$\nabla^2\varphi - \frac{1}{v^2}\frac{\partial^2\varphi}{\partial t^2} = 0, \quad \text{for } r \neq 0. \qquad (8.6)$$
Due to the spherical symmetry of the problem, it is natural to look for a spherically symmetric solution
to this equation.3 Thus, we may simplify the Laplace operator correspondingly (as was repeatedly done
earlier in this course), so Eq. (6) becomes
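$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\varphi}{\partial r}\right) - \frac{1}{v^2}\frac{\partial^2\varphi}{\partial t^2} = 0, \quad \text{for } r \neq 0. \qquad (8.7)$$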
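By introducing a new variable χ(r, t) ≡ rφ(r, t), we may reduce Eq. (7) to the 1D wave equation
$$\frac{\partial^2\chi}{\partial r^2} - \frac{1}{v^2}\frac{\partial^2\chi}{\partial t^2} = 0, \quad \text{for } r \neq 0. \qquad (8.8)$$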
From discussions in Chapter 7,4 we know that its general solution may be represented as
$$\chi(r, t) = \chi_{\rm out}\left(t - \frac{r}{v}\right) + \chi_{\rm in}\left(t + \frac{r}{v}\right), \qquad (8.9)$$
where χin and χout are (so far) arbitrary functions of one variable. The physical sense of φout = χout/r is a
spherical wave propagating from our source (located at r = 0) to outer space, i.e. exactly the solution we
are looking for. On the other hand, φin = χin/r describes a spherical wave that could be created by some
distant spherically-symmetric source, and that converges exactly on our charge located at the origin –
evidently not the effect we want to consider here. Discarding this term, and returning to φ = χ/r, we get
$$\varphi(r, t) = \frac{1}{r}\,\chi_{\rm out}\left(t - \frac{r}{v}\right), \quad \text{for } r \neq 0. \qquad (8.10)$$
In order to calculate the function χout, let us consider the solution (10) at distances r so small (r
<< vt) that the time-derivative term in Eq. (3a), with the right-hand side (5),
$$\nabla^2\varphi - \frac{1}{v^2}\frac{\partial^2\varphi}{\partial t^2} = -\frac{q(t)}{\varepsilon}\,\delta(\mathbf{r}), \qquad (8.11)$$
2 Admittedly, this expression does not satisfy the continuity equation (4.5), but this deficiency will be corrected
imminently, at the linear superposition stage – see Eq. (17) below.
3 Let me emphasize that this is not the general solution to Eq. (6). For example, it does not describe the possible
waves created by other sources, that pass by the considered charge q(t). However, such fields are irrelevant to our
current task: to calculate the field induced by the charge q(t). The solution becomes general when it is integrated
(as it will be) over all relevant charges.
4 See also CM Sec. 6.3.
is much smaller than the spatial derivative term (which diverges at r → 0). Then Eq. (11) is reduced to
the Poisson equation, whose solution (4a), for the source (5), is
$$\varphi(r \to 0, t) = \frac{q(t)}{4\pi\varepsilon r}. \qquad (8.12)$$
Now requiring the two solutions, Eqs. (10) and (12), to coincide at r << vt, we get χout(t) = q(t)/4πε, so
Eq. (10) becomes
$$\varphi(r, t) = \frac{1}{4\pi\varepsilon r}\,q\left(t - \frac{r}{v}\right). \qquad (8.13)$$
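The fact that the retarded form (13) satisfies the homogeneous wave equation (7) at r ≠ 0 may be checked numerically. The sketch below is only an illustration under stated assumptions: the Gaussian q(t), the wave speed value, and the function names are hypothetical, and the constant factor 1/4πε is set to 1 because it does not affect the check.

```python
import math

V = 2.0  # wave speed, arbitrary units (illustrative value)

def q(t):
    """Hypothetical smooth charge variation q(t): a Gaussian pulse."""
    return math.exp(-t * t)

def phi(r, t):
    """Retarded potential (8.13), with the constant 1/(4 pi eps) set to 1."""
    return q(t - r / V) / r

def wave_residual(r, t, h=1e-3):
    """Left-hand side of Eq. (8.7), estimated by central finite differences."""
    d1r = (phi(r + h, t) - phi(r - h, t)) / (2 * h)
    d2r = (phi(r + h, t) - 2 * phi(r, t) + phi(r - h, t)) / h ** 2
    d2t = (phi(r, t + h) - 2 * phi(r, t) + phi(r, t - h)) / h ** 2
    laplacian = d2r + 2 * d1r / r   # equals (1/r^2) d/dr (r^2 dphi/dr)
    return laplacian - d2t / V ** 2

if __name__ == "__main__":
    print(wave_residual(1.5, 1.0))   # should be close to zero
```

The residual is not exactly zero only because of the finite-difference truncation; it shrinks as the step h is reduced (until rounding errors take over).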
Just as was repeatedly done in statics, this result may be readily generalized for the arbitrary
position r’ of the point charge:
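$$\rho(\mathbf{r}, t) = q(t)\,\delta(\mathbf{r} - \mathbf{r}') \equiv q(t)\,\delta(\mathbf{R}), \qquad (8.14)$$
where R is the distance between the field observation point r and the source position point r’, i.e. the
length of the vector,
$$\mathbf{R} \equiv \mathbf{r} - \mathbf{r}', \qquad (8.15)$$
connecting these points – see Fig. 1.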
Fig. 8.1. Calculating the retarded potentials of a localized source.
where the integration is extended over all charges of the system under analysis. Solving Eq. (4b)
absolutely similarly, for the vector potential we get5
Retarded vector potential:
$$\mathbf{A}(\mathbf{r}, t) = \frac{\mu}{4\pi}\int \mathbf{j}\left(\mathbf{r}', t - \frac{R}{v}\right)\frac{d^3r'}{R}. \qquad (8.17b)$$
5 As should be clear from the analogy of Eqs. (17) with their stationary forms (4), which were discussed,
respectively, in Chapters 1 and 5, in the Gaussian units the retarded potential formulas are valid with the
coefficient 1/4πε dropped in Eq. (17a), and the coefficient μ/4π replaced with 1/c in Eq. (17b).
(Now nothing prevents the functions ρ(r, t) and j(r, t) from satisfying the continuity equation.)
The solutions expressed by Eqs. (17) are traditionally called the retarded potentials, the name
signifying the fact that the observed fields are “retarded” (in the meaning “delayed”) in time by Δt = R/v
relative to the source variations – physically, because of the finite speed v of the electromagnetic wave
propagation. Note that, very remarkably, these simple expressions are exact solutions of the
macroscopic Maxwell equations (again, in a uniform, linear, dispersion-free medium) for an arbitrary
distribution of stand-alone charges and currents. They also may be considered as the general solutions
of these equations, provided that the integration has been extended over all field sources in the Universe
– or at least over those ones that affect our observations.
Note also that due to the mathematical similarity of the microscopic and macroscopic Maxwell
equations, Eqs. (17) are valid, with the coefficient replacement ε → ε0 and μ → μ0, for the exact, rather
than the macroscopic fields, provided that the functions ρ(r, t) and j(r, t) describe not only stand-alone
but all charges and currents in the system. (Alternatively, this statement may be formulated as the
validity of Eqs. (17), with the same coefficient replacement, in free space.)
Finally, note that Eqs. (17) may be plugged into Eqs. (1), giving (after an explicit differentiation)
the so-called Jefimenko equations6 for fields E and B – similar in structure to Eqs. (17), but more
cumbersome. Conceptually, the existence of such equations is good news, because they are free from the
gauge ambiguity pertinent to the potentials φ and A. However, the practical value of these explicit
expressions for the fields is not overly high: for all applications I am aware of, it is easier to use Eqs.
(17) to calculate the particular expressions for the potentials first, and only then calculate the fields from
Eqs. (1). Let me now present an (arguably, the most important) example of this approach.
to the scalar function f(R) ≡ R (for which ∇f(r) = ∇R = n, where n ≡ r/r is the unit vector directed
toward the observation point – see Fig. 1) to approximate the distance R as
$$R \approx r - \mathbf{r}'\cdot\mathbf{n}. \tag{8.19}$$
In each of the retarded potential formulas (17), R participates in two places: in the denominator
and in the source’s time argument. If ρ and j change in time on the scale ~1/ω, where ω is some
characteristic frequency, then any change of the argument (t – R/v) on that time scale, for example due
to a change of R on the spatial scale ~v/ω = 1/k, may substantially change these functions. Thus, the
expansion (19) may be applied to R in the argument (t – R/v) only if ka << 1, i.e. if the system’s size a
6 They were published by O. D. Jefimenko only in 1966, but the Fourier representation of the same result was
obtained much earlier (in 1912) by G. A. Scott.
is much smaller than the radiation wavelength λ = 2π/k. On the other hand, the function 1/R changes
relatively slowly, and for it even the first term of the expansion (19) gives a good approximation as soon
as a << r, R. In the latter approximation alone, Eq. (17a) yields
$$\phi(\mathbf{r},t) \approx \frac{1}{4\pi\varepsilon r}\int \rho\!\left(\mathbf{r}', t-\frac{R}{v}\right) d^3r' \equiv \frac{1}{4\pi\varepsilon r}\,Q\!\left(t-\frac{R}{v}\right), \tag{8.20}$$
where Q(t) is the net electric charge of the localized system. Due to the charge conservation, this charge
cannot change with time, so the approximation (20) describes just a static Coulomb field of our
localized source, rather than a radiated wave.
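The quality of the expansion (19) is easy to probe numerically. The following minimal sketch compares the exact distance R = |r – r'| with its first-order approximation r – r'·n; the particular source point and observation direction are hypothetical choices made only for illustration, and the function names are not taken from the text.

```python
import math

def exact_distance(r_vec, rp_vec):
    """Exact distance R = |r - r'| between the observation and source points."""
    return math.dist(r_vec, rp_vec)

def far_distance(r_vec, rp_vec):
    """First-order approximation (8.19): R ~ r - r'.n, with n = r/r."""
    r = math.hypot(*r_vec)
    n = [x / r for x in r_vec]
    return r - sum(a * b for a, b in zip(rp_vec, n))

# Hypothetical geometry: a source point at distance a = 1 from the origin,
# observed from increasing distances r along one fixed (tilted) direction.
rp = (0.6, 0.0, 0.8)
for r in (10.0, 100.0, 1000.0):
    obs = (r * math.sin(1.0), 0.0, r * math.cos(1.0))
    err = exact_distance(obs, rp) - far_distance(obs, rp)
    print(f"r = {r:7.1f}   R_exact - R_approx = {err:.3e}")
```

The printed error should fall roughly as 1/r, as expected from the next term of the Taylor expansion, which is of the order of a²/r.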
Let us, however, apply a similar approximation to the vector potential (17b):
$$\mathbf{A}(\mathbf{r},t) \approx \frac{\mu}{4\pi r}\int \mathbf{j}\!\left(\mathbf{r}', t-\frac{R}{v}\right) d^3r'. \tag{8.21}$$
According to Eq. (5.87), the right-hand side of this expression vanishes in statics, but in dynamics, this
is no longer true. For example, if the current is due to some non-relativistic motion7 of a system of point
charges qk, we can write
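$$\int \mathbf{j}(\mathbf{r}',t)\,d^3r' = \sum_k q_k\dot{\mathbf{r}}_k(t) = \frac{d}{dt}\sum_k q_k\mathbf{r}_k(t) \equiv \dot{\mathbf{p}}(t), \tag{8.22}$$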
where p(t) is the dipole moment of the localized system, defined by Eq. (3.6). Now, after the integration,
we may keep only the first term of the approximation (19) in the argument (t – R/v) as well, getting
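$$\mathbf{A}(\mathbf{r},t) \approx \frac{\mu}{4\pi r}\,\dot{\mathbf{p}}\!\left(t-\frac{r}{v}\right), \quad \text{for } a \ll R, \frac{1}{k}. \tag{8.23}$$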
Let us analyze what exactly this result describes. The second of Eqs. (1) allows us to calculate
the magnetic field by the spatial differentiation of A. At large distances r >> λ (i.e. in the so-called
far-field zone), where Eq. (23) describes a locally-plane wave, the dominating contribution to this derivative
is given by the dipole moment factor:
$$\mathbf{B}(\mathbf{r},t) = \frac{\mu}{4\pi r}\nabla\times\dot{\mathbf{p}}\!\left(t-\frac{r}{v}\right) = -\frac{\mu}{4\pi r v}\,\mathbf{n}\times\ddot{\mathbf{p}}\!\left(t-\frac{r}{v}\right). \tag{8.24}$$
This expression means that the magnetic field, at the observation point, is perpendicular to the vectors n
and (the retarded value of) p̈, and its magnitude is
$$B = \frac{\mu}{4\pi r v}\left|\ddot{\mathbf{p}}\!\left(t-\frac{r}{v}\right)\right|\sin\Theta, \quad \text{i.e. } H = \frac{1}{4\pi r v}\left|\ddot{\mathbf{p}}\!\left(t-\frac{r}{v}\right)\right|\sin\Theta, \tag{8.25}$$
where Θ is the angle between those two vectors – see Fig. 2.8
7 For relativistic particles, moving with velocities of the order of speed of light, one has to be more careful. As the
result, I will postpone the discussion of their radiation until Chapter 10, i.e. until after the detailed discussion of
special relativity in Chapter 9.
8 From the first of Eqs. (1) for the electric field, in the first approximation (23), we would get –∂A/∂t = –(μ/4πr)
p̈(t – r/v) = –(Z/4πvr) p̈(t – r/v). The transverse component of this vector (see Fig. 2) is the proper electric field E
Fig. 8.2. Far-fields of a localized source, contributing to its electric dipole radiation.
The most important feature of this result is that the time-dependent field decreases very slowly
(only as 1/r) with the distance from the source, so the radial component of the corresponding Poynting
vector (7.9b),9
$$S_r = ZH^2 = \frac{Z}{(4\pi v r)^2}\left[\ddot{p}\!\left(t-\frac{r}{v}\right)\right]^2\sin^2\Theta, \tag{8.26}$$
drops only as 1/r², i.e. the full instant power P of the emitted wave,
$$P = \oint_{4\pi} S_r r^2\,d\Omega = \frac{Z}{(4\pi v)^2}\,\ddot{p}^2\, 2\pi\int_0^{\pi}\sin^3\Theta\,d\Theta = \frac{Z}{6\pi v^2}\,\ddot{p}^2, \tag{8.27}$$
does not depend on the distance from the source – as it should for radiation.10
This is the famous Larmor formula11 for the electric dipole radiation; it is the dominating
component of radiation by a localized system of charges – unless p = 0. Please notice its angular
dependence: the radiation vanishes at the axis of the retarded vector p̈ (where Θ = 0), and reaches its
maximum in the plane normal to that axis.
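The angular integration leading from Eq. (26) to Eq. (27) may be checked with a few lines of code. This is only a numerical sanity check (a plain midpoint rule with an arbitrarily chosen number of steps), not a part of the derivation; the function names are illustrative.

```python
import math

def larmor_angular_factor(steps=100_000):
    """Midpoint-rule value of 2*pi * integral_0^pi sin^3(Theta) dTheta."""
    h = math.pi / steps
    total = sum(math.sin((i + 0.5) * h) ** 3 for i in range(steps))
    return 2.0 * math.pi * total * h

def larmor_power(p_ddot, Z, v):
    """Instant radiated power, Eq. (8.27): P = Z * p_ddot^2 / (6 pi v^2)."""
    return Z * p_ddot**2 / (6.0 * math.pi * v**2)

factor = larmor_angular_factor()
print(factor, 8.0 * math.pi / 3.0)   # the two numbers should agree closely
```

Dividing the factor 8π/3 by (4π)² gives the coefficient 1/6π of the last form of Eq. (27).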
In order to find the average power, Eq. (27) has to be averaged over a sufficiently long time. In
particular, if the source is monochromatic, p(t) = Re[p_ω exp{–iωt}], with a time-independent vector
amplitude p_ω, such averaging may be carried out just over one period, giving an extra factor 2 in the
denominator:
$$\langle P\rangle = \frac{Z\omega^4}{12\pi v^2}\,|\mathbf{p}_\omega|^2. \tag{8.28}$$
The easiest application of this formula is to a point charge oscillating, with frequency ω, along a
straight line (which we may take for the z-axis), with amplitude a. In this case, p = qz(t)n_z = qa
Re[exp{–iωt}]n_z, and if the charge velocity amplitude, aω, is much less than the electromagnetic wave’s
speed v, we may use Eq. (28) with p_ω = qa, giving
= ZH×n of the radiated wave, while its longitudinal component is exactly compensated by (–∇φ) in the next term
of the Taylor expansion of Eq. (17a) in small parameter ka ~ a/λ << 1.
9 Note the “doughnut” dependence of S_r on the direction n, frequently used to visualize the dipole radiation.
10 In the Gaussian units, for free space (v = c), Eq. (27) reads P = (2/3c³) p̈².
11 Named after Joseph Larmor, who was the first to derive this formula (in 1897) for the particular case of a single
$$\langle P\rangle = \frac{Zq^2a^2\omega^4}{12\pi v^2}. \tag{8.29}$$
Applied to a classical picture of an electron (with q = –e ≈ –1.6×10⁻¹⁹ C), initially rotating about an
atom’s nucleus at an atomic distance a ~ 10⁻¹⁰ m, Eq. (29) shows12 that the energy loss due to the dipole
radiation is so large that it would cause the electron to collapse on the nucleus in just ~10⁻¹¹ s. In the
beginning of the 1900s, this result was one of the main arguments for the development of quantum
mechanics, which prevents such a collapse of electrons for their lowest-energy (ground) quantum state.
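The order of magnitude of this collapse time may be reproduced with a short estimate. The sketch below is not the calculation referred to above, but a common classroom version of it, made under assumptions of my own: a circular orbit around a single proton, the Larmor formula with the centripetal acceleration, and non-relativistic motion all the way down (which certainly fails at the very end of the spiral). Equating the radiated power to the loss rate of the orbital energy –e²/8πε0r gives dr/dt = –(4/3)c r_c²/r², and hence t = r0³/4c r_c², where r_c ≡ e²/4πε0mc². The initial radii used in the code are illustrative choices.

```python
import math

# Physical constants (SI), rounded CODATA values
E_CHARGE = 1.602176634e-19      # elementary charge, C
M_E = 9.1093837e-31             # electron mass, kg
EPS0 = 8.8541878128e-12         # electric constant, F/m
C = 2.99792458e8                # speed of light, m/s

def classical_radius(q=E_CHARGE, m=M_E):
    """r_c = q^2 / (4 pi eps0 m c^2)."""
    return q**2 / (4.0 * math.pi * EPS0 * m * C**2)

def collapse_time(r0):
    """Spiral-in time from the initial radius r0: t = r0^3 / (4 c r_c^2)."""
    return r0**3 / (4.0 * C * classical_radius() ** 2)

# Assumed initial radii: the Bohr radius, and the round value a ~ 1e-10 m
for r0 in (0.53e-10, 1.0e-10):
    print(f"r0 = {r0:.2e} m  ->  t = {collapse_time(r0):.1e} s")
```

Because of the strong (cubic) dependence on the initial radius, the result changes by almost an order of magnitude between these two choices of r0, so only the order of magnitude of such an estimate is meaningful.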
Another useful application of Eq. (28) is the radio wave radiation by a short, straight, symmetric
antenna which is fed, for example, by a TEM transmission line such as a coaxial cable – see Fig. 3.
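[Fig. 8.3: the antenna, directed along the z-axis, with its ends at z = ±l/2, fed with current I(0) at its center z = 0.]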
The exact solution of this problem is rather complicated because the law I(z) of the current
variation along the antenna’s length should be calculated self-consistently with the distribution of the
electromagnetic field induced by the current in the surrounding space. (Unfortunately, this fact is not
mentioned in some textbooks.) However, the current should be largest in the feeding point (in Fig. 3,
taken for z = 0) and vanish at the antenna’s ends (z = ±l/2), and hence we may guess that at l << λ, the
linear function
$$I(z) = I(0)\left(1-\frac{2}{l}|z|\right), \tag{8.30}$$
should be a good approximation of the actual distribution – as it indeed is. Now we can use the
continuity equation ∂Q/∂t = I, i.e. –iωQ_ω = I_ω, to calculate the complex amplitude Q_ω(z) = iI_ω(z)sgn(z)/ω
of the electric charge Q(z, t) = Re[Q_ω exp{–iωt}] of the wire’s segment [0, z], and from it, the amplitude
of the charge’s linear density:
$$\lambda_\omega(z) \equiv \frac{dQ_\omega(z)}{dz} = i\,\frac{2I_\omega(0)}{\omega l}\,\mathrm{sgn}\,z. \tag{8.31}$$
From here, the dipole moment’s amplitude is
$$p_\omega = 2\int_0^{l/2}\lambda_\omega(z)\,z\,dz = i\,\frac{I_\omega(0)}{2\omega}\,l, \tag{8.32}$$
so Eq. (28) yields
12 Actually, the formula needs a numerical coefficient adjustment to account for the electron’s orbital (rather than
linear) motion – the task left for the reader’s exercise. However, this adjustment does not affect the order-of-
magnitude estimate given above.
$$\langle P\rangle = Z\,\frac{\omega^4}{12\pi v^2}\,\frac{|I_\omega(0)|^2}{4\omega^2}\,l^2 = \frac{Z(kl)^2}{24\pi}\,\frac{|I_\omega(0)|^2}{2}, \tag{8.33}$$
where k = ω/v. The analogy between this result and the dissipation power ⟨P⟩ = ReZ |I_ω|²/2 in a lumped
linear circuit element enables the interpretation of the first fraction in the last form of Eq. (33) as the
real part of the antenna’s impedance:
$$\mathrm{Re}\,Z_A = Z\,\frac{(kl)^2}{24\pi}, \tag{8.34}$$
as felt by the transmission line.
According to Eq. (7.118), the wave traveling along the line toward the antenna is fully radiated,
i.e. not reflected back, only if ZA equals ZW of the line. As we know from Sec. 7.5 (and the solution of
the related problems), for typical TEM lines, ZW ~ Z0, while Eq. (34), which is only valid in the limit kl
<< 1, shows that for the radiation into free space (Z = Z0), ReZA is much less than Z0. Hence to reach the
impedance matching condition ZW = ZA, the antenna’s length should be increased – as a more involved
theory shows, to l ≈ λ/2. However, in many cases, practical considerations make short antennas
necessary. The example most often met nowadays is the cell phone antenna, which uses frequencies
close to 1 or 2 GHz, with free-space wavelengths λ between 15 and 30 cm, i.e. much larger than the
phone size.13 The quadratic dependence of the antenna’s efficiency on l, following from Eq. (34),
explains why every millimeter counts in the design of such antennas, and why the designs are carefully
optimized using software packages for the (virtually exact) numerical solution of the Maxwell equations
for the specific shape of the antenna and other phone parts.14
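To get a feeling for the numbers, Eq. (34) may be evaluated directly. In the sketch below, the antenna lengths and the 1-GHz frequency are hypothetical values picked to keep kl well below 1; they are not taken from any particular device.

```python
import math

Z0 = 376.73          # free-space wave impedance, ohms
C = 2.99792458e8     # speed of light, m/s

def radiation_resistance(length, freq, Z=Z0, v=C):
    """Re Z_A of a short antenna, Eq. (8.34); meaningful only for k*l << 1."""
    k = 2.0 * math.pi * freq / v
    return Z * (k * length) ** 2 / (24.0 * math.pi)

# Hypothetical example: 1-cm and 2-cm antennas at 1 GHz
for l in (0.01, 0.02):
    print(f"l = {l * 100:.0f} cm: Re Z_A = {radiation_resistance(l, 1e9):.3f} ohm")
```

Both values are a small fraction of an ohm, i.e. far below Z0, and doubling the length quadruples the result, illustrating the quadratic dependence discussed above.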
To conclude this section, let me note that if the wave source is not monochromatic, so p(t)
should be represented as a Fourier series,
$$\mathbf{p}(t) = \mathrm{Re}\sum_\omega \mathbf{p}_\omega e^{-i\omega t}, \tag{8.35}$$
the terms corresponding to the interference of spectral components with different frequencies are
averaged out at the time averaging of the Poynting vector, and the average radiated power is just a sum
of contributions (28) from all substantial frequency components.
13 The situation will be partly remedied by the planned transfer of wireless mobile technology to the next
generations, with the signal frequencies gradually moving up.
14 A partial list of popular software packages of this kind includes both publicly available codes such as Nec2
(whose various versions are available online, e.g., at http://www.qsl.net/4nec2/), and proprietary packages – such
as Momentum from Agilent Technologies (now owned by Hewlett-Packard), FEKO from EM Software &
Systems, and XFdtd from Remcom.
15 Named after Max Born, one of the founding fathers of quantum mechanics. However, the basic idea of this
approach was developed much earlier (in 1881) by Lord Rayleigh – born John William Strutt.
Fig. 8.4. Wave scattering (schematically).
As the first example of this approach, let us consider the scattering of a plane wave, propagating
in free space (Z = Z0, v = c), by an otherwise free16 charged particle whose motion may be described by
non-relativistic classical mechanics. (This requires, in particular, the incident wave to be not too
powerful, so the speed of the charge’s motion induced by the wave remains much lower than c.) As was
already discussed at the derivation of Eq. (7.32), in this case, the magnetic component of the Lorentz
force (5.10) is negligible in comparison with the force Fe = qE exerted by the wave’s electric field.
Thus, assuming that the incident wave is linearly polarized along some axis x, the equation of the
particle’s motion in the Born approximation is just mẍ = qE(t), so for the x-component p_x = qx of its
dipole moment we can write
$$\ddot{p} = q\ddot{x} = \frac{q^2}{m}E(t). \tag{8.36}$$
As we already know from Sec. 2, oscillations of the dipole moment lead to radiation of a wave with a
wide angular distribution of intensity; in our case, this is the scattered wave – see Fig. 4. Its full power
may be found by plugging Eq. (36) into Eq. (27):
$$P = \frac{Z_0}{6\pi c^2}\,\ddot{p}^2 = \frac{Z_0 q^4}{6\pi c^2 m^2}E^2(t), \tag{8.37}$$
so for the average power, we get
$$\langle P\rangle = \frac{Z_0 q^4}{12\pi c^2 m^2}\,|E_\omega|^2. \tag{8.38}$$
Since this power is proportional to the incident wave’s intensity ⟨S⟩, it is customary to characterize
the scattering ability of an object by the ratio,
$$\sigma \equiv \frac{\langle P\rangle}{\langle S_{\rm incident}\rangle} = \frac{\langle P\rangle}{|E_\omega|^2/2Z_0}, \tag{8.39}$$
which has the dimensionality of area, and is called the total cross-section of scattering.17 For this
measure, Eq. (38) yields the famous result
16 As Eq. (7.30) shows, this calculation is also valid for an oscillator with a low own frequency, ω0 << ω.
17 This definition parallels those accepted in the classical and quantum theories of particle scattering – see, e.g.,
respectively, CM Sec. 3.5 and QM Sec. 3.3.
$$\sigma = \frac{Z_0^2 q^4}{6\pi c^2 m^2} = \frac{\mu_0^2 q^4}{6\pi m^2}, \tag{8.40}$$
which is called the Thomson scattering formula,18 especially when applied to an electron. This relation
is most frequently represented in the form19
$$\sigma = \frac{8\pi}{3}r_c^2, \quad \text{with } r_c \equiv \frac{q^2}{4\pi\varepsilon_0}\,\frac{1}{mc^2}. \tag{8.41}$$
This constant r_c is called the classical radius of the particle (or sometimes the “Thomson scattering
length”); for the electron (q = –e, m = m_e) it is close to 2.82×10⁻¹⁵ m. Its possible interpretation is
evident from Eq. (41) for r_c: at that distance between two similar particles, the potential energy q²/4πε0r
of their electrostatic interaction is equal to the particle’s rest-mass energy mc².20
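The equivalence of the two forms (40) and (41) of the Thomson formula, and the quoted value of the classical radius, are easy to verify numerically. The sketch below uses rounded CODATA constants; the function names are merely illustrative.

```python
import math

E_CHARGE = 1.602176634e-19      # elementary charge, C
M_E = 9.1093837e-31             # electron mass, kg
EPS0 = 8.8541878128e-12         # electric constant, F/m
MU0 = 1.25663706212e-6          # magnetic constant, H/m
C = 2.99792458e8                # speed of light, m/s

def classical_radius(q, m):
    """r_c from Eq. (8.41)."""
    return q**2 / (4.0 * math.pi * EPS0 * m * C**2)

def thomson_cross_section(q, m):
    """Total cross-section in the form (8.41): (8 pi / 3) r_c^2."""
    return 8.0 * math.pi / 3.0 * classical_radius(q, m) ** 2

def thomson_cross_section_mu0(q, m):
    """The same cross-section in the last form of Eq. (8.40)."""
    return MU0**2 * q**4 / (6.0 * math.pi * m**2)

print(f"r_c   = {classical_radius(E_CHARGE, M_E):.4e} m")
print(f"sigma = {thomson_cross_section(E_CHARGE, M_E):.4e} m^2")
print(f"sigma = {thomson_cross_section_mu0(E_CHARGE, M_E):.4e} m^2 (mu0 form)")
```

For the electron, both forms give a cross-section of about 6.65×10⁻²⁹ m², i.e. well below 1 barn (10⁻²⁸ m²).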
Now we have to go back and establish the conditions at which the Born approximation, when the
field of the scattered wave is negligible, is indeed valid for a point-object scattering. Since the scattered
wave’s intensity described by Eq. (26) diverges at r → 0 as 1/r², according to the definition (39) of the
cross-section, it may become comparable to S_incident at r² ~ σ. However, Eq. (38) itself is only valid if r
>> λ, so the Born approximation does not lead to a contradiction only if
$$\sigma \ll \lambda^2. \tag{8.42}$$
For the Thomson scattering by an electron, this condition means λ >> r_c ~ 3×10⁻¹⁵ m and is fulfilled for
all frequencies up to very hard γ-rays with photon energies ~100 MeV.
Possibly the most notable feature of the result (40) is its independence of the wave frequency. As
it follows from its derivation, particularly from Eq. (37), this independence is intimately related to the
unbound character of charge motion. For bound charges, say for electrons in gas molecules, this result is
only valid if the wave frequency ω is much higher than the frequencies ω_j of most important quantum
transitions. In the opposite limit, ω << ω_j, the result is dramatically different. Indeed, in this limit we
may approximate the molecule’s dipole moment by its static value (3.48):
$$\mathbf{p} = \alpha\mathbf{E}. \tag{8.43}$$
In the Born approximation, and in the absence of the molecular field effects mentioned in Sec. 3.3, E in
this expression is just the incident wave’s field, and we can use Eq. (28) to calculate the power of the
wave scattered by a single molecule:
18 Named after Sir Joseph John (“JJ”) Thomson, the discoverer of the electron – and isotopes as well! He should
not be confused with his son, G. P. Thomson, who discovered (simultaneously with C. Davisson and L. Germer)
quantum-mechanical wave properties of the same electron.
19 In the Gaussian units, this formula looks like r_c = q²/mc² (giving, of course, the same numerical values: for the
electron, r_c ≈ 2.82×10⁻¹³ cm). This classical quantity should not be confused with the particle’s Compton
wavelength λ_C ≡ 2πħ/mc (for the electron, close to 2.43×10⁻¹² m), which naturally arises in quantum
electrodynamics – see a brief discussion in the next chapter, and also QM Sec. 1.1.
20 It is fascinating how smartly the relativistic expression mc² sneaked into the result (40)-(41), which was
obtained using the non-relativistic equation (36) of the particle motion. This was possible because the calculation
engaged electromagnetic waves, which propagate with the speed of light, and whose quanta (photons), as a result,
may be frequently treated as relativistic (moreover, ultra-relativistic) particles – see the next chapter.
$$\langle P\rangle = \frac{Z_0\omega^4\alpha^2}{12\pi c^2}\,|E_\omega|^2. \tag{8.44}$$
Now, using the last form of the definition (39) of the cross-section, we get a very simple result,
$$\sigma = \frac{Z_0^2\omega^4\alpha^2}{6\pi c^2}, \tag{8.45}$$
showing that in contrast to Eq. (40), at low frequencies σ changes as fast as ω⁴.
Now let us explore the effect of such Rayleigh scattering on wave propagation in a gas, with a
relatively low volumic density n. We may expect (and will prove in the next section) that due to the
randomness of molecule positions, the waves scattered by individual molecules may be treated as
incoherent ones, so the total scattering power may be calculated just as the sum of those scattered by
each molecule. We can use this fact to write the balance of the incident wave’s intensity in a small
volume dV of length (along the incident wave direction) dz, and area A across it. Since such a segment
includes ndV = nAdz molecules, and according to Eq. (39), each of them scatters power ⟨S⟩σ = ⟨P⟩σ/A, the
total scattered power is n⟨P⟩σdz; hence the incident power’s change is
$$d\langle P\rangle = -n\sigma\langle P\rangle\,dz. \tag{8.46}$$
Comparing this equation with the definition (7.213) of the wave attenuation constant, applied to the
scattering,21
$$d\langle P\rangle = -\alpha_{\rm scat}\langle P\rangle\,dz, \tag{8.47}$$
we see that this effect gives the following contribution to attenuation: α_scat = nσ. From here, using Eq.
(3.50) to write α = ε0(κ – 1)/n, where κ is the dielectric constant, and Eq. (45) for σ, we get
$$\alpha_{\rm scat} = \frac{k^4}{6\pi n}(\kappa-1)^2, \quad \text{where } k \equiv \frac{2\pi}{\lambda_0} = \frac{\omega}{c}. \tag{8.48}$$
This is the famous Rayleigh scattering formula, which in particular explains the colors of blue
sky and red sunsets. Indeed, through the visible light spectrum, ω changes almost two-fold; as a result,
the scattering of blue components of sunlight is an order of magnitude higher than that of its red
components. For the air near the Earth’s surface, κ – 1 ≈ 6×10⁻⁴, and n ~ 2.5×10²⁵ m⁻³ – see Sec. 3.3.
Plugging these numbers into Eq. (48), we see that the effective length l_scat ≡ 1/α_scat of scattering is ~30
km for the blue light and ~200 km for the red light.22 The effective thickness h of the Earth’s
atmosphere is ~10 km, so the Sun looks just a bit yellowish during most of the day. However, an
elementary geometry shows that at sunset, the light has to pass the length l ~ (R_E h)^{1/2} ≈ 300 km to reach
an Earth-surface observer; as a result, the blue components of the Sun’s light spectrum are almost
completely scattered out, and even the red components are weakened very substantially.
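These estimates may be reproduced by plugging the quoted air parameters into Eq. (48). In the sketch below, the representative "blue" and "red" wavelengths (0.45 and 0.70 μm), as well as the round values of the Earth's radius and the atmosphere's effective thickness, are my assumptions made for illustration.

```python
import math

def rayleigh_attenuation(wavelength, kappa_minus_1, n):
    """alpha_scat from Eq. (8.48), with k = 2*pi/wavelength."""
    k = 2.0 * math.pi / wavelength
    return k**4 * kappa_minus_1**2 / (6.0 * math.pi * n)

KAPPA_MINUS_1 = 6e-4     # air near the Earth's surface (value quoted in the text)
N_AIR = 2.5e25           # molecules per m^3 (value quoted in the text)

# Representative wavelengths are assumptions, not taken from the text
for name, lam in (("blue", 0.45e-6), ("red", 0.70e-6)):
    l_scat = 1.0 / rayleigh_attenuation(lam, KAPPA_MINUS_1, N_AIR)
    print(f"{name}: l_scat ~ {l_scat / 1e3:.0f} km")

# Sunset path length, l ~ sqrt(R_E * h), with round values R_E = 6400 km, h = 10 km
R_E, H = 6.4e6, 1.0e4
print(f"sunset path ~ {math.sqrt(R_E * H) / 1e3:.0f} km")
```

With these inputs, the scattering lengths come out at a few tens of kilometers for the blue light and about two hundred kilometers for the red light, while the sunset path is a few hundred kilometers, in agreement with the estimates above.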
21 I am sorry for using the same letter (α) for both the molecular polarizability and the wave attenuation, but both
notations are traditional. Hopefully, the subscript “scat” marking α in the latter meaning minimizes the possibility
of confusion.
22 These values are approximate because both n and (κ – 1) vary through the atmosphere’s thickness.
where r is the distance from the scatterer, at which the scattered wave is observed.23 Both the definition
and the notation may become clearer if we notice that according to Eq. (26), at large distances (r >> a),
the numerator of the right-hand side of Eq. (49), and hence the differential cross-section as a whole, do
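not depend on r, and that its integral over the total solid angle Ω = 4π coincides with the total
cross-section defined by Eq. (39):
$$\oint_{4\pi}\frac{d\sigma}{d\Omega}\,d\Omega = \frac{1}{\langle S_{\rm incident}\rangle}\oint_{4\pi}\langle S_r\rangle r^2\,d\Omega = \frac{1}{\langle S_{\rm incident}\rangle}\oint_{r={\rm const}}\langle S_r\rangle\,d^2r = \frac{\langle P\rangle}{\langle S_{\rm incident}\rangle}. \tag{8.50}$$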
For example, according to Eq. (26), the angular distribution of the radiation scattered by a single
dipole is rather broad; in particular, in the quasistatic case (43), within the Born approximation,
$$\frac{d\sigma}{d\Omega} = \left(\frac{\alpha k^2}{4\pi\varepsilon_0}\right)^2\sin^2\Theta. \tag{8.51}$$
If the wave is scattered by a small dielectric body, with a characteristic size a << λ (i.e., ka << 1), then
all its parts re-radiate the incident wave coherently. Hence, we can calculate it similarly, just replacing
the molecular dipole moment (43) with the total dipole moment of the object – see Eq. (3.45):
$$\mathbf{p} = \mathbf{P}V = (\kappa-1)\varepsilon_0\mathbf{E}V, \tag{8.52}$$
where V ~ a³ is the body’s volume. As a result, the differential cross-section may be obtained from Eq.
(51) with the replacement α_mol → (κ – 1)ε0V:
$$\frac{d\sigma}{d\Omega} = \left(\frac{k^2V}{4\pi}\right)^2(\kappa-1)^2\sin^2\Theta, \tag{8.53}$$
i.e. follows the same sin²Θ law.
The situation for extended objects, with at least one dimension of the order of (or larger than) the
wavelength, is different: here we have to take into account the phase shifts between the wave’s
re-radiation by various parts of the body. Let us analyze this issue first for an arbitrary collection of similar
point scatterers located at points r_j. If the wave vector of the incident plane wave is k0, the wave’s field
has the phase factor exp{ik0·r} – see Eq. (7.79). At the location r_j of the jth scattering center, this factor
equals exp{ik0·r_j}, defining the time dependence of the dipole vector p, and hence of the scattered wave.
23 Just as in the case of the total cross-section, this definition is also similar to that accepted at the particle
scattering – see, e.g., CM Sec. 3.5 and QM Sec. 3.3.
According to Eq. (17), the scattered wave with a wave vector k (with k = k0) acquires, on its way from
the source point r_j to the observation point r, an additional phase factor exp{ik·(r – r_j)}, so the scattered
wave field is proportional to
$$\exp\{i\mathbf{k}_0\cdot\mathbf{r}_j + i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}_j)\} = e^{i\mathbf{k}\cdot\mathbf{r}}\exp\{-i(\mathbf{k}-\mathbf{k}_0)\cdot\mathbf{r}_j\}. \tag{8.54}$$
Since the first factor in the last expression does not depend on r_j, to calculate the total scattering wave, it
is sufficient to sum up the last phase factors, exp{–iq·r_j}, where the vector
$$\mathbf{q} \equiv \mathbf{k} - \mathbf{k}_0 \tag{8.55}$$
has the physical sense of the wave vector change at scattering.24 It may look like the phase factor
depends on our choice of the reference frame. However, according to Eq. (7.42), the average intensity of
the scattered wave is proportional to EE*, i.e. to the following real scalar function of the vector q:
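$$F(\mathbf{q}) = \left(\sum_j \exp\{-i\mathbf{q}\cdot\mathbf{r}_j\}\right)\left(\sum_{j'}\exp\{-i\mathbf{q}\cdot\mathbf{r}_{j'}\}\right)^* = \sum_{j,j'}\exp\{i\mathbf{q}\cdot(\mathbf{r}_j-\mathbf{r}_{j'})\} \equiv |I(\mathbf{q})|^2, \tag{8.56}$$
where the complex function
$$I(\mathbf{q}) \equiv \sum_j \exp\{-i\mathbf{q}\cdot\mathbf{r}_j\} \tag{8.57}$$
is called the phase sum, and may be calculated in any reference frame without affecting the final result
given by Eq. (56).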
So, besides the sin²Θ factor, the differential cross-section (49) of scattering by an extended
object is also proportional to the scattering function (56). Its double-sum form makes it easy to notice
that for a system of many (N >> 1) similar but randomly located scatterers, only the terms with j = j’
accumulate at summation, so F(q), and hence dσ/dΩ, scale as N, rather than N² – thus justifying again
our treatment of the Rayleigh scattering problem in the previous section.
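The statement about the random scatterers may be illustrated by a simple Monte Carlo experiment. The sketch below is a 1D toy model under assumptions of my own: the scatterers are placed uniformly at random on a segment of length L, and q is chosen so that qL is a multiple of 2π, which makes the average of each single phase factor vanish exactly. The seed, the number of trials, and all names are arbitrary.

```python
import cmath
import math
import random

def scattering_function(positions, q):
    """F(q) = |sum_j exp(-i q x_j)|^2 for scatterers on a line (1D version of Eq. 8.56)."""
    phase_sum = sum(cmath.exp(-1j * q * x) for x in positions)
    return abs(phase_sum) ** 2

def mean_f_random(n_scatterers, q, length, trials, rng):
    """Average F(q) over independent random placements in [0, length]."""
    total = 0.0
    for _ in range(trials):
        xs = [rng.uniform(0.0, length) for _ in range(n_scatterers)]
        total += scattering_function(xs, q)
    return total / trials

rng = random.Random(1)            # fixed seed, for reproducibility only
L = 1.0
q = 2.0 * math.pi * 37 / L        # q*L is a multiple of 2*pi, so <exp(-iqx)> = 0
for n in (10, 40, 160):
    ratio = mean_f_random(n, q, L, trials=2000, rng=rng) / n
    print(f"N = {n:4d}:  <F>/N = {ratio:.3f}")   # should stay close to 1
```

For comparison, N scatterers placed periodically, with q equal to a multiple of 2π/a, add up in phase and give F = N².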
Now let us apply Eq. (56) to a simple problem of just two similar small scatterers, separated by a
fixed distance a:
$$F(\mathbf{q}) = \sum_{j,j'=1}^{2}\exp\{i\mathbf{q}\cdot(\mathbf{r}_j-\mathbf{r}_{j'})\} = 2 + \exp\{iq_aa\} + \exp\{-iq_aa\} = 2(1+\cos q_aa) = 4\cos^2\frac{q_aa}{2}, \tag{8.58}$$
where q_a ≡ q·a/a is the component of the vector q along the vector a connecting the scatterers. The
apparent simplicity of this result may be a bit misleading because the mutual plane of the vectors k and
k0 (and hence of the vector q) does not necessarily coincide with the mutual plane of the vectors k0 and
E, so the scattering angle θ between k and k0 is generally different from (π/2 – Θ) – see Fig. 5.
Moreover, the angle between the vectors q and a (within their common plane) is one more parameter
independent of both θ and Θ. As a result, the angular dependence of the scattered wave’s intensity (and
hence dσ/dΩ), which depends on all three angles, may be rather involved, but some of its details are
irrelevant for the basic physics of interference/diffraction.
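The chain of equalities in Eq. (58) may be checked by evaluating the double sum directly. In this minimal sketch, the two scatterers are placed (as an arbitrary choice of the reference frame) at x = 0 and x = a on the axis directed along the vector a, so only the component q_a matters.

```python
import cmath
import math

def double_sum(q_a, a):
    """Direct double sum over j, j' = 1, 2 for scatterers at x = 0 and x = a."""
    xs = (0.0, a)
    total = sum(cmath.exp(1j * q_a * (xj - xk)) for xj in xs for xk in xs)
    return total.real          # the imaginary parts cancel pairwise

def closed_form(q_a, a):
    """Last form of Eq. (8.58): 4 cos^2(q_a a / 2)."""
    return 4.0 * math.cos(q_a * a / 2.0) ** 2

a = 1.0
for q_a in (0.0, 1.0, math.pi, 2.0 * math.pi):
    print(f"q_a*a = {q_a * a:.3f}: {double_sum(q_a, a):.6f}  {closed_form(q_a, a):.6f}")
```

The two columns should coincide, with F reaching its maximum value 4 at q_a a equal to a multiple of 2π, and vanishing halfway between these maxima.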
24 In quantum mechanics, q has a very clear sense of the momentum transferred from the scattering object to the
scattered particle (for example, a photon), and this terminology is sometimes smuggled even into classical
electrodynamics texts.
Fig. 8.5. The angles important for the general scattering problem.
This is why let me consider in detail only the simple cases when the vectors k, k0, and a all
reside in the same plane, with k0 normal to a – see Fig. 6a.
Fig. 8.6. The simplest cases of (a) interference and (b) diffraction.
As Fig. 6a shows, this condition may be readily understood as that of the in-phase addition (the
constructive interference) of two coherent waves scattered from the two points, when the difference
between their paths toward the observer, a sinθ, is equal to an integer number of wavelengths. At each
such maximum, F = 4, due to the doubling of the wave amplitude and hence quadrupling its power.
If the distance between the point scatterers is large (ka >> 1), the first maxima (60) correspond to
small scattering angles, θ << 1. For this region, Eq. (59) is reduced to a simple periodic dependence of
function F on the angle θ. Moreover, within the range of small θ, the wave polarization factor sin²Θ is
virtually constant, so the angular dependence of the scattered wave’s intensity, and hence of the
differential cross-section, is also very simple:
$$\frac{d\sigma}{d\Omega} \propto F(\mathbf{q}) = 4\cos^2\frac{ka\theta}{2}. \tag{8.61}$$
25 In optics especially, such intensity maxima/minima patterns are called interference fringes.
This simple interference pattern is well known from Young’s two-slit experiment.26 (As will be
discussed in the next section, the theoretical description of the two-slit experiment is more complex than
that of the Born scattering, but is preferable experimentally because, at such scattering, the wave of
intensity (61) has to be observed on the backdrop of a much stronger incident wave that propagates in
almost the same direction, θ = 0.)
A very similar analysis of scattering from N > 2 similar, equidistant scatterers, located along the
same straight line shows that the positions (60) of the constructive interference maxima do not change
(because the derivation of this condition is still applicable to each pair of adjacent scatterers), but the
increase of N makes these peaks sharper and sharper. Leaving the quantitative analysis of this system for
the reader’s exercise, let me jump immediately to the limit N → ∞, in which we may ignore the
scatterers’ discreteness. The resulting pattern is similar to that at scattering by a continuous thin rod (see
Fig. 6b), so let us first spell out the Born scattering formula for an arbitrary extended, continuous,
uniform dielectric body. Transferring Eq. (56) from the sum to an integral, for the differential cross-
section we get
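$$\frac{d\sigma}{d\Omega} = \left(\frac{k^2}{4\pi}\right)^2(\kappa-1)^2F(\mathbf{q})\sin^2\Theta = \left(\frac{k^2}{4\pi}\right)^2(\kappa-1)^2|I(\mathbf{q})|^2\sin^2\Theta, \tag{8.62}$$
where I(q) now becomes the phase integral,27
$$I(\mathbf{q}) = \int_V \exp\{-i\mathbf{q}\cdot\mathbf{r}'\}\,d^3r', \tag{8.63}$$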
26 This experiment was described in 1803 by Thomas Young – one more universal genius of science, who also
introduced the Young modulus in the elasticity theory (see, e.g., CM Chapter 7), besides numerous other
achievements – including deciphering Egyptian hieroglyphs! It is fascinating that the first clear observation of
wave interference was made as early as 1666 by another genius, Sir Isaac Newton, in the form of so-called
Newton’s rings. Unbelievably, Newton failed to give the most natural explanation of his observations – perhaps
because he was vehemently opposed to the very idea of light as a wave, which was promoted in his times by
others, notably by Christian Huygens. Due to Newton’s enormous authority, only Young’s two-slit experiments
more than a century later have firmly established the wave picture of light – to be replaced by the dualistic
wave/photon picture formalized by quantum electrodynamics (see, e.g., QM Ch. 9), in one more century.
27 Since the observation point’s position r does not participate in this formula explicitly, the prime sign in r’ could
be dropped, but I keep it as a reminder that the integral is taken over points r’ of the scattering object.
$$\mathrm{sinc}\,\xi \equiv \frac{\sin\xi}{\xi}. \tag{8.66}$$
It vanishes at all points ξ_n = πn with integer n, besides such point with n = 0: sinc ξ_0 ≡ sinc 0 = 1.
Fig. 8.7. The sinc function (sinc ξ plotted versus ξ/π).
The function F(q) = V²sinc²ξ, given by Eq. (64) and plotted with the red line in Fig. 8, is called
the Fraunhofer diffraction pattern.28
Fig. 8.8. The Fraunhofer diffraction pattern (solid red line) and its envelope 1/ξ² (dashed red line). For comparison,
the blue line shows the basic interference pattern cos²ξ – cf. Eq. (61). (Axes: F(q)/V² versus ξ/π ≡ ka sinθ/2π.)
Note that it oscillates with the same period Δ(sinθ) = 2π/ka as the interference
pattern (61) from two point scatterers (shown with the blue line in Fig. 8). However, at the interference,
the scattered wave intensity vanishes at angles θ_n’ that satisfy the condition
$$\frac{ka\sin\theta_n'}{2} = \pi\left(n+\frac{1}{2}\right), \tag{8.67}$$
i.e. when the optical path difference a sinθ is equal to a semi-integer number of wavelengths λ = 2π/k,
and hence the two waves from the scatterers reach the observer in anti-phase – the so-called destructive
interference. On the other hand, for the diffraction on a continuous rod the minima occur at a different
set of scattering angles:
$$\frac{ka\sin\theta_n}{2} = \pi n, \tag{8.68}$$
28 It is named after Joseph von Fraunhofer (1787-1826) – who invented the spectroscope, developed the
diffraction grating (see below), and also discovered the dark Fraunhofer lines in the Sun’s spectrum.
i.e. exactly where the two-point interference pattern has its maxima – please have a look at Fig. 8 again.
The reason for this relation is that the wave diffraction on the rod may be considered as a simultaneous
interference of waves from all its elementary fragments, and exactly at the observation angles when the
rod edges give waves with phases shifted by 2πn, the interior points of the rod give waves with all
phases within this difference, with their algebraic sum equal to zero. As even more visible in Fig. 8, at
the diffraction, the intensity oscillations are limited by a rapidly decreasing envelope function 1/ξ² –
while at the two-point interference, the oscillations retain the same amplitude. The reason for this fast
decrease is that with each Fraunhofer diffraction period, a smaller and smaller fraction of the rod gives
an unbalanced contribution to the scattered wave.
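The relation between the two patterns may be examined numerically. The sketch below assumes a thin uniform rod of length a, centered at the origin (my choice of the reference frame), so that its normalized phase integral is (1/a)∫exp{–iq_a x}dx over the rod's length, with ξ ≡ q_a a/2; the integral is estimated with a plain midpoint rule and compared with sinc ξ and with the two-point pattern cos²ξ.

```python
import cmath
import math

def sinc(xi):
    """sinc(xi) = sin(xi)/xi, with sinc(0) = 1, as in Eq. (8.66)."""
    return 1.0 if xi == 0.0 else math.sin(xi) / xi

def rod_phase_integral(q_a, a, steps=20_000):
    """Midpoint-rule estimate of (1/a) * integral_{-a/2}^{a/2} exp(-i q_a x) dx."""
    h = a / steps
    total = sum(cmath.exp(-1j * q_a * (-a / 2 + (i + 0.5) * h)) for i in range(steps))
    return total * h / a

a = 1.0
for xi_over_pi in (0.0, 0.5, 1.0, 1.5, 2.0):
    xi = math.pi * xi_over_pi
    q_a = 2.0 * xi / a                           # since xi = q_a * a / 2
    rod = abs(rod_phase_integral(q_a, a)) ** 2   # F / V^2 for the rod
    two_point = math.cos(xi) ** 2                # F / 4 for two point scatterers
    print(f"xi/pi = {xi_over_pi:.1f}: rod {rod:.4f}, sinc^2 {sinc(xi) ** 2:.4f}, cos^2 {two_point:.4f}")
```

At integer nonzero values of ξ/π, the rod's pattern should vanish while the two-point pattern is at its maximum, and at half-integer values the roles are reversed, with the rod's maxima decreasing as 1/ξ².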
If the rod’s length is small (ka << 1, i.e. a << λ), then the sinc function’s argument is small at
all scattering angles θ, so I(q) ≈ V, and Eq. (62) is reduced to Eq. (53). In the opposite limit, a >> λ, the
first zeros of the function I(q) correspond to very small angles θ, for which sinΘ ≈ 1, so the differential
cross-section is just
$$\frac{d\sigma}{d\Omega} = \left(\frac{k^2V}{4\pi}\right)^2(\kappa-1)^2\,\mathrm{sinc}^2\frac{ka\theta}{2}, \tag{8.69}$$
i.e. Fig. 8 shows the scattering intensity as a function of the direction toward the observation point – if
this point is within the plane containing the rod.
Finally, let us discuss a problem of large importance for applications: calculate the positions of
the maxima of the interference pattern arising at the incidence of a plane wave on a very large 3D
periodic system of point scatterers. For that, first of all, let us quantify the notion of 3D periodicity. The
periodicity in one dimension is simple: the system we are considering (say, the positions of point
scatterers) should be invariant with respect to the linear translation by some period a, and hence by any
multiple sa of this period, where s is any integer. Anticipating the 3D generalization, we may require
any translation vector R, with respect to which the system is invariant, to be equal to sa, where the
primitive vector a is directed along the (so far, the only) axis of the 1D system.
Now we are ready for the common definition of the 3D periodicity – as the invariance of the
system with respect to the translation by any vector of the following set:
Bravais lattice:
$$\mathbf{R} = \sum_{l=1}^{3} s_l \mathbf{a}_l, \tag{8.70}$$
where s_l are three independent integers, and {a_l} is a set of three linearly independent primitive vectors.
The set of geometric points described by Eq. (70) is called the Bravais lattice (first analyzed in detail,
circa 1850, by Auguste Bravais). Perhaps the most nontrivial feature of this relation is that the vectors a_l
need not be orthogonal to each other. (That requirement would severely restrict the set of
possible lattices and make it unsuitable for the description, for example, of many solid-state crystals.)
For the scattering problem we are considering, we will assume that the position r_j of each point scatterer
coincides with one of the points R of some Bravais lattice, with a given set of primitive vectors a_l, so in
the basic Eq. (57), the index j is coding the set of three integers {s1, s2, s3}.
Now let us consider a similarly defined Bravais lattice, but in the reciprocal (wave-number)
space, numbered by three independent integers {t1, t2, t3}:
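Reciprocal lattice:
$$\mathbf{Q} = \sum_{m=1}^{3} t_m \mathbf{b}_m, \quad \text{with } \mathbf{b}_m \equiv 2\pi \frac{\mathbf{a}_{m''} \times \mathbf{a}_{m'}}{\mathbf{a}_m \cdot (\mathbf{a}_{m''} \times \mathbf{a}_{m'})}, \tag{8.71}$$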
where in the last expression, the indices m, m’, and m” are all different. This is the so-called reciprocal
lattice, which plays an important role in all physics of periodic structures, in particular in the quantum
energy-band theory.29 To reveal its most important property, and thus justify the above introduction of
the primitive vectors bm, let us calculate the following scalar product:
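$$\mathbf{R}\cdot\mathbf{Q} = \sum_{l,m=1}^{3} s_l t_m\, \mathbf{a}_l \cdot \mathbf{b}_m = 2\pi \sum_{l,m=1}^{3} s_l t_m \frac{\mathbf{a}_l \cdot (\mathbf{a}_{m''} \times \mathbf{a}_{m'})}{\mathbf{a}_m \cdot (\mathbf{a}_{m''} \times \mathbf{a}_{m'})}. \tag{8.72}$$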
Applying to the numerator of the last fraction the operand rotation rule of vector algebra,30 we see that
it is equal to zero if l ≠ m, while for l = m the whole fraction is evidently equal to 1. Thus the double
sum (72) is reduced to a single sum:
$$\mathbf{R}\cdot\mathbf{Q} = 2\pi \sum_{l=1}^{3} s_l t_l = 2\pi \sum_{l=1}^{3} n_l, \tag{8.73}$$
where each of the products n_l ≡ s_l t_l is an integer, and hence their sum,
$$n \equiv \sum_{l=1}^{3} n_l = s_1 t_1 + s_2 t_2 + s_3 t_3, \tag{8.74}$$
is an integer as well, so the main property of the direct/reciprocal lattice couple is very simple:
$$\mathbf{R}\cdot\mathbf{Q} = 2\pi n.$$
Indeed, each of these sets has the same value of the integer n, defined by Eq. (74), as the original one:
$$n' \equiv s_1' t_1 + s_2' t_2 + s_3' t_3 = (s_1 + S_1 t_3)\, t_1 + (s_2 + S_2 t_3)\, t_2 + (s_3 - S_1 t_1 - S_2 t_2)\, t_3 = n. \tag{8.77}$$
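The lattice relations above are easy to check numerically. The following is only a minimal sketch, not part of the original notes: the helper names, the fcc example vectors, and the integer sets are illustrative assumptions, and the shift of the integers s_l uses one sign choice that leaves n unchanged.

```python
import math

def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two 3-component tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

def reciprocal_vectors(a):
    """Primitive vectors b_m of the reciprocal lattice, as in Eq. (71)."""
    b = []
    for m in range(3):
        m1, m2 = (m + 1) % 3, (m + 2) % 3      # the two indices different from m
        num = cross(a[m1], a[m2])
        den = dot(a[m], num)                    # non-zero for linearly independent a_l
        b.append(tuple(2 * math.pi * c / den for c in num))
    return b

def lattice_point(vectors, integers):
    """Sum of integer multiples of primitive vectors, as in Eqs. (70) and (71)."""
    return tuple(sum(n * v[i] for n, v in zip(integers, vectors)) for i in range(3))

# Hypothetical example: non-orthogonal primitive vectors of an fcc lattice (cube side 1)
a = [(0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]
b = reciprocal_vectors(a)

s, t = (2, -1, 3), (1, 4, -2)                   # arbitrary integer sets {s_l}, {t_m}
R, Q = lattice_point(a, s), lattice_point(b, t)
n = dot(R, Q) / (2 * math.pi)                   # should equal s1*t1 + s2*t2 + s3*t3

# Shifted integers (assumed sign convention) with arbitrary integers S1, S2
S1, S2 = 5, -7
s_new = (s[0] + S1 * t[2], s[1] + S2 * t[2], s[2] - S1 * t[0] - S2 * t[1])
n_new = dot(lattice_point(a, s_new), Q) / (2 * math.pi)
print(n, n_new)
```

The same check works for any set of linearly independent primitive vectors, since the code never assumes their orthogonality.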
Since, according to Eq. (75), the vector of the distance between any pair of the corresponding points of
the direct Bravais lattice (70),
29 See, e.g., QM Sec. 3.4, where several particular Bravais lattices R, and their reciprocals Q, are considered.
30 See, e.g., MA Eq. (7.6).
31 For more reading on this important topic, I can recommend, for example, the classical monograph by B.
Cullity, Elements of X-Ray Diffraction, 2nd ed., Addison-Wesley, 1978. (Note that its title uses the alternative
name of the field, once again illustrating how blurry the boundary between the interference and diffraction is.)
32 Named after Sir William Bragg and his son, Sir William Lawrence Bragg, who were the first to demonstrate (in
1912) the X-ray diffraction by atoms in crystals. The Braggs’ experiments have made the existence of atoms
(before that, a somewhat hypothetical notion ignored by many physicists) indisputable.
Finally, note that the von Laue and Bragg rules, as well as the similar condition (60) for the 1D
system of scatterers, are valid not only in the Born approximation but also follow from any adequate
theory of scattering, because the phase sum (57) does not depend on the magnitude of the wave
propagating from each elementary scatterer, provided that they are all equal.
[Fig. 8.10. Deriving the Huygens principle. (Sketch: a wave source, an opaque screen with an orifice, a point r′ at the orifice, and the observation point r at some observation angle.)]
However, another approximation, called the Huygens (or “Huygens-Fresnel”) principle,34 is very
instrumental in the description of such situations. In this approach, the wave beyond the screen is
represented as a linear superposition of spherical waves of the type (17), as if they were emitted by
every point of the incident wave’s front that has arrived at the orifice. This approximation is valid if the
following strong conditions are satisfied:
$$\lambda \ll a \ll r, \tag{8.83}$$
where r is the distance of the observation point from the orifice. In addition, as we have seen in the last
section, at small λ/a the diffraction phenomena are confined to angles θ ~ 1/ka ~ λ/a << 1. For
observation at such small angles, the mathematical expression of the Huygens principle, for the complex
amplitude f_ω(r) of a monochromatic wave f(r, t) = Re[f_ω exp{–iωt}], is given by the following simple formula
$$f_\omega(\mathbf{r}) = C \int_{\text{orifice}} f_\omega(\mathbf{r}')\, \frac{e^{ikR}}{R}\, d^2r'. \tag{8.84}$$
33 Another complaint against the Born approximation is that it does not satisfy the so-called optical (or “forward
scattering”) theorem relating to scattering with k = k0. This relation is especially important for the quantum-
mechanical description of particle scattering, and in this series, will be discussed in its QM part (Sec. 3.3).
34 Named after Christian Huygens (1629-1695) who had conjectured the wave nature of light (which remained
controversial for more than a century, until T. Young’s experiments), and Augustin-Jean Fresnel (1788-1827)
who developed a quantitative theory of diffraction, and in particular gave a mathematical formulation of the
Huygens principle. (Note that Eq. (91), sufficient for the purposes of this course, is not its most general form.)
Here f_ω is any transverse component of any of the wave’s fields (either E or H),35 R is the distance
between point r′ at the orifice and the observation point r (i.e. the magnitude of vector R ≡ r – r′), and
C is a complex constant.
Before describing the proof of Eq. (84), let me carry out its sanity check – which also will give
us the constant C. Let us see what the Huygens principle gives for the case when the field under the
integral is a plane wave with the complex amplitude f_ω(z), propagating along axis z, with an unlimited
x-y front (i.e. when there is no opaque screen at all), so in Eq. (84) we should take the whole [x, y] plane,
say with z′ = 0, as the integration area – see Fig. 11.
[Fig. 8.11. Applying the Huygens principle to a plane incident wave. (Sketch: a “source” point r′ in the plane z′ = 0, the observation point r = {0, 0, z}, and the vector R ≡ r – r′.)]
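Then, for the observation point with coordinates x = 0, y = 0, and z > 0, Eq. (84) yields
$$f_\omega(z) = C f_\omega(0) \int dx' \int dy'\, \frac{\exp\{ik(x'^2 + y'^2 + z^2)^{1/2}\}}{(x'^2 + y'^2 + z^2)^{1/2}}. \tag{8.85}$$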
Before specifying the integration limits, let us consider the range |x′|, |y′| << z. In this range, the
square root participating in Eq. (85) twice, may be approximated as
$$(x'^2 + y'^2 + z^2)^{1/2} \equiv z\left(1 + \frac{x'^2 + y'^2}{z^2}\right)^{1/2} \approx z\left(1 + \frac{x'^2 + y'^2}{2z^2}\right) \equiv z + \frac{x'^2 + y'^2}{2z}. \tag{8.86}$$
At z >> λ, the denominator of Eq. (85) is a much slower function of x′ and y′ than the exponent in the
numerator, and in the former case, it is sufficient (as we will check a posteriori) to keep just the main,
first term of expansion (86). With that, Eq. (85) becomes
$$f_\omega(z) = C f_\omega(0)\, \frac{e^{ikz}}{z} \int dx' \int dy' \exp\left\{\frac{ik(x'^2 + y'^2)}{2z}\right\} \equiv C f_\omega(0)\, \frac{e^{ikz}}{z}\, I_x I_y, \tag{8.87}$$
where I_x and I_y are two similar integrals; for example,
$$I_x \equiv \int \exp\left\{\frac{ikx'^2}{2z}\right\} dx' = \left(\frac{2z}{k}\right)^{1/2} \int \exp\{i\xi^2\}\, d\xi, \tag{8.88}$$
where ξ ≡ (k/2z)^{1/2} x′. These are the so-called Fresnel integrals. I will discuss them in more detail in the
next section (see, in particular, Fig. 13), and for now, only one property36 of these integrals is important
35 The fact that the Huygens principle is valid for any field component should not be too surprising. In the
limit a >> λ, the real boundary conditions at the orifice edges are not important; it is only important for the
screen that limits the orifice, to be opaque. Because of this, the Huygens principle (84) is a part of the so-called
scalar theory of diffraction. (I will not have time to discuss the vector theory of these effects, which is more
accurate at smaller a – see, e.g., Chapter 11 of the monograph by M. Born and E. Wolf, cited at the end of Sec. 7.1.)
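for us: if taken in symmetric limits [–ξ0, +ξ0], both of them (i.e. the real and imaginary parts of the
integral over ξ) rapidly converge to the same value, (π/2)^{1/2}, as soon as ξ0 becomes much larger than 1.
This means that even if we do not impose any exact limits on the integration area in Eq. (85), this
integral converges to the value
$$f_\omega(z) = C f_\omega(0)\, \frac{e^{ikz}}{z}\, \frac{2z}{k} \left[\left(\frac{\pi}{2}\right)^{1/2} + i\left(\frac{\pi}{2}\right)^{1/2}\right]^2 = C\, \frac{2\pi i}{k}\, f_\omega(0)\, e^{ikz}, \tag{8.89}$$
due to contributions from the central area with a linear size corresponding to Δξ ~ 1, i.e. to
$$\Delta x \sim \Delta y \sim \left(\frac{z}{k}\right)^{1/2} \sim (\lambda z)^{1/2}, \tag{8.90}$$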
so the net contribution from the front points r′ well beyond the range (90) is negligible.37 (Within our
assumptions (83), which in particular require λ to be much less than z, the diffraction angle Δx/z ~ Δy/z
~ (λ/z)^{1/2}, corresponding to the important area of the front, is small.) According to Eq. (89), to sustain
the unperturbed plane wave propagation, f_ω(z) = f_ω(0)exp{ikz}, the constant C has to be taken equal to k/2πi.
Thus, the Huygens principle’s prediction (84), in its final form, reads
Huygens principle:
$$f_\omega(\mathbf{r}) = \frac{k}{2\pi i} \int_{\text{orifice}} f_\omega(\mathbf{r}')\, \frac{e^{ikR}}{R}\, d^2r', \tag{8.91}$$
and describes, in particular, the straight propagation of the plane wave (in a uniform medium).
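The sanity check leading to the value of C may be repeated numerically. The sketch below is an illustration only (the function name, the cutoff ξ0 = 30, and the 500-nm wavelength are assumptions): it integrates exp{iξ²} over a symmetric interval by Simpson's rule, compares the result with (1 + i)(π/2)^{1/2}, and verifies that C = k/2πi sustains the unperturbed plane wave.

```python
import cmath
import math

def fresnel_symmetric(xi0, steps_per_unit=400):
    """Simpson's-rule estimate of the integral of exp(i*xi**2) over [-xi0, +xi0]."""
    n = 2 * int(xi0 * steps_per_unit)           # even number of sub-intervals
    h = 2.0 * xi0 / n
    total = 0j
    for j in range(n + 1):
        xi = -xi0 + j * h
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += weight * cmath.exp(1j * xi * xi)
    return total * h / 3.0

integral = fresnel_symmetric(30.0)
limit = (1 + 1j) * math.sqrt(math.pi / 2)       # real and imaginary parts both tend to (pi/2)**0.5

# With I_x = I_y = (2z/k)**0.5 * integral, the ratio f(z) / [f(0) exp(ikz)] is C * (2/k) * integral**2
k = 2 * math.pi / 500e-9                        # assumed wave number (500-nm light)
C = k / (2j * math.pi)
ratio = C * (2 / k) * integral ** 2             # should be close to 1
print(abs(integral - limit), abs(ratio - 1))
```

The residual deviation decreases only as 1/ξ0, reflecting the slow decay of the oscillating tails of the integrand.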
Let me pause to emphasize how nontrivial this result is. It would be a natural corollary of Eqs.
(25) (and the linear superposition principle) if all points of the orifice were filled with point scatterers
that re-emit all the incident waves as spherical waves. However, as it follows from the above example,
the Huygens principle also works if there is nothing in the orifice but the free space!
This is why it is worth discussing a proof of this principle,38 based on Green’s theorem (2.207). Let us
apply it to the function f = f_ω, where f_ω is the complex amplitude of a scalar component of one of the
wave’s fields, which satisfies the Helmholtz equation (7.204),
$$(\nabla^2 + k^2)\, f_\omega(\mathbf{r}) = 0, \tag{8.92}$$
and the function g = G_ω that is the temporal Fourier image of the corresponding Green’s function. The latter
function may be defined, as usual, as the solution of the same equation with the added delta-functional
right-hand side with an arbitrary coefficient, for example,
$$(\nabla^2 + k^2)\, G_\omega(\mathbf{r}, \mathbf{r}') = -4\pi\, \delta(\mathbf{r} - \mathbf{r}'). \tag{8.93}$$
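Using Eqs. (92) and (93) to express the Laplace operators of the functions f_ω and G_ω, we may rewrite
Eq. (2.207) as
$$\int_V \left[ f_\omega \left( -k^2 G_\omega - 4\pi\, \delta(\mathbf{r} - \mathbf{r}') \right) + G_\omega k^2 f_\omega \right] d^3r = \oint_S \left( f_\omega \frac{\partial G_\omega}{\partial n} - G_\omega \frac{\partial f_\omega}{\partial n} \right) d^2r, \tag{8.94}$$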
where n is the outward normal to the surface S limiting the integration volume V. Two terms on the left-
hand side of this relation cancel, so after swapping the arguments r and r’, we get
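$$-4\pi f_\omega(\mathbf{r}) = \oint_S \left[ f_\omega(\mathbf{r}')\, \frac{\partial G_\omega(\mathbf{r}', \mathbf{r})}{\partial n'} - G_\omega(\mathbf{r}', \mathbf{r})\, \frac{\partial f_\omega(\mathbf{r}')}{\partial n'} \right] d^2r'. \tag{8.95}$$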
This relation is only correct if the selected volume V includes the point r (otherwise we would
not get its left-hand side from the integration of the delta function), but does not include the genuine
source of the wave – otherwise, Eq. (92) would have a non-zero right-hand side. Now let r be the field
observation point, V be all the source-free half-space (for example, the space right of the screen in Fig.
10), so S is the surface of the screen, including the orifice. Then the right-hand side of Eq. (95) describes
the field (at the observation point r) induced by the wave passing through the orifice points r’. Since no
waves are emitted by the opaque parts of the screen, we can limit the integration by the orifice area.39
Assuming also that the opaque parts of the screen do not re-emit the waves “radiated” by the orifice, we
can take the solution of Eq. (93) to be the retarded potential for the free space:40
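$$G_\omega(\mathbf{r}, \mathbf{r}') = \frac{e^{ikR}}{R}. \tag{8.96}$$
Plugging this expression into Eq. (95), we get
Kirchhoff integral:
$$-4\pi f_\omega(\mathbf{r}) = \int_{\text{orifice}} \left[ f_\omega(\mathbf{r}')\, \frac{\partial}{\partial n'} \left( \frac{e^{ikR}}{R} \right) - \frac{e^{ikR}}{R}\, \frac{\partial f_\omega(\mathbf{r}')}{\partial n'} \right] d^2r'. \tag{8.97}$$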
This is the so-called Kirchhoff (or “Fresnel-Kirchhoff”) integral. (Again, with the integration
extended over all boundaries of the volume V, this would be an exact mathematical result.) Now, let us
make two additional approximations. The first of them stems from Eq. (83): at ka >> 1, the wave’s
spatial dependence in the orifice area may be represented as
$$f_\omega(\mathbf{r}') = (\text{a slow function of } \mathbf{r}') \times \exp\{i\mathbf{k}_0 \cdot \mathbf{r}'\}, \tag{8.98}$$
where “slow” means a function that changes on the scale of a rather than λ. If, also, kR >> 1, then the
differentiation in Eq. (97) may be, in both instances, limited to the rapidly changing exponents, giving
$$-4\pi f_\omega(\mathbf{r}) = -i \int_{\text{orifice}} (\mathbf{k} + \mathbf{k}_0) \cdot \mathbf{n}'\, \frac{e^{ikR}}{R}\, f_\omega(\mathbf{r}')\, d^2r'. \tag{8.99}$$
Second, if all observation angles are small, we may take k·n′ ≈ k0·n′ ≈ –k. With that, Eq. (99) is reduced
to the Huygens principle in its form (91).
39 Actually, this is a nontrivial point of the proof. Indeed, it may be shown that the exact solution of Eq. (94)
is identically equal to zero if f(r′) and ∂f(r′)/∂n′ vanish together at any part of the boundary, of a non-zero
area. A more careful analysis of this issue (it is the task of the formal vector theory of diffraction, which I will
not have time to pursue) confirms the validity of the described intuition-based approach at a >> λ.
40 It follows, e.g., from Eq. (16) with a monochromatic source q(t) = q_ω exp{–iωt}, with the amplitude q_ω = 4πε
that fits the right-hand side of Eq. (93).
It is clear that the principle immediately gives a very simple description of the interference of
waves passing through two small holes in the screen. Indeed, if the holes’ sizes are negligible in
comparison with the distance a between them (though they are still much larger than the wavelength!), Eq.
(91) yields
$$f_\omega(\mathbf{r}) = c_1 e^{ikR_1} + c_2 e^{ikR_2}, \quad \text{with } c_{1,2} \equiv \frac{k f_{1,2} A_{1,2}}{2\pi i R_{1,2}}, \tag{8.100}$$
where R1,2 are the distances between the holes and the observation point, and A1,2 are the hole areas. For
the wave intensity, Eq. (100) gives
The first two terms in the last expression clearly represent the intensities of the partial waves passed
through each hole, while the last one is the result of their interference. The interference pattern’s
contrast ratio
$$\frac{S_{\max}}{S_{\min}} = \left( \frac{|c_1| + |c_2|}{|c_1| - |c_2|} \right)^2, \tag{8.102}$$
is the largest (infinite) when both waves have equal amplitudes.
The analysis of the interference pattern is simple if the line connecting the holes is perpendicular
to wave vector k ≈ k0 – see Fig. 6a. Selecting the coordinate axes as shown in that figure, and using for
the distances R1,2 the same expansion as in Eq. (86), for the interference term in Eq. (101) we get
$$\cos[k(R_1 - R_2) + \varphi] \approx \cos\left( \frac{kxa}{z} + \varphi \right). \tag{8.103}$$
This means that the term does not depend on y, i.e. the interference pattern in the plane of constant z is a
set of straight, parallel strips, perpendicular to the vector a, with the period given by Eq. (60), i.e. by the
Bragg law.41 This result is strictly valid only at y² << z²; it is straightforward to use the next term in the
Taylor expansion (86) to show that farther on from the interference plane y = 0, the strips start to
diverge.
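A short numerical sketch may illustrate the contrast ratio, the strip period, and the accuracy of the small-angle expansion. It is an illustration only: the function names, the hole positions x′ = ±a/2, and the numbers (500-nm light, holes 1 mm apart, observation plane 1 m away) are assumptions, not taken from the text.

```python
import math

def contrast_ratio(c1, c2):
    """S_max/S_min = [(|c1| + |c2|) / (|c1| - |c2|)]**2; infinite for equal amplitudes."""
    difference = abs(abs(c1) - abs(c2))
    if difference == 0:
        return math.inf
    return ((abs(c1) + abs(c2)) / difference) ** 2

def path_difference(x, y, z, a):
    """Exact R1 - R2 for two holes placed (by assumption) at x' = -a/2 and x' = +a/2."""
    r1 = math.sqrt((x + a / 2) ** 2 + y ** 2 + z ** 2)
    r2 = math.sqrt((x - a / 2) ** 2 + y ** 2 + z ** 2)
    return r1 - r2

def fringe_period(wavelength, z, a):
    """Period of the strips following from cos(k*x*a/z): 2*pi*z/(k*a) = wavelength*z/a."""
    return wavelength * z / a

# Assumed illustrative numbers
wavelength, a, z = 500e-9, 1e-3, 1.0
x = 5e-3
paraxial = x * a / z                               # small-angle result for R1 - R2
on_axis = path_difference(x, 0.0, z, a)            # y = 0: the expansion is accurate
off_axis = path_difference(x, 0.5, z, a)           # y comparable with z: strips diverge
print(fringe_period(wavelength, z, a), on_axis / paraxial, off_axis / paraxial)
```

The ratio noticeably below 1 at y comparable with z is the numerical signature of the strip divergence mentioned above.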
41 The phase shift φ vanishes at the normal incidence of a plane wave on the holes. Note, however, that the
spatial shift of the interference pattern following from Eq. (103), Δx = –(z/ka)φ, is extremely convenient for the
experimental measurement of the phase shift between two waves, especially if it is induced by some factor (such
as insertion of a transparent object into one of the interferometer’s arms) that may be turned on/off at will.
[Fig. 8.12. (Sketch: an incident wave, a screen with a slit of width a between x′ = –a/2 and x′ = +a/2, the diffracted wave, and the observation plane at distance z.)]
Let us apply Eq. (91) to our current problem (Fig. 12), for the sake of simplicity assuming the
normal wave incidence, and taking z’ = 0 at the screen plane:
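$$f_\omega(x, z) = f_0\, \frac{k}{2\pi i} \int_{-a/2}^{+a/2} dx' \int_{-\infty}^{+\infty} dy'\, \frac{\exp\{ik[(x - x')^2 + y'^2 + z^2]^{1/2}\}}{[(x - x')^2 + y'^2 + z^2]^{1/2}}, \tag{8.104}$$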
where f0 ≡ f_ω(x′, 0) = const is the incident wave’s amplitude. This is the same integral as in Eq. (85),
except for the finite limits for the integration variable x′, and may be simplified similarly, using the
small-angle condition (x – x′)² + y′² << z²:
$$f_\omega(x, z) \approx f_0\, \frac{k}{2\pi i}\, \frac{e^{ikz}}{z} \int_{-a/2}^{+a/2} dx' \int_{-\infty}^{+\infty} dy' \exp\left\{ \frac{ik[(x - x')^2 + y'^2]}{2z} \right\} \equiv f_0\, \frac{k}{2\pi i}\, \frac{e^{ikz}}{z}\, I_x I_y. \tag{8.105}$$
(i) Fraunhofer diffraction takes place when z/a >> a/λ – the relation which may be rewritten
either as a << (λz)^{1/2}, or as ka² << z. In this limit, the ratio kx′²/z is negligibly small for all values of x′
under the integral, and we can approximate it as
$$I_x = \int_{-a/2}^{+a/2} \exp\left\{ \frac{ik(x^2 - 2xx' + x'^2)}{2z} \right\} dx' \approx \int_{-a/2}^{+a/2} \exp\left\{ \frac{ik(x^2 - 2xx')}{2z} \right\} dx' \equiv \exp\left\{ \frac{ikx^2}{2z} \right\} \int_{-a/2}^{+a/2} \exp\left\{ -\frac{ikxx'}{z} \right\} dx' = \frac{2z}{kx} \exp\left\{ \frac{ikx^2}{2z} \right\} \sin\frac{kxa}{2z}, \tag{8.108}$$
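$$f_\omega(x, z) \approx f_0\, \frac{k}{2\pi i}\, \frac{e^{ikz}}{z}\, \frac{2z}{kx} \left( \frac{2\pi i z}{k} \right)^{1/2} \exp\left\{ \frac{ikx^2}{2z} \right\} \sin\frac{kxa}{2z}, \tag{8.109}$$
and hence the relative wave intensity is
Fraunhofer diffraction pattern:
$$\frac{S(x, z)}{S_0} = \left| \frac{f_\omega(x, z)}{f_0} \right|^2 \approx \frac{2z}{\pi k x^2}\, \sin^2\frac{kxa}{2z} \equiv \frac{ka^2}{2\pi z}\, \mathrm{sinc}^2\frac{ka\theta}{2}, \tag{8.110}$$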
where S0 is the intensity of the incident wave, and θ ≡ x/z << 1 is the observation angle. Comparing this
expression with Eq. (69), we see that this diffraction pattern is exactly the same as that for a similar
(uniform, 1D) object in the Born approximation – see the red line in Fig. 8. Note again that the angular
width δθ of the Fraunhofer pattern is of the order of 1/ka, so its linear width δx = zδθ is of the order of
z/ka ~ zλ/a.42 Hence the condition of the Fraunhofer approximation’s validity may be also represented as
a << δx.
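The Fraunhofer pattern is simple enough for a direct numerical check. The sketch below is only an illustration under stated assumptions: the function names and the numbers (500-nm light, a 0.1-mm slit, observation plane 10 m away, so that ka² << z) are hypothetical, and |I_y|² = 2πz/k is taken for the unrestricted integral over y′.

```python
import math

def sinc(u):
    """sinc(u) = sin(u)/u, with sinc(0) = 1."""
    return 1.0 if u == 0 else math.sin(u) / u

def fraunhofer_sinc_form(x, z, a, k):
    """(k a^2 / 2 pi z) sinc^2(k a theta / 2), with theta = x/z."""
    return k * a * a / (2 * math.pi * z) * sinc(k * a * x / (2 * z)) ** 2

def fraunhofer_from_amplitude(x, z, a, k):
    """|f/f0|^2 = (k / 2 pi z)^2 |I_x|^2 |I_y|^2, valid for x != 0."""
    ix_squared = (2 * z / (k * x) * math.sin(k * x * a / (2 * z))) ** 2
    iy_squared = 2 * math.pi * z / k              # |I_y|^2 for the unrestricted y' integral
    return (k / (2 * math.pi * z)) ** 2 * ix_squared * iy_squared

# Assumed illustrative numbers
wavelength, a, z = 500e-9, 1e-4, 10.0
k = 2 * math.pi / wavelength
x_first_zero = wavelength * z / a                  # first zero of the pattern
print(k * a * a / z,
      fraunhofer_sinc_form(0.0, z, a, k),
      fraunhofer_sinc_form(x_first_zero, z, a, k))
```

For these numbers the central intensity is much lower than S0, in line with the spreading of the incident power over the width δx >> a.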
(ii) Fresnel diffraction. In the opposite limit of a relatively wide slit, with a >> δx = zδθ ~ z/ka ~
zλ/a, i.e. ka² >> z, the diffraction patterns at two edges of the slit are well separated. Hence, near each
edge (for example, near x′ = –a/2) we may simplify Eq. (107) as
$$I_x(x) \approx \int_{-a/2}^{+\infty} \exp\left\{ \frac{ik(x - x')^2}{2z} \right\} dx' \equiv \left( \frac{2z}{k} \right)^{1/2} \int_{-(k/2z)^{1/2}(x + a/2)}^{+\infty} \exp\{i\zeta^2\}\, d\zeta, \tag{8.111}$$
and express it via the special functions called the Fresnel integrals:43
Fresnel integrals:
$$C(\xi) \equiv \left( \frac{2}{\pi} \right)^{1/2} \int_0^{\xi} \cos(\zeta^2)\, d\zeta, \qquad S(\xi) \equiv \left( \frac{2}{\pi} \right)^{1/2} \int_0^{\xi} \sin(\zeta^2)\, d\zeta, \tag{8.112}$$
whose plots are shown in Fig. 13a. As was mentioned above, at large values of their argument (ξ), both
functions tend to ½.
[Fig. 8.13. (a) The Fresnel integrals C(ξ) and S(ξ), and (b) their parametric representation, S(ξ) versus C(ξ).]
42 Note also that since in this limit ka² << z, Eq. (110) shows that even the maximum value S(0, z) of the diffracted
wave’s intensity is much lower than that (S0) of the incident wave. This is natural because the incident power S0a
per unit length of the slit is now distributed over a much larger width δx >> a, so S(0, z) ~ S0 (a/δx) << S0.
43 Slightly different definitions of these functions, affecting the constant factors, may also be met in literature.
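Plugging this expression into Eqs. (105) and (111), for the diffracted wave intensity, in the
Fresnel limit (i.e. at |x + a/2| << a), we get
Fresnel diffraction pattern:
$$\frac{S(x, z)}{S_0} = \frac{1}{2} \left\{ \left[ C\left( \left( \frac{k}{2z} \right)^{1/2} \left( x + \frac{a}{2} \right) \right) + \frac{1}{2} \right]^2 + \left[ S\left( \left( \frac{k}{2z} \right)^{1/2} \left( x + \frac{a}{2} \right) \right) + \frac{1}{2} \right]^2 \right\}. \tag{8.113}$$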
A plot of this function (Fig. 14) shows that the diffraction pattern is very peculiar: while in the “dark”
region x < –a/2 the wave intensity fades monotonically, the transition to the “bright” region within the
gap (x > –a/2) is accompanied by intensity oscillations, just as at the Fraunhofer diffraction – cf. Fig. 8.
[Fig. 8.14. The Fresnel diffraction pattern: S/S0 as a function of (k/2z)^{1/2}(x + a/2).]
This behavior, which is described by the following asymptotes,
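$$\frac{S}{S_0} \to \begin{cases} 1 + \dfrac{1}{\pi^{1/2}\xi} \sin\left( \xi^2 - \dfrac{\pi}{4} \right), & \text{for } \xi \equiv \left( \dfrac{k}{2z} \right)^{1/2} \left( x + \dfrac{a}{2} \right) \to +\infty, \\[2mm] \dfrac{1}{4\pi\xi^2}, & \text{for } \xi \to -\infty, \end{cases} \tag{8.114}$$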
is essentially an artifact of “observing” just the wave intensity (i.e. its real amplitude) rather than its
phase as well. Indeed, as may be seen even more clearly from the parametric representation of the
Fresnel integrals, shown in Fig. 13b, these functions oscillate similarly at large positive and negative
values of their argument. (This famous pattern is called either the Euler spiral or the Cornu spiral.)
Physically, this means that the wave diffraction at the slit edge leads to similar oscillations of its phase
at x < –a/2 and x > –a/2; however, in the latter region (i.e. inside the slit) the diffracted wave overlaps
the incident wave passing through the slit directly, and their interference reveals the phase oscillations,
making them visible in the observed intensity as well.
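The edge diffraction pattern and its asymptotes may be reproduced with a few lines of code. This is only a sketch under stated assumptions: the function names are hypothetical, the Fresnel integrals are taken with the (2/π)^{1/2} normalization (so that both tend to ½ at large arguments), and they are evaluated by plain Simpson's-rule integration rather than by a library routine.

```python
import math

def fresnel_cs(xi, steps_per_unit=400):
    """C(xi) and S(xi), normalized to tend to 1/2 at large xi, via Simpson's rule."""
    if xi == 0:
        return 0.0, 0.0
    n = 2 * max(1, int(abs(xi) * steps_per_unit))   # even number of sub-intervals
    h = xi / n                                       # negative for xi < 0: odd functions
    c = s = 0.0
    for j in range(n + 1):
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        zeta_squared = (j * h) ** 2
        c += weight * math.cos(zeta_squared)
        s += weight * math.sin(zeta_squared)
    norm = math.sqrt(2 / math.pi) * h / 3
    return c * norm, s * norm

def fresnel_edge_intensity(xi):
    """S/S0 = (1/2){[C + 1/2]^2 + [S + 1/2]^2} as a function of xi = (k/2z)^(1/2) (x + a/2)."""
    c, s = fresnel_cs(xi)
    return 0.5 * ((c + 0.5) ** 2 + (s + 0.5) ** 2)

at_edge = fresnel_edge_intensity(0.0)                # exactly at the geometric edge
bright = fresnel_edge_intensity(5.0)                 # inside the gap
dark = fresnel_edge_intensity(-5.0)                  # in the geometric shadow
bright_asymptote = 1 + math.sin(25 - math.pi / 4) / (math.sqrt(math.pi) * 5)
dark_asymptote = 1 / (4 * math.pi * 25)
print(at_edge, bright, bright_asymptote, dark, dark_asymptote)
```

Right at the geometric edge the intensity is a quarter of S0, as it should be for half of the unperturbed wave amplitude.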
Note that according to Eq. (113), the linear scale δx of the Fresnel diffraction pattern is of the
order of (2z/k)^{1/2}, i.e. complies with the estimate given by Eq. (90). If the slit is gradually narrowed so
that its width a becomes comparable to δx,44 the Fresnel diffraction patterns from both edges start to
“collide” (interfere). The resulting wave, fully described by Eq. (107), is just a sum of two contributions
of the type (111) from both edges of the slit. The resulting interference pattern is somewhat complicated,
and only when a becomes substantially less than δx, it is reduced to the simple Fraunhofer pattern (110).
44 Note that this condition may be also rewritten as a ~ δx, i.e. z/a ~ a/λ.
Of course, this crossover from the Fresnel to Fraunhofer diffraction may be also observed, at fixed
wavelength λ and slit width a, by increasing z, i.e. by measuring the diffraction pattern farther and
farther away from the slit.
Note also that the Fraunhofer limit is always valid if the diffraction is measured as a function of
the diffraction angle θ alone. This may be done, for example, by collecting the diffracted wave with a
“positive” (converging) lens and observing the diffraction pattern in its focal plane.
45 In application to optical waves, this notion may be traced back to at least the work by Hero (a.k.a. Heron) of
Alexandria (circa 170 AD). Curiously, he correctly described light reflection from one or several plane mirrors,
starting from the completely wrong idea of light propagation from the eye of the observer to the observed object.
46 Admittedly, even this list leaves aside several spectacular effects, including such a beauty as conical refraction
in biaxial crystals – see, e.g., Chapter 15 of the textbook by M. Born and E. Wolf, cited in the end of Sec. 7.1.
47 My top recommendation for that purpose would be Chapters 3-6 and Sec. 8.6 in Born and Wolf. A simpler
alternative is Chapter 10 in G. Fowles, Introduction to Modern Optics, 2nd ed., Dover, 1989. Note also that the
venerable field of optical microscopy is currently revitalized by holographic/tomographic methods, using the
scattered wave’s phase information. These methods are especially productive in biology and medicine – see, e.g.,
M. Brezinski, Optical Coherence Tomography, Academic Press, 2006, and G. Popescu, Quantitative Phase
Imaging of Cells and Tissues, McGraw-Hill (2011).
$$I_x I_y = \int_{\text{orifice}} \exp\left\{ \frac{ik[(x - x')^2 + (y - y')^2]}{2z} \right\} dx'dy' \approx \exp\left\{ \frac{ik(x^2 + y^2)}{2z} \right\} \int_{\text{orifice}} \exp\left\{ -i\frac{kxx'}{z} - i\frac{kyy'}{z} \right\} dx'dy', \tag{8.115}$$
so besides the inconsequential total phase factor, Eq. (105) is reduced to
General Fraunhofer diffraction pattern:
$$f_\omega(\boldsymbol{\rho}) \propto f_0 \int_{\text{orifice}} \exp\{-i\boldsymbol{\kappa} \cdot \boldsymbol{\rho}'\}\, d^2\rho' \equiv f_0 \int_{\text{all screen}} T(\boldsymbol{\rho}') \exp\{-i\boldsymbol{\kappa} \cdot \boldsymbol{\rho}'\}\, d^2\rho'. \tag{8.116}$$
Here the 2D vector κ (not to be confused with wave vector k, which is virtually perpendicular to κ!) is
defined as
$$\boldsymbol{\kappa} \equiv k\, \frac{\boldsymbol{\rho}}{z} \approx \mathbf{q} \equiv \mathbf{k} - \mathbf{k}_0, \tag{8.117}$$
and ρ = {x, y} and ρ′ = {x′, y′} are 2D radius vectors in the, respectively, observation and orifice planes
– both nearly normal to the vectors k and k0.48 In the last form of Eq. (116), the function T(ρ′) describes
the screen’s transparency at point ρ′, and the integral is over the whole screen plane z′ = 0. (Though the
two forms of Eq. (116) are strictly equivalent only if T(ρ′) is equal to either 1 or 0, its last form may be
readily obtained from Eq. (91) with f_ω(r′) = T(ρ′)f0 for any transparency profile, provided that T(ρ′) is
any function that changes substantially only at distances much larger than λ ≡ 2π/k.)
From the mathematical point of view, the last form of Eq. (116) is just the 2D spatial Fourier
transform of the function T(ρ′), with the variable κ defined by the observation point’s position:
ρ ≡ (z/k)κ ≡ (λz/2π)κ. This interpretation is useful because of the experience we all have with the Fourier
transform, if only in the context of its time/frequency applications. For example, if the orifice is a single
small hole, T(ρ′) may be approximated by a delta function, so Eq. (116) yields |f_ω(ρ)| ≈ const. This
result corresponds (at least for the small diffraction angles θ ≡ ρ/z, for which the Huygens
approximation is valid) to a spherical wave spreading from the point-like orifice. Next, for two small
holes, Eq. (116) immediately gives the interference pattern (103). Let me now use Eq. (116) to analyze
other simplest (and most important) 1D transparency profiles, leaving a few 2D cases for the reader’s
exercise.
(i) A single slit of width a (Fig. 12) may be described by transparency
48 Note that for a thin uniform plate of the same shape as the orifice we are discussing now, the Born phase
integral (63) with q << k gives a result functionally similar to Eq. (116).
naturally returning us to Eqs. (64) and (110), and hence to the red lines in Fig. 8 for the wave intensity.
(Please note again that Eq. (116) describes only the Fraunhofer, but not the Fresnel diffraction!)
(ii) Two infinitely narrow, similar, parallel slits with a larger distance a between them (i.e. the
simplest model of Young’s two-slit experiment) may be described by taking
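$$T(\boldsymbol{\rho}') \propto \delta\left( x' - \frac{a}{2} \right) + \delta\left( x' + \frac{a}{2} \right), \tag{8.120}$$
so Eq. (116) yields the generic 1D interference pattern,
$$f_\omega(\boldsymbol{\rho}) \propto f_0 \left[ \exp\left\{ -\frac{i\kappa_x a}{2} \right\} + \exp\left\{ \frac{i\kappa_x a}{2} \right\} \right] \propto \cos\frac{\kappa_x a}{2} = \cos\frac{kxa}{2z}, \tag{8.121}$$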
whose intensity is shown with the blue line in Fig. 8.
(iii) In a more realistic model of Young’s experiment, each slit has a width (say, w) that is much
larger than the light wavelength λ, but still much smaller than the slit spacing a. This situation may be
described by the following transparency function
[Fig. 8.15. Young’s double-slit interference pattern for a finite-width slit: the normalized intensity |f_ω(ρ)/f_ω(0)|² as a function of x/(2πz/kw), for a = 10w.]
(iv) A structure very useful for experimental and engineering practice is a set of many parallel,
similar slits, called the diffraction grating.49 If the slit’s width is much smaller than period d of the
grating, its transparency function may be approximated as
$$T(\boldsymbol{\rho}') \propto \sum_{n=-\infty}^{+\infty} \delta(x' - nd), \tag{8.124}$$
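This sum vanishes for all values of κ_x d that are not multiples of 2π, so the result describes sharp
intensity peaks at the following diffraction angles:
$$\theta_m \equiv \left( \frac{x}{z} \right)_m = \left( \frac{\kappa_x}{k} \right)_m = \frac{2\pi}{kd}\, m = \frac{\lambda}{d}\, m. \tag{8.126}$$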
Taking into account that this result is only valid for small angles |θ_m| << 1, it may be interpreted
exactly as Eq. (59) – see Fig. 6a. However, in contrast with the interference (121) from two slits, the
destructive interference from many slits kills the net wave as soon as the angle is even slightly different
from each value (60). This is very convenient for spectroscopic purposes because the diffraction lines
produced by multi-frequency waves do not overlap even if the frequencies of their adjacent components
are very close.
Two unavoidable features of practical diffraction gratings make their properties different from
this simple, ideal picture. First, the finite number N of slits, which may be described by limiting the sum
(125) to the interval n = [–N/2, +N/2], results in a non-zero spread, δθ/θ ~ 1/N, of each diffraction peak,
and hence in the reduction of the grating’s spectral resolution. (Unintentional variations of the inter-slit
distance d have a similar effect, so before the advent of high-resolution photolithography, special high-
precision mechanical tools had been used for grating fabrication.)
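The effect of the finite number of slits is easy to see numerically. The sketch below is only an illustration: the function names are hypothetical, the slits are numbered n = 0, …, N – 1 rather than symmetrically (which changes only the overall phase of the sum), and the intensity is normalized to 1 at the peaks.

```python
import cmath
import math

def grating_intensity(kappa_d, n_slits):
    """|sum over n of exp(-i n kappa_x d)|^2 / N^2 for N infinitely narrow slits."""
    total = sum(cmath.exp(-1j * n * kappa_d) for n in range(n_slits))
    return abs(total) ** 2 / n_slits ** 2

def grating_closed_form(kappa_d, n_slits):
    """The same sum in closed form: sin^2(N phi / 2) / [N^2 sin^2(phi / 2)], phi = kappa_x d."""
    return (math.sin(n_slits * kappa_d / 2) / (n_slits * math.sin(kappa_d / 2))) ** 2

N = 50
peak = grating_intensity(2 * math.pi, N)                     # m = 1 diffraction peak
first_zero = grating_intensity(2 * math.pi * (1 + 1 / N), N) # relative detuning of 1/N
generic = grating_intensity(1.0, N)
print(peak, first_zero, generic)
```

The intensity drops from its peak value to zero at a relative detuning of 1/N from the m = 1 peak, in agreement with the estimate of the peak spread.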
Second, a finite slit width w leads to the diffraction peak pattern modulation by the sinc²(kwθ/2)
envelope, similar to the pattern shown in Fig. 15. Actually, for spectroscopic purposes, such modulation
is sometimes a plus, because only one diffraction peak (say, with m = 1) is practically used, and if the
frequency spectrum of the analyzed wave is very broad (covers more than one octave), the higher peaks
produce undesirable hindrance. For this reason, w is frequently selected to be exactly equal to
d/2, thus suppressing every other diffraction maximum. Moreover, sometimes semi-transparent films are
used to make the transparency function T(ρ′) continuous and close to a sinusoidal one:
$$T(\boldsymbol{\rho}') = T_0 + T_1 \cos\frac{2\pi x'}{d} \equiv T_0 + \frac{T_1}{2} \left[ \exp\left\{ i\frac{2\pi x'}{d} \right\} + \exp\left\{ -i\frac{2\pi x'}{d} \right\} \right]. \tag{8.127}$$
Plugging the last expression into Eq. (116) and integrating, we see that the output wave consists of just 3
components: the direct-passing wave (proportional to T0) and two diffracted waves (proportional to T1)
propagating in the directions of the two lowest Bragg angles, θ_{±1} = ±λ/d.50
49 The rudimentary diffraction grating effect, produced by the parallel fibers of a bird’s feather, was discovered as
early as 1673 by James Gregory (who also invented the “Gregorian” telescope – one of the basic designs for
reflecting telescopes).
The same Eq. (116) may be also used to obtain one more general (and rather curious) result,
called the Babinet principle.51 Consider two experiments with the diffraction of similar plane waves on
two “complementary” screens – such that together they would cover the whole plane, without a hole or
an overlap. (Think, for example, about an opaque disk of radius R and a large opaque screen with a
round orifice of the same radius.) Then, according to the Babinet principle, the diffracted wave patterns
produced by these two screens in all directions with θ ≠ 0 are identical.
The proof of this principle is straightforward: since the transparency functions produced by the
screens are complementary:
$$T(\boldsymbol{\rho}') \equiv T_1(\boldsymbol{\rho}') + T_2(\boldsymbol{\rho}') = 1, \tag{8.128}$$
and the diffracted wave is (in the Fraunhofer approximation only!) a Fourier transform of T(ρ′), which is
a linear operation, we get
$$f_1(\boldsymbol{\rho}) + f_2(\boldsymbol{\rho}) = f_0(\boldsymbol{\rho}), \tag{8.129}$$
where f0 is the wave “scattered” by the composite screen with T0(ρ′) ≡ 1, i.e. the unperturbed initial
wave propagating in the initial direction (θ = 0). In all other directions, f1 = –f2, i.e. the diffracted waves
are indeed similar besides the difference in sign – which is equivalent to a phase shift by π. However, it
is important to remember that the Babinet principle notwithstanding, in real experiments, with screens at
finite distances, the diffracted waves may interfere with the unperturbed plane wave f0(ρ), leading to
different diffraction patterns in cases 1 and 2 – see, e.g., Fig. 14 and its discussion.
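The linearity argument behind the Babinet principle may be illustrated with a discrete 1D model. This is only a sketch under stated assumptions: the screen is sampled at M = 64 points, the Fraunhofer integral is replaced by a direct discrete Fourier transform (which implies a periodically repeated screen), and the names are hypothetical.

```python
import cmath

def dft(samples):
    """Direct discrete Fourier transform, a sampled stand-in for the Fraunhofer integral."""
    m = len(samples)
    return [sum(t * cmath.exp(-2j * cmath.pi * q * j / m) for j, t in enumerate(samples))
            for q in range(m)]

M = 64
slit = [1.0 if 28 <= j < 36 else 0.0 for j in range(M)]   # T2: opening in an opaque screen
strip = [1.0 - t for t in slit]                            # T1: complementary opaque strip
f_slit, f_strip = dft(slit), dft(strip)

# In all directions but the forward one (q = 0), the two waves differ only by sign
worst_mismatch = max(abs(f_slit[q] + f_strip[q]) for q in range(1, M))
print(worst_mismatch, abs(f_slit[0] + f_strip[0]))
```

The forward component q = 0 of the sum is the only one that survives; it plays the role of the unperturbed wave f0.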
50 Similar tricks are used in the so-called phased-array antennas, broadly used in radar systems and
radioastronomy, in which electronically controlled mutual phase shifts of microwave signals feeding many similar
component antennas are used to steer the direction of the resulting narrow beam. For more on this important
technology, see, e.g. T. Milligan, Modern Antenna Design, 2nd ed., Wiley (2005).
51 Named after Jacques Babinet (1784-1874) who made several important contributions to optics.
so Eq. (17b) yields A = Ad + A’, where Ad is the electric dipole contribution as given by Eq. (23), and
A’ is the new term of the next order in the small parameter r’ << r:
A' r, t jr' , t' r' n d 3 r' . (8.132)
4 rv t'
Just as it was done in Sec. 2, let us evaluate this term for a system of non-relativistic particles
with electric charges qk and radius vectors rk(t):
d
A' r, t q k rk rk n . (8.133)
4 rv dt k t t'
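Using the “bac minus cab” identity of the vector algebra again,52 the vector operand of Eq. (133) may be
rewritten as
$$\dot{\mathbf{r}}_k(\mathbf{r}_k\cdot\mathbf{n}) \equiv \frac{1}{2}\dot{\mathbf{r}}_k(\mathbf{r}_k\cdot\mathbf{n}) + \frac{1}{2}\dot{\mathbf{r}}_k(\mathbf{n}\cdot\mathbf{r}_k) = \frac{1}{2}(\mathbf{r}_k\times\dot{\mathbf{r}}_k)\times\mathbf{n} + \frac{1}{2}\mathbf{r}_k(\mathbf{n}\cdot\dot{\mathbf{r}}_k) + \frac{1}{2}\dot{\mathbf{r}}_k(\mathbf{n}\cdot\mathbf{r}_k)$$
$$= \frac{1}{2}(\mathbf{r}_k\times\dot{\mathbf{r}}_k)\times\mathbf{n} + \frac{1}{2}\frac{d}{dt}\left[\mathbf{r}_k(\mathbf{n}\cdot\mathbf{r}_k)\right], \tag{8.134}$$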
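so the right-hand side of Eq. (133) may be represented as a sum of two terms, A' = Am + Aq, where
$$\mathbf{A}_m(\mathbf{r},t) \equiv \frac{\mu}{4\pi r v}\,\dot{\mathbf{m}}(t')\times\mathbf{n} \equiv \frac{\mu}{4\pi r v}\,\dot{\mathbf{m}}\!\left(t-\frac{r}{v}\right)\times\mathbf{n}, \quad\text{with } \mathbf{m}(t) \equiv \frac{1}{2}\sum_k \mathbf{r}_k(t)\times q_k\dot{\mathbf{r}}_k(t); \tag{8.135}$$
$$\mathbf{A}_q(\mathbf{r},t) \equiv \frac{\mu}{8\pi r v}\left[\frac{d^2}{dt^2}\sum_k q_k\,\mathbf{r}_k\,(\mathbf{n}\cdot\mathbf{r}_k)\right]_{t=t'}. \tag{8.136}$$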
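Comparing the second of Eqs. (135) with Eq. (5.91), we see that m is just the total magnetic
moment of the source. On the other hand, the first of Eqs. (135) is absolutely similar in structure to Eq.
(23), with p replaced with (m×n)/v, so for the corresponding component of the magnetic field it gives
(in the same approximation r >> λ) a result similar to Eq. (24):
Magnetic dipole radiation: field
$$\mathbf{B}_m(\mathbf{r},t) = -\frac{\mu}{4\pi r v^2}\,\mathbf{n}\times\left[\ddot{\mathbf{m}}\!\left(t-\frac{r}{v}\right)\times\mathbf{n}\right] \equiv \frac{\mu}{4\pi r v^2}\,\mathbf{n}\times\left[\mathbf{n}\times\ddot{\mathbf{m}}\!\left(t-\frac{r}{v}\right)\right]. \tag{8.137}$$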
According to this expression, just as for the electric dipole radiation, the vector B is perpendicular to the
vector n, and its magnitude is also proportional to sinθ, where θ is now the angle between the direction
toward the observation point and the second time derivative of the vector m – rather than p:
$$B_m = \frac{\mu}{4\pi r v^2}\,\ddot{m}\!\left(t-\frac{r}{v}\right)\sin\theta. \tag{8.138}$$
As a result, the intensity of this magnetic dipole radiation has a similar angular distribution:
Magnetic dipole radiation: power
$$S_r = ZH^2 = \frac{Z}{(4\pi v^2 r)^2}\left[\ddot{m}\!\left(t-\frac{r}{v}\right)\right]^2\sin^2\theta \tag{8.139}$$
- cf. Eq. (26), besides the (generally) different meaning of the angle θ.
Note, however, that this radiation is usually much weaker than its electric-dipole counterpart.
For example, for a non-relativistic particle with electric charge q, moving on a trajectory of linear size
~a, the electric dipole moment is of the order of qa, while its magnetic moment scales as qa²ω, where ω
is the motion frequency. As a result, the ratio of the magnetic and electric dipole radiation intensities is
of the order of (aω/v)², i.e. the squared ratio of the particle’s speed to the speed of the emitted waves –
that has to be much smaller than 1 for our non-relativistic calculation to be valid.
The angular distribution of the electric quadrupole radiation described by Eq. (136) is more
involved. To show this, let us add to Aq a vector parallel to n (i.e. directed along the wave’s
propagation), getting
$$\mathbf{A}_q(\mathbf{r},t) \to \frac{\mu}{24\pi r v}\,\ddot{\boldsymbol{\mathcal{Q}}}\!\left(t-\frac{r}{v}\right), \quad\text{where } \boldsymbol{\mathcal{Q}} \equiv \sum_k q_k\left[3\mathbf{r}_k(\mathbf{n}\cdot\mathbf{r}_k) - \mathbf{n}\,r_k^2\right], \tag{8.140}$$
since this addition does not contribute to the transverse components of the electric and magnetic fields,
i.e. to the radiated wave. According to the above definition of the vector 𝒬, its Cartesian components
may be represented as
$$\mathcal{Q}_j = \sum_{j'=1}^{3} Q_{jj'}\,n_{j'}, \tag{8.141}$$
where Qjj' are the elements of the electric quadrupole tensor of the system – see the last of Eqs. (3.4):53
$$Q_{jj'} = \sum_k q_k\left(3r_jr_{j'} - r^2\delta_{jj'}\right)_k. \tag{8.142}$$
Now taking the curl of the first of Eqs. (140) at r >> λ, we get
Electric quadrupole radiation: field
$$\mathbf{B}_q(\mathbf{r},t) = -\frac{\mu}{24\pi r v^2}\,\mathbf{n}\times\dddot{\boldsymbol{\mathcal{Q}}}\!\left(t-\frac{r}{v}\right). \tag{8.143}$$
This expression is similar to Eqs. (24) and (137), but according to Eqs. (140) and (142), components of
the vector 𝒬 do depend on the direction of the vector n, leading to a different angular dependence of Sr.
As the simplest example, let us consider the system of two equal point electric charges moving
symmetrically, at equal distances d(t) << λ from a stationary center – see Fig. 16.
Fig. 8.16. The simplest system emitting electric quadrupole radiation.
Due to the symmetry of the system, its dipole moments p and m (and hence its electric and
magnetic dipole radiation) vanish, but the quadrupole tensor (142) still has non-zero elements. With the
coordinate choice shown in Fig. 16, these elements are diagonal:
$$Q_{xx} = Q_{yy} = -2qd^2, \qquad Q_{zz} = 4qd^2. \tag{8.144}$$
53 Let me hope that the reader has already acquired some experience in the calculation of this tensor’s elements –
e.g., for the simple systems specified in Problems 3.2-3.4.
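With the x-axis selected within the common plane of the z-axis and the direction n toward the
observation point (Fig. 16), so nx = sinθ, ny = 0, and nz = cosθ, Eq. (141) yields
$$\mathcal{Q}_x = -2qd^2\sin\theta, \qquad \mathcal{Q}_y = 0, \qquad \mathcal{Q}_z = 4qd^2\cos\theta, \tag{8.145}$$
and the vector product in Eq. (143) has only one non-vanishing Cartesian component:
$$\left(\mathbf{n}\times\dddot{\boldsymbol{\mathcal{Q}}}\right)_y = n_z\dddot{\mathcal{Q}}_x - n_x\dddot{\mathcal{Q}}_z = -6q\sin\theta\cos\theta\,\frac{d^3}{dt^3}\left[d^2(t)\right]. \tag{8.146}$$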
As a result, the quadrupole radiation intensity, Sr ∝ Bq², is proportional to sin²θcos²θ, i.e. vanishes not
only along the symmetry axis of the system (as the electric-dipole and the magnetic-dipole radiations
would), but also in all directions perpendicular to this axis, reaching its maxima at θ = π/4 (and, by
symmetry, at θ = 3π/4).
For more complex systems, the angular distribution of the electric quadrupole radiation may be
different, but it may be proved that its total (instant) power always obeys the following simple formula:
Electric quadrupole radiation: power
$$P_q = \frac{Z}{720\pi v^4}\sum_{j,j'=1}^{3}\left(\dddot{Q}_{jj'}\right)^2. \tag{8.147}$$
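For the two-charge system of Fig. 16, Eq. (147) may be cross-checked against a direct integration of the angular distribution ∝ sin²θcos²θ over all directions. The sketch below is an illustration only; it assumes the diagonal tensor elements Qxx = Qyy = –2qd² and Qzz = 4qd² that follow from the definition of the quadrupole tensor for charges at z = ±d, uses units with Z = v = 1 by default, and denotes by D the third time derivative of d²(t). The function names are hypothetical.

```python
import math

def angular_integral(steps=20000):
    """Integral of sin^2(theta) * cos^2(theta) over the full solid angle (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        th = (i + 0.5) * h
        total += math.sin(th) ** 3 * math.cos(th) ** 2 * h  # includes the sin(theta) of d(Omega)
    return 2.0 * math.pi * total

def power_from_field(q, D, Z=1.0, v=1.0):
    """Integrate S_r * r^2, with S_r = Z*H^2 and H = q*D*sin(theta)*cos(theta)/(4*pi*r*v^2)."""
    return Z * (q * D) ** 2 / (4.0 * math.pi * v ** 2) ** 2 * angular_integral()

def power_from_tensor(q, D, Z=1.0, v=1.0):
    """Eq. (147) with the third derivatives of Q_xx = Q_yy = -2*q*d^2 and Q_zz = 4*q*d^2."""
    dddQ = [-2.0 * q * D, -2.0 * q * D, 4.0 * q * D]
    return Z / (720.0 * math.pi * v ** 4) * sum(x * x for x in dddQ)

print(power_from_field(1.0, 1.0), power_from_tensor(1.0, 1.0))
```

Both routes give the same value, Z(qD)²/30πv⁴, which also confirms the numerical coefficient in Eq. (147) for this particular system.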
Let me finish this section by giving, also without proof, one more fact important for some
applications: due to their different spatial structure, the magnetic-dipole and electric-quadrupole
radiation fields do not interfere, i.e. the total power of radiation (neglecting the electric-dipole and
higher multipole terms) may be found as the sum of these components, calculated independently. On the
contrary, the electric-dipole and magnetic-dipole radiations of the same system typically interfere
coherently, so their radiation fields (rather than powers) should be summed up.
8.2. Simplify the Lorentz reciprocity theorem (6.121) for space-localized field sources. Then find
out what it says about the fields of two compact, well-separated sources of electric-dipole radiation.
8.3. In the electric-dipole approximation, calculate the angular distribution and the total power of
electromagnetic radiation by the hydrogen atom within the following classical model: an electron
rotates, at a constant distance R, about a much heavier proton. Use this result to calculate the law of a
gradual reduction of R in time. Finally, evaluate the classical lifetime of the atom by borrowing the
initial value of R from quantum mechanics: R(0) = rB ≈ 0.53×10⁻¹⁰ m.
8.4. A non-relativistic particle of mass m, with electric charge q, is placed into a time-independent
uniform magnetic field B. Derive the law of decrease of the particle’s kinetic energy due to
its electromagnetic radiation at the cyclotron frequency ωc = qB/m. Evaluate the rate of such radiation
cooling of electrons in a magnetic field of 1 T, and estimate the energy interval in which this result is
quantitatively correct.
Hint: The cyclotron motion will be discussed in detail (for arbitrary particle velocities) in Sec.
9.6 below, but I hope that the reader already knows that in the non-relativistic case (v << c), the above
formula for ωc may be readily obtained by combining the 2nd Newton law mv²/R = qvB for the
particle’s circular rotation under the effect of the magnetic component of the Lorentz force (5.10), and
the geometric relation v = Rωc. (Here v is the particle’s velocity in the plane normal to the vector B.)
8.5. A particle with mass m, electric charge q, and an initial kinetic energy T << mc² collides
head-on with a much more massive particle of charge Zq, in free space. Calculate the total energy of
electromagnetic radiation during this collision, assuming it to be much lower than T.
8.6. Solve the dipole antenna radiation problem discussed in Sec. 2 (see Fig. 3) for the optimal
value l = λ/2 of its length, assuming that the current distribution in each of its arms is sinusoidal: I(z, t) =
I0cos(πz/l) cosωt.54
8.7. A plane wave is scattered by a localized object in free space. Relate the differential cross-
section of the wave’s scattering to the average force it exerts on the object. Use this general relation to
calculate the force exerted by a plane monochromatic wave on a free non-relativistic particle and
compare the result with those obtained in Problems 7.4 and 7.5.
8.8. Use the Lorentz oscillator model of a bound charge, given by Eq. (7.30), to explore the
transition between the two scattering limits discussed in Sec. 3 and, in particular, the resonant scattering
taking place at ω ≈ ω0. In the last context, discuss the contribution of scattering to the oscillator’s
damping.
8.9.* A sphere of radius R, made of a material with a uniform permanent electric polarization P0
and a constant mass density ρ, is free to rotate about its center. Calculate its average total cross-section
for scattering of a linearly polarized plane electromagnetic wave of frequency ω << c/R, incident from
free space, in the weak-wave limit, assuming that the initial orientation of the polarization vector P0 is
random.
8.11. Use the Born approximation to calculate the differential cross-section of a plane wave’s
scattering by a uniform dielectric sphere of an arbitrary radius R. In the limits kR << 1 and 1 << kR
(where k is the wave number), analyze the angular dependence of the differential cross-section and
calculate the total cross-section of scattering.
54 As was emphasized in Sec. 2, this is a reasonable guess rather than a controllable approximation. The exact
(rather involved!) theory shows that this assumption gives errors ~5%, depending on the wire’s diameter.
8.12. A sphere of radius R is made of a uniform dielectric material, with an arbitrary dielectric
constant. Calculate its total cross-section of scattering a linearly-polarized low-frequency (k << 1/R)
wave and compare the result with the solution of the previous problem.
8.13. Use the Born approximation to calculate the differential cross-section of a plane wave’s
scattering on a right circular cylinder of length l and radius R, for an arbitrary angle of incidence.
8.14. Formulate the quantitative condition of the Born approximation’s validity for a uniform
dielectric scatterer, with all linear dimensions of the order of the same scale a.
8.15. If a scatterer absorbs some part of the incident wave’s power, it may be characterized by an
absorption cross-section σa defined similarly to Eq. (39) for the scattering cross-section:
$$\sigma_a \equiv \frac{\overline{P_a}}{\left|E_\omega\right|^2/2Z_0},$$
where the numerator is the time-averaged absorbed power. Use two different approaches to calculate σa
of a very small sphere of radius R << k⁻¹, δs, made of a nonmagnetic material with an Ohmic
conductivity σ and the high-frequency permittivity εopt = ε0. Can σa of such a sphere be larger than its
geometric cross-section πR²?
8.16. Use the Huygens principle to calculate the wave’s intensity on the symmetry plane of the
slit diffraction experiment (i.e. at x = 0 in Fig. 12), for arbitrary ratio z/ka².
8.19. Use the Huygens principle to analyze the Fraunhofer diffraction of a plane wave normally
incident on a square-shaped hole, of size a×a, in an opaque screen. Sketch the diffraction pattern you
would observe at a sufficiently large distance, and quantify the expression “sufficiently large” for this
case.
8.20. Use the Huygens principle to analyze the propagation of a monochromatic Gaussian beam
described by Eq. (7.181), with the initial characteristic width a0 >> λ, in a uniform isotropic medium.
Use the result for a semi-quantitative derivation of the so-called Abbe limit for the spatial resolution of
an optical system: wmin = λ/(2sinθ), where θ is the half-angle of the wave cone propagating from the
object and captured by the system.
8.21. Within the Fraunhofer approximation, analyze the pattern produced by a diffraction grating
with the 1D-periodic transparency profile shown in the figure on the right, for the normal incidence of a
monochromatic plane wave.
[Figure: transparency T versus x, with the labels w and d.]
8.22. N equal point charges are attached, at equal intervals, to a circle rotating with a constant
angular velocity about its center – see the figure on the right. For what values of N does the system emit:
(i) the electric dipole radiation?
(ii) the magnetic dipole radiation?
(iii) the electric quadrupole radiation?
[Figure: charges q on a circle of radius R, with the angular spacing 2π/N.]
8.24. Calculate the angular distribution and the total power radiated by a small planar loop
antenna of radius R, fed with ac current with frequency ω and amplitude I0, into free space.
8.25. The orientation of a magnetic dipole, with a constant magnitude m of its moment, is
rotating about a certain axis with an angular velocity ω, with the angle θ between them staying constant.
Calculate the angular distribution and the average power of its radiation into the free space.
8.26. Solve Problem 12 (also in the low-frequency limit kR << 1), for the case when the sphere’s
material has a frequency-independent Ohmic conductivity σ, and εopt = ε0, in two limits:
(i) of a very large skin depth (δs >> R), and
(ii) of a very small skin depth (δs << R).
8.27. Complete the solution of the problem started in Sec. 9, by calculating the full power of
radiation of the system of two charges oscillating in antiphase along the same straight line – see Fig. 16.
Also, calculate the average radiation power for the case of harmonic oscillations, d(t) = a cosωt,
compare it with the case of a single charge performing similar oscillations, and interpret the difference.
8.28. The system of four alternating charges located at the angles of a square, considered in
Problem 3.3(i), is now being rotated around the axis normal to their plane and passing through the
square’s center, with a constant angular frequency ω << v/a. Calculate the time-averaged angular
distribution and the total power of the resulting radiation.
Fig. 9.1. Translational mutual motion of two reference frames.
In the non-relativistic (Newtonian) mechanics the problem of transfer between such reference
frames has a simple solution at least in the limit v << c, because the basic equation of particle dynamics
(the 2nd Newton law)1
$$m_k\ddot{\mathbf{r}}_k = -\nabla_k\sum_{k'}U(\mathbf{r}_k - \mathbf{r}_{k'}), \tag{9.1}$$
where U is the potential energy of inter-particle interactions, is invariant with respect to the so-called
Galilean transformation (or just “transform” for short).2 Choosing the coordinates in both frames so that
their axes x and x’ are parallel to the vector v (as in Fig. 1), the transform may be represented as
1 Let me hope that the reader does not need a reminder that for Eq. (1) to be valid, the reference frames 0 and 0’
have to be inertial – see, e.g., CM Sec. 1.2.
2 It had been first formulated by Galileo Galilei, if only rather informally, as early as 1638 – four years before
Isaac Newton was born! Note also the very unfortunate term “boost” used sometimes to describe such
translational transformations. (It is especially unnatural in the special relativity, not describing accelerations.) In
my course, this term is avoided, with the equivalent “transform” used instead.
Galilean transform
$$x = x' + vt', \quad y = y', \quad z = z', \quad t = t', \tag{9.2a}$$
and plugging Eq. (2a) into Eq. (1), we get an absolutely similarly looking equation of motion in the
“moving” reference frame 0’. Since the reciprocal transform,
$$x' = x - vt, \quad y' = y, \quad z' = z, \quad t' = t, \tag{9.2b}$$
is similar to the direct one, with the replacement of (+v) with (–v), we may say that the Galilean
invariance means that there is no “master” (absolute) spatial reference frame in classical mechanics,
although the spatial and temporal intervals between different instant events are absolute, i.e. reference-
frame invariant: Δx = Δx’,…, Δt = Δt’.
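However, it is straightforward to use Eq. (2) to check that the form of the wave equation
$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)f = 0, \tag{9.3}$$
describing, in particular, the electromagnetic wave propagation in free space,3 is not Galilean-invariant.4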
For the “usual” (say, elastic) waves, which obey a similar equation albeit with a different speed,5 this
lack of Galilean invariance is natural and is compatible with the invariance of Eq. (1), from which the
wave equation originates. This is because the elastic waves are essentially the oscillations of interacting
particles of a certain medium (e.g., an elastic solid), making the reference frame connected to this
medium, special. So, if the electromagnetic waves were oscillations of a certain special medium (which
was first called the “luminiferous aether”6 and later aether – or just “ether”), similar arguments might be
applicable to reconcile Eqs. (2) and (3).
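The lack of Galilean invariance of Eq. (3) may be illustrated numerically. The following is a minimal sketch, not a proof: it takes one particular solution of the 1D wave equation (a sinusoidal plane wave, in units with c = 1), rewrites it in the coordinates of frame 0’ using Eq. (2a), and evaluates the wave operator by finite differences. All names and numerical values are assumptions made for the illustration.

```python
import math

C = 1.0  # wave speed, in units where c = 1

def wave(x, t, k=2.0):
    """A solution of Eq. (3) in frame 0: a plane wave moving along x with speed C."""
    return math.sin(k * (x - C * t))

def wave_operator(f, x, t, h=1e-3):
    """Finite-difference estimate of (d^2/dx^2 - (1/C^2) d^2/dt^2) f at the point (x, t)."""
    fxx = (f(x + h, t) - 2.0 * f(x, t) + f(x - h, t)) / h ** 2
    ftt = (f(x, t + h) - 2.0 * f(x, t) + f(x, t - h)) / h ** 2
    return fxx - ftt / C ** 2

def galilean_image(f, v):
    """The same field expressed via the coordinates (x', t') of frame 0', using Eq. (2a)."""
    return lambda xp, tp: f(xp + v * tp, tp)

residual_rest = wave_operator(wave, 0.3, 0.7)
residual_moving = wave_operator(galilean_image(wave, 0.5), 0.3, 0.7)
print(residual_rest, residual_moving)
```

In frame 0 the residual vanishes within the numerical accuracy, while in frame 0’ it does not: the Galilean-transformed wave propagates with speed c – v, so it cannot satisfy the same equation with the same c.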
The detection of such a medium was the goal of the measurements carried out between 1881 and
1887 (with better and better precision) by Albert Abraham Michelson and Edward Williams Morley,
which are sometimes called “the most famous failed experiments in physics”. Figure 2 shows a crude
scheme of these experiments.
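[Fig. 9.2: scheme of the Michelson-Morley experiments: a light source, a semi-transparent mirror, and two mirrors; the right panel shows the two orientations (expt 1, expt 2) relative to the Earth’s velocity vE.]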
3 The discussions in this chapter and most of the next chapter will be restricted to the free-space (and hence
dispersion-free) case; some media effects on the radiation by relativistic particles will be discussed in Sec.10.4.
4 It is interesting that the usual (non-relativistic) Schrödinger equation, whose fundamental solution for a free
particle is a similar monochromatic wave (albeit with a different dispersion law), is Galilean-invariant, with a
certain change of the wavefunction’s phase – see, e.g., QM Chapter 1.
5 See, e.g., CM Secs. 6.5 and 7.7.
6 In ancient Greek mythology, aether is the clean air breathed by the gods residing on Mount Olympus.
A nearly monochromatic wave from a light source is split into two parts (optimally, of equal
intensity), using a semi-transparent mirror tilted by the angle π/4 to the incident wave direction. These
two partial waves are reflected back by two fully-reflecting mirrors and arrive at the same semi-
transparent mirror again. Here half of each wave is directed toward the light source (they vanish there
without affecting the source), but another half is passed toward an intensity detector, forming, with its
counterpart, an interference pattern similar to that in the Young experiment. Thus each of the interfering
waves has traveled twice (back and forth) each of two mutually perpendicular “arms” of the
interferometer. Assuming that the aether, in which light propagates with speed c, moves with speed v < c
along one of the arms, of length l_l, it is straightforward (and hence left for the reader’s exercise :-) to get
the following expression for the difference between the light roundtrip times:
$$\Delta t = \frac{2l_t}{c\left(1 - v^2/c^2\right)^{1/2}} - \frac{2l_l}{c\left(1 - v^2/c^2\right)} \approx -\frac{l}{c}\left(\frac{v}{c}\right)^2, \tag{9.4}$$
where l_t is the length of the second, “transverse” arm of the interferometer (perpendicular to v), and the
last, approximate expression is valid at l_t ≈ l_l ≡ l and v << c.
Since the Earth moves around the Sun with a speed vE ≈ 30 km/s ≈ 10⁻⁴ c, the arm positions
relative to this motion alternate, due to the Earth’s rotation about its axis, every 6 hours – see the right
panel of Fig. 2. Hence if we assume that the aether rests in the Sun’s reference frame, then Δt (and the
corresponding shift of the interference fringes), has to change its sign with this half-period as well. The
same alternation may be achieved, at a smaller time scale, by a deliberate rotation of the instrument by
π/2. In the most precise version of the Michelson-Morley experiment (circa 1887), this shift was
expected to be close to 0.4 of the interference pattern period. The results of the search for such a shift
were negative, with the error bar about 0.01 of the period.7
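The expected shift of about 0.4 of the fringe period may be reproduced from Eq. (4) with a few lines of code. This is just a rough sketch: the effective arm length (11 m) and the light wavelength (550 nm) are illustrative assumptions rather than values given in the text, and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def roundtrip_time_difference(l_t, l_l, v):
    """Eq. (4): transverse minus longitudinal roundtrip time, for the aether speed v."""
    b2 = (v / C) ** 2
    return 2.0 * l_t / (C * math.sqrt(1.0 - b2)) - 2.0 * l_l / (C * (1.0 - b2))

def fringe_shift_on_rotation(l, v, wavelength):
    """Fringe shift (in periods) at the instrument's rotation by pi/2, when dt changes sign."""
    return 2.0 * abs(roundtrip_time_difference(l, l, v)) * C / wavelength

v_earth = 3.0e4  # m/s, the Earth's orbital speed quoted in the text
arm = 11.0       # m, ASSUMED effective arm length
lam = 5.5e-7     # m, ASSUMED wavelength of visible light

shift = fringe_shift_on_rotation(arm, v_earth, lam)
print(shift)
```

With these assumed parameters, the shift is indeed close to 0.4 of the period, i.e. much larger than the quoted error bar of about 0.01.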
The most prominent immediate explanation for this zero result8 was suggested in 1889 by
George Francis FitzGerald and (independently and more qualitatively) by H. Lorentz in 1892: as evident
from Eq. (4), if the longitudinal arm of the interferometer itself experiences the so-called length
contraction:
$$l_l(v) = l_l(0)\left(1 - \frac{v^2}{c^2}\right)^{1/2}, \tag{9.5}$$
while the transverse arm’s length is not affected by its motion through the aether, this effect kills the
shift Δt. This radical idea received strong support from the proof, in 1887-1905, that the Maxwell
equations, and hence the wave equation (3), are form-invariant under the so-called Lorentz transform,9
which in particular describes Eq. (5). For the choice of coordinates shown in Fig. 1, the transform reads
$$x = \frac{x' + vt'}{\left(1 - v^2/c^2\right)^{1/2}}, \quad y = y', \quad z = z', \quad t = \frac{t' + (v/c^2)x'}{\left(1 - v^2/c^2\right)^{1/2}}. \tag{9.6a}$$
7 Through the 20th century, the Michelson-Morley-type experiments were repeated using more and more refined
experimental techniques, always with zero results for the apparent aether motion speed. For example, recent
experiments using cryogenically cooled optical resonators have reduced the upper limit for such speed to just
3×10⁻¹⁵ c – see H. Müller et al., Phys. Rev. Lett. 91, 020401 (2003).
8 The zero result of a slightly later experiment, namely a precise measurement of the torque that should be exerted
by the moving aether on a charged capacitor, carried out in 1903 by F. Trouton and H. Noble (following G.
FitzGerald’s suggestion), seconded the Michelson and Morley’s conclusions.
9 The theoretical work toward this result included important contributions by Woldemar Voigt (in 1887), Hendrik
Lorentz (in 1892-1904), Joseph Larmor (in 1897 and 1900), and Henri Poincaré (in 1900 and 1905).
It is elementary to solve these equations for the primed coordinates to get the reciprocal transform
$$x' = \frac{x - vt}{\left(1 - v^2/c^2\right)^{1/2}}, \quad y' = y, \quad z' = z, \quad t' = \frac{t - (v/c^2)x}{\left(1 - v^2/c^2\right)^{1/2}}. \tag{9.6b}$$
(I will soon represent Eqs. (6) in a more elegant form – see Eqs. (19) below.)
The Lorentz transform relations (6) are evidently reduced to the Galilean transform formulas (2)
at v² << c². However, all attempts to give a reasonable interpretation of these equalities while keeping
the notion of the aether have failed, in particular because of the restrictions imposed by results of earlier
experiments carried out in 1851 and 1853 by Hippolyte Fizeau – which were repeated with higher
accuracy by the same Michelson and Morley in 1886. These experiments have shown that if one sticks
to the aether concept, this hypothetical medium has to be partially “dragged” by any moving dielectric
material with a speed proportional to (κ – 1), where κ is the material’s dielectric constant. Such local
drag would be irreconcilable with the assumed continuity of the aether.
In his famous 1905 paper, Albert Einstein suggested a bold resolution of this contradiction,
essentially removing the concept of the aether altogether.10 Moreover, he argued that the Lorentz
transform is the general property of time and space, rather than of the electromagnetic field alone. He
started with two postulates, the first one essentially repeating the relativity principle formulated a bit
earlier (in 1904) by H. Poincaré in the following form:
“…the laws of physical phenomena should be the same, whether for an observer fixed or for
an observer carried along in a uniform movement of translation; so that we have not and
could not have any means of discerning whether or not we are carried along in such a
motion.”11
The second Einstein postulate was that the speed of light c, in free space, should be constant in
all reference frames. (This is essentially a denial of the aether’s existence.)
Then, Einstein showed that the Lorentz transform relations (6) naturally follow from his
postulates, with a few (very natural) additional assumptions. Let a point source emit a short flash of
light, at the moment t = t’ = 0 when the origins of the reference frames shown in Fig. 1 coincide. Then,
according to the second of Einstein’s postulates, in each of the frames, the spherical wave propagates
with the same speed c, i.e. the coordinates of points of its front, measured in the two frames, have to
obey the following equalities:
$$(ct)^2 - (x^2 + y^2 + z^2) = 0,$$
$$(ct')^2 - (x'^2 + y'^2 + z'^2) = 0. \tag{9.7}$$
10 In hindsight, this was much relief, because the aether had been a very awkward construct to start with. In
particular, according to the basic theory of elasticity (see, e.g., CM Ch. 7), in order to carry such transverse waves
as the electromagnetic ones, this medium would need to have a non-zero shear modulus, i.e. behave as an elastic
solid – rather than as a rarified gas hypothesized initially by C. Huygens.
11 Note that though the relativity principle excludes the notion of the special (“absolute”) spatial reference frame,
its quoted verbal formulation still leaves the possibility of the Galilean “absolute time” t = t’ open. The
quantitative relativity theory kills this option – see Eqs. (6) and their discussion below.
What may be the general relation between the combinations in the left-hand side of these equations –
not for this particular wave’s front, but in general? A very natural (essentially, the only justifiable)
choice is
$$(ct)^2 - (x^2 + y^2 + z^2) = f(v^2)\left[(ct')^2 - (x'^2 + y'^2 + z'^2)\right]. \tag{9.8}$$
Now, according to the first postulate, the same relation should be valid if we swap the reference frames
(x ↔ x’, etc.) and replace v with (–v). This is only possible if f² = 1, so excluding the option f = –1
(which is incompatible with the Galilean transform in the limit v/c → 0), we are left with f = +1, i.e.
$$(ct)^2 - (x^2 + y^2 + z^2) = (ct')^2 - (x'^2 + y'^2 + z'^2). \tag{9.9}$$
For the line with y = y’ = 0 and z = z’ = 0, Eq. (9) is reduced to
$$(ct)^2 - x^2 = (ct')^2 - x'^2. \tag{9.10}$$
It is very illuminating to interpret this relation as the one resulting from a mutual rotation of the
reference frames (that now have to include clocks to measure time) on the plane of the coordinate x and
the so-called imaginary time τ ≡ ict – see Fig. 3.
Fig. 9.3. The Lorentz transform as a mutual rotation of two reference frames on the [x, τ] plane.
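$$\left.\frac{x}{\tau}\right|_{x'=0} = -\tan\varphi. \tag{9.14}$$
These two expressions are compatible only if
$$\tan\varphi = \frac{iv}{c}, \tag{9.15}$$
so
$$\sin\varphi \equiv \frac{\tan\varphi}{\left(1 + \tan^2\varphi\right)^{1/2}} = \frac{iv/c}{\left(1 - v^2/c^2\right)^{1/2}} \equiv i\beta\gamma, \qquad \cos\varphi \equiv \frac{1}{\left(1 + \tan^2\varphi\right)^{1/2}} = \frac{1}{\left(1 - v^2/c^2\right)^{1/2}} \equiv \gamma, \tag{9.16}$$
where β and γ are two very convenient and commonly used dimensionless parameters defined as
Relativistic parameters β and γ
$$\boldsymbol{\beta} \equiv \frac{\mathbf{v}}{c}, \qquad \gamma \equiv \frac{1}{\left(1 - v^2/c^2\right)^{1/2}} = \frac{1}{\left(1 - \beta^2\right)^{1/2}}. \tag{9.17}$$
(The vector β is called the normalized velocity, while the scalar γ is the Lorentz factor.)12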
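Equations (15)-(16) may be verified with complex arithmetic. The following short sketch (with hypothetical function names, and β = 0.6 chosen only for illustration) uses the complex inverse tangent from the Python standard library to find the rotation angle from Eq. (15), and then compares its sine and cosine with iβγ and γ.

```python
import cmath
import math

def rotation_angle(beta):
    """Complex angle phi defined by tan(phi) = i*beta, cf. Eq. (15)."""
    return cmath.atan(1j * beta)

def lorentz_factor(beta):
    """gamma as defined by Eq. (17)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

beta = 0.6  # an illustrative value of v/c
phi = rotation_angle(beta)
print(phi, cmath.sin(phi), cmath.cos(phi))
print(1j * beta * lorentz_factor(beta), lorentz_factor(beta))
```

For any |β| < 1, the angle turns out to be purely imaginary, equal to i tanh⁻¹β, so the “rotation” on the [x, τ] plane is by an imaginary angle.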
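Using the above relations for φ, Eqs. (12) become
$$x = \gamma\left(x' - i\beta\tau'\right), \qquad \tau = \gamma\left(i\beta x' + \tau'\right), \tag{9.18a}$$
$$x' = \gamma\left(x + i\beta\tau\right), \qquad \tau' = \gamma\left(\tau - i\beta x\right). \tag{9.18b}$$
Now returning to the real variables [x, ct], we get the Lorentz transform relations (6), in a more compact
form:
Lorentz transform – again
$$x = \gamma\left(x' + \beta ct'\right), \quad y = y', \quad z = z', \quad ct = \gamma\left(ct' + \beta x'\right), \tag{9.19a}$$
$$x' = \gamma\left(x - \beta ct\right), \quad y' = y, \quad z' = z, \quad ct' = \gamma\left(ct - \beta x\right). \tag{9.19b}$$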
An immediate corollary of Eqs. (19) is that for γ to stay real, we need v² ≤ c², i.e. that the speed
of any physical body (to which we could connect a meaningful reference frame) cannot exceed the
speed of light, as measured in any other meaningful reference frame.13
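As a simple illustration of Eqs. (19), here is a minimal sketch (with hypothetical function names) that implements the direct and reciprocal transforms of the pair [x, ct], and may be used to check that they are mutually inverse and conserve the combination (ct)² – x² of Eq. (10).

```python
import math

def lorentz_to_unprimed(xp, ctp, beta):
    """Eq. (19a): (x', ct') -> (x, ct)."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return g * (xp + beta * ctp), g * (ctp + beta * xp)

def lorentz_to_primed(x, ct, beta):
    """Eq. (19b): (x, ct) -> (x', ct')."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return g * (x - beta * ct), g * (ct - beta * x)

def interval_squared(x, ct):
    """The combination (ct)^2 - x^2 participating in Eq. (10)."""
    return ct ** 2 - x ** 2

x, ct = lorentz_to_unprimed(1.0, 2.0, 0.6)  # an event with x' = 1, ct' = 2, seen from frame 0
print(x, ct, interval_squared(x, ct))
print(lorentz_to_primed(x, ct, 0.6))
```

For β = 0.6 (so γ = 1.25), the event {x’ = 1, ct’ = 2} is mapped to {x = 2.75, ct = 3.25}, with the same value (ct)² – x² = 3 in both frames.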
12 Note the following identities: γ² ≡ 1/(1 – β²) and (γ² – 1) ≡ β²/(1 – β²) ≡ β²γ², which are frequently handy in
relativity-related algebra. One more function of β, the rapidity θ ≡ tanh⁻¹β (so that φ = iθ), is also useful for
some calculations.
13 All attempts to rationally conjecture particles moving with v > c (called tachyons) have failed – so far, at least.
Possibly the strongest objection against their existence is the fact that the tachyons could be used to communicate
back in time, thus violating the causality principle – see, e.g., G. Benford et al., Phys. Rev. D 2, 263 (1970).
has a clock and may use it to measure the instants of local events, taking place at the observer’s
location. He also conjectured, very reasonably, that:
(i) all observers within the same reference frame may agree on a common length measure (“a
scale”), i.e. on their relative positions in that frame, and synchronize their clocks,14 and
(ii) the observers belonging to different reference frames may agree on the nomenclature of
world events (e.g., short flashes of light) to which their respective measurements refer.
Actually, these additional postulates have been already implied in our “derivation” of the
Lorentz transform in Sec. 1. For example, by the set {x, y, z, t} we mean the results of space and time
measurements of a certain world event, about which all observers belonging to frame 0 agree. Similarly,
all observers of frame 0’ have to agree about the results {x’, y’, z’, t’}. Finally, when the origin of frame
0’ passes by some sequential points xk of frame 0, the observers in the latter frame may measure its
passage times tk without a fundamental error, and know that all these times belong to x’ = 0.
Now we can analyze the major corollaries of the Lorentz transform, which are rather striking
from the point of view of our everyday (rather non-relativistic) experience.
(i) Length contraction. Let us consider a thin rigid rod oriented along the x-axis, with its length l ≡
x2 – x1, where x1,2 are the coordinates of the rod’s ends, as measured in its rest frame 0, at any instant t
(Fig. 4). What would be the rod’s length l’ measured by the Einstein observers in the moving frame 0’?
Fig. 9.4. The relativistic length contraction.
At a time instant t’ agreed upon in advance, the observers who find themselves exactly at the
rod’s ends, may register that fact, and then subtract their coordinates x’1,2 to calculate the apparent rod
length l’ ≡ x2’ – x1’ in the moving frame. According to Eq. (19a), l may be expressed via this l’ as
$$l \equiv x_2 - x_1 = \gamma\left(x_2' + \beta ct'\right) - \gamma\left(x_1' + \beta ct'\right) \equiv \gamma\left(x_2' - x_1'\right) \equiv \gamma l'. \tag{9.20a}$$
Hence, the rod’s length, as measured in the moving reference frame is
Length contraction
$$l' = \frac{l}{\gamma} = l\left(1 - \frac{v^2}{c^2}\right)^{1/2} \leq l, \tag{9.20b}$$
in accordance with the FitzGerald-Lorentz hypothesis (5). This is the relativistic length contraction
effect: an object is always the longest (has the so-called proper length l) if measured in its rest frame.
14 A posteriori, the Lorentz transform may be used to show that consensus-creating procedures (such as clock
synchronization) are indeed possible. The basic idea of the proof is that since at v << c, the relativistic corrections
to space and time intervals are of the order of (v/c)², they have negligible effects on clocks being brought together
into the same point for synchronization slowly, with a speed u << c. The reader interested in a detailed discussion
of this and other fine points of special relativity may be referred to, e.g., either H. Arzeliès, Relativistic
Kinematics, Pergamon, 1966, or W. Rindler, Introduction to Special Relativity, 2nd ed., Oxford U. Press, 1991.
Note that according to Eqs. (19), the length contraction takes place only in the direction of the relative
motion of two reference frames. As was noted in Sec. 1, this result immediately explains the zero result
of the Michelson-Morley-type experiments, so they give very convincing evidence (if not irrefutable
proof) of Eqs. (18)-(19).
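The measurement procedure leading to Eq. (20) may be mimicked in a few lines of code. In this sketch (whose function names are hypothetical), the first of Eqs. (19a) is solved for x’ to find where each end of the rod, resting in frame 0, is registered at a given instant t’ of frame 0’; the apparent length is the difference of these two coordinates taken at the same t’.

```python
import math

def lorentz_factor(beta):
    return 1.0 / math.sqrt(1.0 - beta ** 2)

def x_primed_of_rod_end(x_rest, ct_primed, beta):
    """Solve the first of Eqs. (19a), x = gamma*(x' + beta*ct'), for x'."""
    return x_rest / lorentz_factor(beta) - beta * ct_primed

def measured_length(x1, x2, ct_primed, beta):
    """Both ends are registered at the same instant t' of frame 0'."""
    return x_primed_of_rod_end(x2, ct_primed, beta) - x_primed_of_rod_end(x1, ct_primed, beta)

print(measured_length(0.0, 1.0, 0.0, 0.6))  # rod of proper length 1, beta = 0.6
print(measured_length(0.0, 1.0, 5.0, 0.6))  # the same rod, registered at a later instant t'
```

The result, l’ = l/γ (here 0.8 of the proper length), does not depend on the agreed-upon instant t’, as it should.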
(ii) Time dilation. Now let us use Eqs. (19a) to find the time interval Δt, as measured in some
reference frame 0, between two world events – say, two ticks of a clock moving with another frame 0’
(Fig. 5), i.e. having fixed values of x’, y’, and z’.
Fig. 9.5. The relativistic time dilation.
Let the time interval between these two events, measured in the clock’s rest frame 0’, be Δt’ ≡ t2’
– t1’. At these two moments, the clock would fly by two Einstein’s observers at rest in frame 0, so they
can record the corresponding moments t1,2 shown by their clocks, and then calculate Δt as their
difference. According to the last of Eqs. (19a),
$$c\Delta t \equiv ct_2 - ct_1 = \gamma\left(ct_2' + \beta x'\right) - \gamma\left(ct_1' + \beta x'\right) \equiv \gamma c\Delta t', \tag{9.21a}$$
so, finally,
Time dilation
$$\Delta t = \gamma\Delta t' \equiv \frac{\Delta t'}{\left(1 - v^2/c^2\right)^{1/2}} \geq \Delta t'. \tag{9.21b}$$
This is the famous relativistic time dilation (or “dilatation”) effect: a time interval is longer if measured
in a frame (in our case, frame 0) moving relative to the clock, while that in the clock’s rest frame is the
shortest possible – the so-called proper time interval.
This rather counter-intuitive effect is the everyday reality in experiments with high-energy
elementary particles. For example, in a typical (and by no means record-breaking) experiment carried
out in Fermilab, a beam of charged 200 GeV pions with γ ≈ 1,400 traveled a distance of l = 300 m with
the measured loss of only 3% of the initial beam intensity due to the pion decay (mostly, into muon-neutrino
pairs) with the proper lifetime t0 ≈ 2.56×10^-8 s. Without the time dilation, only an exp{-l/ct0}
~ 10^-17 fraction of the initial pions would survive, while the relativity-corrected number, exp{-l/ct} =
exp{-l/cγt0} ≈ 0.97, was in full accordance with experimental measurements.
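The two survival fractions quoted above are easy to re-derive numerically. The following is only a minimal sketch: the function name and the charged-pion rest energy (about 0.14 GeV, used to estimate γ) are assumptions of this example rather than values given in the text, and the beam speed is taken to be c.

```python
import math

C = 2.998e8                    # speed of light, m/s
PION_REST_ENERGY_GEV = 0.1396  # charged-pion rest energy; assumed value, not given in the text


def surviving_fraction(path_m, proper_lifetime_s, gamma=1.0):
    """Fraction of a beam moving at essentially c that survives a path of path_m meters.

    gamma = 1 switches the time dilation off; gamma > 1 stretches the lifetime to gamma * t0.
    """
    return math.exp(-path_m / (C * gamma * proper_lifetime_s))


gamma = 200.0 / PION_REST_ENERGY_GEV   # E / (m c^2) for a 200 GeV pion
no_dilation = surviving_fraction(300.0, 2.56e-8)
with_dilation = surviving_fraction(300.0, 2.56e-8, gamma)

print(f"gamma ~ {gamma:.0f}")
print(f"without time dilation: {no_dilation:.1e}")
print(f"with time dilation:    {with_dilation:.3f}")
```

The contrast between the two printed fractions is the whole point: the exponent is reduced by the factor γ, which turns an essentially complete decay into a loss of a few percent.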
As another example, the global positioning systems (say, the GPS) are designed to account for
the time dilation due to the velocity of their satellites (and also for some gravity-induced, i.e. general-relativity
corrections, which I would not have time to discuss) and would give large errors without such
corrections. So, there is no doubt that time dilation (21) is a reality, though the precision of its
experimental tests I am aware of15 has been limited to a few percent, because of the almost unavoidable
involvement of less controllable gravity effects – which provide a time interval change of the opposite
sign in most experiments near the Earth’s surface.
Before the first reliable observation of time dilation (by B. Rossi and D. Hall in 1940), there had
been serious doubts about the reality of this effect, the most famous being the twin paradox first posed
(together with an immediate suggestion of its resolution) by P. Langevin in 1911. Let us send one of two
twins on a long space roundtrip with the maximum speed approaching c. Upon their return to Earth, which
of the twins would be older? The naïve approach is to say that due to the relativity principle, neither can
be (and hence there is no time dilation), because each twin could claim that their counterpart, rather than
they, was moving, with the same speed but in the opposite direction. The resolution of the paradox is
that one of the twins had to be accelerated to be brought back, and hence the reference frames have to be
dissimilar: only one of them may stay inertial all the time. As a result, the twin who had been
accelerated (“actually traveling”) would be younger than their sibling when they finally came together.
A constructive proof of this conclusion, for the particular case of straight-line travel with a
piecewise-constant acceleration, is simple and hence left for the reader’s exercise.
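As a numerical companion to that exercise (not a replacement for it), the traveler's proper time may be accumulated step by step from dτ = dt(1 – u^2/c^2)^1/2, i.e. Eq. (21b) applied to small intervals. The sketch below assumes, purely for illustration, that on each of four segments the speed measured in the Earth frame changes linearly with the Earth time; the function names and the numbers are hypothetical.

```python
import math


def traveler_proper_time(beta_max, segment_time, steps=20_000):
    """Proper time of the traveling twin for a four-segment straight-line round trip.

    Assumption of this sketch: on each segment the speed, as measured in the Earth
    frame, changes linearly in Earth time between 0 and beta_max * c (speed up,
    slow down, speed up toward home, slow down). Each segment lasts segment_time
    on Earth clocks and, by symmetry, contributes the same proper time.
    """
    dt = segment_time / steps
    tau = 0.0
    for i in range(steps):
        beta = beta_max * (i + 0.5) / steps        # midpoint speed on this time step
        tau += math.sqrt(1.0 - beta * beta) * dt   # d(tau) = dt / gamma
    return 4.0 * tau


def closed_form(beta_max, segment_time):
    """Exact integral of sqrt(1 - beta^2) for the same linear speed ramp."""
    b = beta_max
    return 4.0 * segment_time * 0.5 * (math.sqrt(1.0 - b * b) + math.asin(b) / b)


earth_time = 4 * 1.0                       # four segments of one year each, Earth clocks
tau = traveler_proper_time(0.99, 1.0)
print(f"Earth twin ages {earth_time:.2f} yr, traveler ages {tau:.2f} yr")
```

Whatever the speed profile, the integrand never exceeds 1, so the accelerated twin always accumulates less proper time than the twin who stayed in one inertial frame.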
(iii) Velocity transformation. Now let us calculate the velocity u of a moving point, as observed
in reference frame 0, provided that its velocity, as measured in frame 0’, is u’ (Fig. 6).
Fig. 9.6. The relativistic velocity addition.
Keeping the usual definition of velocity, but with due attention to the relativity of not only
spatial but also temporal intervals, we may write
$$\mathbf{u} = \frac{d\mathbf{r}}{dt}, \qquad \mathbf{u}' = \frac{d\mathbf{r}'}{dt'}. \tag{9.22}$$
Plugging in the differentials of the Lorentz transform relations (6a) into these definitions, we get
$$u_x = \frac{dx}{dt} = \frac{dx' + v\,dt'}{dt' + v\,dx'/c^2} = \frac{u'_x + v}{1 + u'_x v/c^2}, \qquad u_y = \frac{dy}{dt} = \frac{1}{\gamma}\,\frac{dy'}{dt' + v\,dx'/c^2} = \frac{1}{\gamma}\,\frac{u'_y}{1 + u'_x v/c^2}, \tag{9.23}$$
with a similar formula for uz. In the classical limit v/c → 0, these relations are reduced to
$$u_x \to u'_x + v, \qquad u_y \to u'_y, \qquad u_z \to u'_z. \tag{9.24a}$$
In order to see how unusual the full relativistic rules (23) are at u ~ c, let us first consider a
purely longitudinal motion, uy = uz = 0; then16
16 Taking into account the identity tanh(a + b) = (tanh a + tanh b)/(1 + tanh a tanh b), which readily follows from
MA Eq. (3.5), Eq. (25) shows that rapidities tanh^-1 β add up exactly as longitudinal velocities at non-relativistic
motion, making that notion very convenient for the analysis of transfer between several frames.
Longitudinal velocity addition:
$$u = \frac{u' + v}{1 + u'v/c^2}, \tag{9.25}$$
where u ≡ ux and u’ ≡ u’x. Figure 7 shows this u as the function of u’, for several values of the reference
frames’ relative velocity v.
Fig. 9.7. The addition of longitudinal velocities (u/c plotted versus u’/c, for several values of v/c).
The first sanity check is that if v = 0, i.e. if the reference frames are at rest relative to each other,
then u = u’, as it should be – see the diagonal straight line in Fig. 7. Next, if magnitudes of u’ and v are
both below c, so is the magnitude of u. (Also good, because otherwise, ordinary particles in one frame
would be tachyons in the other one, and the theory would be in big trouble.) Now strange things begin:
even if u’ and v both approach c, u also comes close to c but does not exceed it. As an example,
if we fired forward a bullet with the relative speed of 0.9c, from a spaceship moving from the Earth also
at 0.9c, Eq. (25) predicts the speed of the bullet relative to the Earth to be just [(0.9 + 0.9)/(1 +
0.9×0.9)]c ≈ 0.994c < c, rather than (0.9 + 0.9)c = 1.8c > c as in the Galilean kinematics. Actually, we
could expect this strangeness, because it is necessary to fulfill Einstein’s 2nd postulate: the
independence of the speed of light of the reference frame. Indeed, for u’ = c, Eq. (25) yields u = c,
regardless of v.
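The sanity checks of this paragraph, together with the rapidity remark of footnote 16, may be reproduced with a few lines of code. This is a minimal sketch with velocities expressed in units of c; the function name is an assumption of the example.

```python
import math


def add_velocities(beta_u_prime, beta_v):
    """Eq. (25) in units of c: speed u seen from frame 0 for a point moving at u' in frame 0'."""
    return (beta_u_prime + beta_v) / (1.0 + beta_u_prime * beta_v)


bullet = add_velocities(0.9, 0.9)     # bullet at 0.9c fired from a ship moving at 0.9c
light = add_velocities(1.0, 0.9)      # u' = c gives u = c for any v

# Rapidities artanh(beta) of collinear motions simply add up (footnote 16).
rapidity_sum = math.atanh(0.3) + math.atanh(0.4)
rapidity_of_sum = math.atanh(add_velocities(0.3, 0.4))

print(f"0.9c combined with 0.9c: {bullet:.4f} c")
print(f"c combined with 0.9c:    {light:.4f} c")
print(f"rapidities: {rapidity_sum:.6f} vs {rapidity_of_sum:.6f}")
```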
In the opposite case of a purely transverse motion, when a point moves across the relative
motion of the frames (for example, at our choice of coordinates, u’x = u’z = 0), Eqs. (23) yield a much
less spectacular result
$$u_y = \frac{1}{\gamma}\,u'_y \leq u'_y. \tag{9.26}$$
This effect comes purely from the time dilation because the transverse spatial intervals are Lorentz-
invariant.
In the case when both ux’ and uy’ are substantial (but uz’ is still zero), we may divide Eqs. (23)
by each other to relate the angles of the point’s propagation, as observed in the two reference frames:
Stellar aberration effect:
$$\tan\theta \equiv \frac{u_y}{u_x} = \frac{1}{\gamma}\,\frac{u'_y}{u'_x + v} = \frac{1}{\gamma}\,\frac{\sin\theta'}{\cos\theta' + v/u'}. \tag{9.27}$$
This expression describes, in particular, the so-called stellar aberration effect: the dependence of the
observed direction θ toward a star on the speed v of the telescope’s motion relative to the star – see Fig.
8. (The effect is readily observable experimentally as the annual aberration due to the periodic change
of speed v by 2vE ≈ 60 km/s because of the Earth’s rotation about the Sun. Since the aberration’s main
part is of the first order in vE/c ~ 10^-4, this effect is very significant and has been known since the early
1700s.)
Fig. 9.8. The stellar aberration.
For the analysis of this effect, it is sufficient to take, in Eq. (27), u’ = c, i.e. v/u’ = β, and
interpret θ’ as the “proper” direction to the star, which would be measured at v = 0 (see footnote 17). At β << 1, both
Eq. (27) and the Galilean result (which the reader is invited to derive directly from Fig. 8),
$$\tan\theta = \frac{\sin\theta'}{\cos\theta' + \beta}, \tag{9.28}$$
may be well approximated by the first-order term
$$\Delta\theta \equiv \theta - \theta' \approx -\beta\sin\theta'. \tag{9.29}$$
Unfortunately, it is not easy to use the difference between Eqs. (27) and (28), of the second order in β,
for special relativity’s confirmation, because other components of the Earth’s motion, such as its
rotation, nutation, and torque-induced precession,18 give masking first-order contributions to the
aberration.
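To get a feeling for the orders of magnitude involved, the relativistic Eq. (27) (with u’ = c), the Galilean Eq. (28), and the first-order estimate (29) may be compared numerically. In this sketch, the function name, the sample angle, and the value β = 10^-4 are illustrative assumptions.

```python
import math


def observed_angle(theta_prime, beta, relativistic=True):
    """Direction theta in frame 0 for light (u' = c) propagating at theta' in frame 0'.

    Implements Eq. (27) with v/u' = beta, or the Galilean Eq. (28) if relativistic=False.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta * beta) if relativistic else 1.0
    return math.atan2(math.sin(theta_prime) / gamma, math.cos(theta_prime) + beta)


beta = 1e-4                      # roughly the Earth's orbital speed over c
theta_prime = math.pi / 4        # sample "proper" direction, chosen arbitrarily

rel = observed_angle(theta_prime, beta)
gal = observed_angle(theta_prime, beta, relativistic=False)
first_order = -beta * math.sin(theta_prime)      # Eq. (29)

print(f"first-order shift:       {first_order:.3e} rad")
print(f"relativistic shift:      {rel - theta_prime:.3e} rad")
print(f"relativistic - Galilean: {rel - gal:.3e} rad")
```

The last printed number is smaller than the first two by a factor of the order of β, which illustrates why the purely relativistic part of the aberration is so hard to isolate.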
Finally, for a completely arbitrary direction of the vector u’, Eqs. (22) may be readily used to
calculate the velocity’s magnitude. The most popular form of the result is the following
expression for the square of the relative velocity (or rather the reduced relative velocity β) of two points,
$$\beta^2 = \frac{(\boldsymbol{\beta}_1 - \boldsymbol{\beta}_2)^2 - |\boldsymbol{\beta}_1 \times \boldsymbol{\beta}_2|^2}{(1 - \boldsymbol{\beta}_1 \cdot \boldsymbol{\beta}_2)^2} \leq 1, \tag{9.30}$$
where β1,2 ≡ v1,2/c are their normalized velocities as measured in the same reference frame.
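Equation (30) is easy to probe numerically. The sketch below (helper names and sample vectors are assumptions of the example) checks that for collinear motion it reproduces the longitudinal rule (25), and that for a generic pair of sub-luminal velocities the result stays below 1.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def relative_beta_squared(b1, b2):
    """Eq. (30): squared reduced relative velocity of two points with velocities b1*c, b2*c."""
    diff = tuple(x - y for x, y in zip(b1, b2))
    c = cross(b1, b2)
    return (dot(diff, diff) - dot(c, c)) / (1.0 - dot(b1, b2)) ** 2


# Collinear motion: the cross product vanishes, and Eq. (30) reduces to Eq. (25)
# with one of the velocities reversed.
head_on = relative_beta_squared((0.9, 0.0, 0.0), (-0.9, 0.0, 0.0))

# Generic directions: the result stays below 1 for two sub-luminal velocities.
skew = relative_beta_squared((0.8, 0.3, 0.0), (-0.5, 0.6, 0.4))

print(head_on ** 0.5, skew)
```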
17 Strictly speaking, to reconcile the geometries shown in Fig. 1 (for which all our formulas, including Eq. (27),
are valid) and Fig. 8 (giving the traditional scheme of the stellar aberration), it is necessary to invert the signs of u
(and hence of sin θ’ and cos θ’) and v, but as is evident from Eq. (27), all the minus signs cancel, and the formula
is valid “as is”.
18 See, e.g., CM Secs. 4.4-4.5.
(iv) The Doppler effect. Let us consider a monochromatic plane wave of some physical nature,
traveling along the x-axis:
$$f = \mathrm{Re}\left[f_\omega \exp\{i(kx - \omega t)\}\right] = |f_\omega| \cos(kx - \omega t + \arg f_\omega) \equiv |f_\omega| \cos\Psi. \tag{9.31}$$
Its total phase, Ψ ≡ kx – ωt + arg fω (in contrast to its amplitude |fω| – see Sec. 5 below), cannot depend
on the observer’s reference frame, because the variable f vanishes completely at Ψ = π(n + ½) (for all
integer n), and such “world events” should be observable in all reference frames. The only way to keep
Ψ = Ψ’ at all times is to have19
$$kx - \omega t = k'x' - \omega't'. \tag{9.32}$$
First, let us use this general relation to consider the Doppler effect in the usual non-relativistic
mechanical waves, e.g., oscillations of particles of a certain medium. Using the Galilean transform (2),
we may rewrite Eq. (32) as
$$k(x' + vt) - \omega t = k'x' - \omega't. \tag{9.33}$$
Since this transform leaves all space intervals (including the wavelength λ = 2π/k) intact, we can take k
= k’, so Eq. (33) yields
$$\omega' = \omega - kv. \tag{9.34}$$
For a dispersion-free medium, the wave number k is the ratio of its frequency ω, as measured in
the reference frame bound to the medium, and the wave velocity vw. In particular, if the wave source
rests in the medium, we may bind the reference frame 0 to the medium as well, and frame 0’ to the
wave’s receiver (i.e. v = vr), so
$$k = \frac{\omega}{v_w}, \tag{9.35}$$
and for the frequency perceived by the receiver, Eq. (34) yields
$$\omega' = \omega\,\frac{v_w - v_r}{v_w}. \tag{9.36}$$
On the other hand, if the receiver and the medium are at rest in the reference frame 0’, while the wave
source is bound to the frame 0 (so v = –vs), Eq. (35) should be replaced with
$$k = k' = \frac{\omega'}{v_w}, \tag{9.37}$$
and Eq. (34) yields a different result:
$$\omega' = \omega\,\frac{v_w}{v_w - v_s}. \tag{9.38}$$
Finally, if both the source and detector are moving, it is straightforward to combine these two results to
get the general relation
$$\omega' = \omega\,\frac{v_w - v_r}{v_w - v_s}. \tag{9.39}$$
19 Strictly speaking, Eq. (32) is valid to an additive constant, but for notation simplicity, it may be always made
equal to zero by selecting (as has already been done in all relations of Sec. 1) the reference frame origins and/or
clock turn-on times so that at t = 0 and x = 0, t’ = 0 and x’ = 0 as well.
At low speeds of both the source and the receiver, this result simplifies,
$$\omega' \approx \omega\,(1 - \beta), \quad \text{with } \beta \equiv \frac{v_r - v_s}{v_w}, \tag{9.40}$$
but at speeds comparable to vw we have to use the more general Eq. (39). Thus, the usual Doppler effect
is generally affected not only by the relative speed (vr – vs) of the wave’s source and detector but also by
their speeds relative to the medium in which the waves propagate.
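The asymmetry between a moving receiver and a moving source is easy to see in numbers. The sketch below evaluates Eq. (39) for two situations with the same relative speed; the function name, the wave speed of 340 m/s, and the other values are illustrative assumptions, with all velocities counted along the direction of the wave propagation.

```python
def doppler_in_medium(omega, v_wave, v_receiver=0.0, v_source=0.0):
    """Eq. (39): frequency perceived by a receiver, for waves in a medium.

    Both velocities are measured relative to the medium, along the direction of
    wave propagation (from the source toward the receiver).
    """
    return omega * (v_wave - v_receiver) / (v_wave - v_source)


V_SOUND = 340.0     # m/s; an illustrative wave speed, not taken from the text
OMEGA = 1000.0      # emitted frequency, arbitrary units

receding_receiver = doppler_in_medium(OMEGA, V_SOUND, v_receiver=34.0)
receding_source = doppler_in_medium(OMEGA, V_SOUND, v_source=-34.0)
low_speed = OMEGA * (1.0 - 34.0 / V_SOUND)       # Eq. (40) with beta = 0.1

print(receding_receiver, receding_source, low_speed)
```

In both cases vr – vs = 34 m/s, yet the two exact results differ from each other; only their first-order approximation (40) is the same.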
Somewhat counter-intuitively, for the electromagnetic waves the calculations are simpler
because for them the propagation medium (aether) does not exist, the wave velocity equals c in any
reference frame, and there are no two separate cases: we can always take k = ω/c and k’ = ω’/c.
Plugging these relations, together with the Lorentz transform (19a), into the phase-invariance condition
(32), we get
$$\frac{\omega}{c}\,\gamma\,(x' + \beta ct') - \omega\gamma\,\frac{ct' + \beta x'}{c} = \frac{\omega'}{c}\,x' - \omega't'. \tag{9.41}$$
This relation has to hold for any x’ and t’, so we may require that the net coefficients before these
variables vanish. These two requirements yield the same equality:
$$\omega' = \gamma\,\omega\,(1 - \beta). \tag{9.42}$$
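This result is already quite simple, but may be transformed further to be even more illuminating:
$$\omega' = \omega\,\frac{1 - \beta}{(1 - \beta^2)^{1/2}} = \omega\,\frac{1 - \beta}{[(1 + \beta)(1 - \beta)]^{1/2}}. \tag{9.43}$$
At any sign before β, one pair of parentheses cancels, so20
Longitudinal Doppler effect:
$$\omega' = \omega \left(\frac{1 - \beta}{1 + \beta}\right)^{1/2}. \tag{9.44}$$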
Thus the Doppler effect for electromagnetic waves depends only on the relative velocity v = βc
between the wave source and detector – as it should be, given the aether’s absence. At velocities much
lower than c, Eq. (44) may be approximated as
$$\omega' \approx \omega\,\frac{1 - \beta/2}{1 + \beta/2} \approx \omega\,(1 - \beta), \tag{9.45}$$
i.e. in the first approximation in v/c, it tends to the corresponding limit (40) of the usual Doppler
effect.
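The chain of Eqs. (42)-(45), and the reciprocity argument of footnote 20, may be verified with a short numerical sketch. The function name and the sample values of β are assumptions of this example.

```python
import math


def doppler_longitudinal(omega, beta):
    """Eq. (44): frequency in frame 0' moving at beta*c along the wave, relative to frame 0."""
    return omega * math.sqrt((1.0 - beta) / (1.0 + beta))


omega = 1.0
beta = 0.6

received = doppler_longitudinal(omega, beta)
gamma = 1.0 / math.sqrt(1.0 - beta**2)
via_eq_42 = gamma * omega * (1.0 - beta)                # Eq. (42), before simplification

# Going back to the original frame means reversing the sign of beta (footnote 20).
round_trip = doppler_longitudinal(received, -beta)

# At low speed the result approaches the non-relativistic limit omega * (1 - beta).
slow = doppler_longitudinal(omega, 1e-4)

print(received, via_eq_42, round_trip, slow)
```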
If the wave vector k is tilted by angle θ to the vector v (as measured in frame 0), then we have to
repeat the calculations, with k replaced by kx, and components ky and kz left intact at the Lorentz
transform. As a result, Eq. (42) is generalized as
20 It may look like the reciprocal expression of ω via ω’ is different, violating the relativity principle. However, in
this case, we have to change the sign of β, because the relative velocity of the system is opposite, so we return to
Eq. (44) again.
This is the transverse Doppler effect – which is absent in non-relativistic physics. Its first
experimental evidence was obtained using electron beams (as had been suggested in 1906 by J. Stark),
by H. Ives and G. Stilwell in 1938 and 1941. Later, similar experiments were repeated several times, but
the first unambiguous measurements were performed only in 1979 by D. Hasselkamp et al. who
confirmed Eq. (47) with a relative accuracy of about 10%. This precision may not look too spectacular,
but besides the special tests discussed above, the Lorentz transform formulas have been also confirmed,
less directly, by a huge body of other experimental data, especially in high energy physics, agreeing
with calculations incorporating this transform as their part. This is why, with due respect to the spirit of
challenging authority, I should warn the reader: if you decide to challenge the relativity theory (called
“theory” by tradition only), you would also need to explain all these data. Good luck with that!21
where Ljj’ are the elements of the following 4×4 Lorentz transform matrix:
$$\begin{pmatrix} \gamma & \beta\gamma & 0 & 0 \\ \beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{9.51}$$
Since such 4-vectors are a new notion for this course and will be used for many more purposes
than just the space-time transform, we need to discuss the general mathematical rules they obey. Indeed,
21 The same fact, ignored by crackpots, is also valid for other favorite directions of their attacks, including the
Universe expansion, quantum measurement uncertainty, and entropy growth in physics, and the evolution theory
in biology.
as was already mentioned in Sec. 8.9, the usual (three-component) vector is not just any ordered set
(string) of three scalars {Ax, Ay, Az}; if we want it to represent a reference-frame-independent physical
reality, the vector’s components have to obey certain rules at the transfer from one reference frame to
another. In particular, in the non-relativistic limit the vector’s norm (its magnitude squared),
$$A^2 = A_x^2 + A_y^2 + A_z^2, \tag{9.52}$$
should be invariant with respect to the transfer between different reference frames. However, a naïve
extension of this approach to 4-vectors would not work, because, according to the calculations of Sec. 1,
the Lorentz transform keeps intact the combinations of the type (7), with one sign negative, rather than
the sum of all components squared. Hence for the 4-vectors, all the rules of the game have to be
reviewed and adjusted – or rather redefined from the very beginning, for example as follows.22
An arbitrary 4-vector is a string of 4 scalars,23
General 4-vector:
$$\{A_0, A_1, A_2, A_3\}, \tag{9.53}$$
whose components Aj, as measured in the reference frames 0 and 0’ shown in Fig. 1, obey the Lorentz
transform relations similar to Eq. (50):
Lorentz transform, general 4-vector:
$$A_j = \sum_{j'=0}^{3} L_{jj'} A'_{j'}. \tag{9.54}$$
As we have already seen in the example of the space-time 4-vector (48), this means in particular that
$$A_0^2 - \sum_{j=1}^{3} A_j^2 = (A'_0)^2 - \sum_{j=1}^{3} (A'_j)^2. \tag{9.55}$$
This is the so-called Lorentz invariance condition for the 4-vector’s norm. (The difference
between this relation and Eq. (52), pertaining to Euclidian geometry, is the reason why the Minkowski
space is called pseudo-Euclidian.) It is also straightforward to use Eqs. (51) and (54) to check that the
evident generalization of the norm, the scalar product of two arbitrary 4-vectors,
Scalar 4-product:
$$A_0 B_0 - \sum_{j=1}^{3} A_j B_j, \tag{9.56}$$
is also Lorentz-invariant.
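The check suggested here may also be done numerically. The sketch below builds the matrix (51) for a sample β, transforms two arbitrary strings of four numbers per Eq. (54), and compares their scalar product (56) in the two frames; all names and sample components are assumptions of the example.

```python
import math


def lorentz_matrix(beta):
    """Matrix (51) for frame 0' moving along x with speed beta*c relative to frame 0."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, beta * g, 0.0, 0.0],
            [beta * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transform(matrix, a_prime):
    """Eq. (54): components in frame 0 from the components in frame 0'."""
    return [sum(matrix[j][k] * a_prime[k] for k in range(4)) for j in range(4)]


def product4(a, b):
    """Eq. (56): A0*B0 minus the sum of the products of spatial components."""
    return a[0] * b[0] - sum(a[j] * b[j] for j in range(1, 4))


L = lorentz_matrix(0.8)
a_prime = [2.0, 0.5, -1.0, 3.0]     # arbitrary sample components in frame 0'
b_prime = [1.5, -0.7, 0.2, 0.4]

a, b = transform(L, a_prime), transform(L, b_prime)
print(product4(a_prime, b_prime), product4(a, b))   # scalar product in both frames
print(product4(a_prime, a_prime), product4(a, a))   # norm in both frames
```

In contrast, the plain Euclidean sum of all four squared components does change under the same transform, which is exactly why the sign rule of Eq. (56) is needed.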
Now consider the 4-vector corresponding to a small interval between two close world events:
$$\{dx_0, dx_1, dx_2, dx_3\} = \{c\,dt, d\mathbf{r}\}; \tag{9.57}$$
its norm,
Interval:
$$(ds)^2 \equiv dx_0^2 - \sum_{j=1}^{3} dx_j^2 = c^2 (dt)^2 - (dr)^2, \tag{9.58}$$
22 The most prominent alternative, which has both advantages and drawbacks, is to use 4-vectors with one
imaginary component – for example, the imaginary time ict instead of the real product ct in Eq. (48).
23 Such vectors are said to reside in so-called 4D Minkowski spaces – called after Hermann Minkowski who was
the first one to recast (in 1907) the special relativity relations in a form in which the spatial coordinates and time
(or rather ct) are treated on an equal footing.
is of course also Lorentz-invariant. Since the speed of any particle (or signal) cannot be larger than c, for
any pair of world events that are in a causal relation with each other, (dr)^2 cannot be larger than (cdt)^2,
i.e. such a time-like interval (ds)^2 cannot be negative. The 4D surface separating such intervals from
space-like intervals (ds)^2 < 0 is called the light cone (Fig. 9).
Now let us consider two close world events that happen with the same point moving with
velocity u. Then in the frame moving with the point (v = u), the last term on the right-hand side of Eq.
(58) equals zero, while the involved time is the proper one, so
$$ds = c\,d\tau, \tag{9.59}$$
where dτ is the proper time interval. But according to Eq. (21), this means that we can write
$$d\tau = \frac{dt}{\gamma}, \tag{9.60}$$
where dt is the time interval in an arbitrary (besides being inertial) reference frame, while
$$\boldsymbol{\beta} \equiv \frac{\mathbf{u}}{c} \quad \text{and} \quad \gamma \equiv \frac{1}{(1 - \beta^2)^{1/2}} = \frac{1}{(1 - u^2/c^2)^{1/2}} \tag{9.61}$$
are the parameters (17) corresponding to the point’s velocity (u) in that frame, so ds = cdt/γ.24
Let us use Eq. (60) to explore whether a 4-vector may be formed using the spatial Cartesian
components of the point’s velocity
$$\mathbf{u} = \left\{\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right\}. \tag{9.62}$$
Here we have a problem: per Eqs. (22), these components do not obey the Lorentz transform. However,
let us use dτ ≡ dt/γ, the proper time interval of the point, to form the following string:
4-velocity:
$$\left\{\frac{dx_0}{d\tau}, \frac{dx_1}{d\tau}, \frac{dx_2}{d\tau}, \frac{dx_3}{d\tau}\right\} \equiv \left\{c\,\frac{dt}{d\tau}, \frac{d\mathbf{r}}{d\tau}\right\} \equiv \gamma\,\{c, \mathbf{u}\}. \tag{9.63}$$
24 I have opted against using special indices (e.g., βu and γu) to distinguish Eqs. (17) and (61) here and below, in the
hope that the suitable velocity (of either a reference frame or a particle) will be always clear from the context.
As it follows from the comparison of the middle form of this expression with Eq. (48), since the time-space
vector obeys the Lorentz transform, and τ is Lorentz-invariant, the string (63) is a legitimate 4-vector;
it is called the 4-velocity of a point – or of a point particle.
Now we are well equipped to proceed to relativistic dynamics. Let us start with such basic
notions as the momentum p and the energy E – so far, for a free particle.25 Perhaps the most elegant way
to “derive” (or rather guess26) the expressions for p and E as functions of the particle’s velocity u, is
based on analytical mechanics. Due to the conservation of v, the trajectory of a free particle in the 4D
Minkowski space {ct, r} is always a straight line. Hence, from the Hamilton principle,27 we may expect
its action S, between points 1 and 2, to be a linear function of the space-time interval (59):
Free particle, action:
$$S = \alpha \int_1^2 ds = \alpha c \int_1^2 d\tau = \alpha c \int_{t_1}^{t_2} \frac{dt}{\gamma}, \tag{9.64}$$
where α is some constant. On the other hand, in analytical mechanics, the action is defined as
$$S = \int_{t_1}^{t_2} L\,dt, \tag{9.65}$$
where L is the particle’s Lagrangian function.28 Comparing these two expressions, we get
$$L = \frac{\alpha c}{\gamma} = \alpha c \left(1 - \frac{u^2}{c^2}\right)^{1/2}. \tag{9.66}$$
In the non-relativistic limit (u << c), this function tends to
$$L \to \alpha c \left(1 - \frac{u^2}{2c^2}\right) = \alpha c - \frac{\alpha u^2}{2c}. \tag{9.67}$$
In order to correspond to the Newtonian mechanics,29 the last (velocity-dependent) term should equal
mu^2/2. From here we find α = –mc, so, finally,
Free particle, Lagrangian function:
$$L = -mc^2 \left(1 - \frac{u^2}{c^2}\right)^{1/2} \equiv -\frac{mc^2}{\gamma}. \tag{9.68}$$
Now we can find the Cartesian components pj of the particle’s momentum as the generalized
momenta corresponding to the components rj (j = 1, 2, 3) of the 3D radius-vector r:30
25 I am sorry for using, just as in Sec. 6.3, the same traditional notation (p) for the particle’s momentum as had
been used earlier for the electric dipole moment. However, since the latter notion will be virtually unused in the
balance of this course, this may hardly lead to confusion.
26 Indeed, such a derivation uses additional assumptions, however natural (such as the Lorentz-invariance of S),
i.e. it can hardly be considered as a real proof of the final results, so they require experimental confirmation.
Fortunately, such confirmations have been numerous – see below.
27 See, e.g., CM Sec. 10.3.
28 See, e.g., CM Sec. 2.1.
29 See, e.g., CM Eq. (2.19b).
30 See, e.g., CM Sec. 2.3, in particular Eq. (2.31).
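$$p_j = \frac{\partial L}{\partial \dot{r}_j} \equiv \frac{\partial L}{\partial u_j} = -mc^2\,\frac{\partial}{\partial u_j}\left(1 - \frac{u_1^2 + u_2^2 + u_3^2}{c^2}\right)^{1/2} = \frac{m u_j}{(1 - u^2/c^2)^{1/2}} \equiv m\gamma u_j. \tag{9.69}$$
Thus for the 3D vector of momentum, we can write the result in the same form as in non-relativistic
mechanics,
Relativistic momentum:
$$\mathbf{p} = m\gamma\,\mathbf{u} \equiv M\mathbf{u}, \tag{9.70}$$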
where
$$M \equiv \gamma m = \frac{m}{(1 - u^2/c^2)^{1/2}} \tag{9.71}$$
is its relativistic mass, m being the non-relativistic mass of the particle. (More often, m is called the rest mass, because in the
reference frame in which the particle rests, Eq. (71) yields M = m.)
Next, let us return to analytical mechanics to calculate the particle’s energy E (which for a free
particle coincides with its Hamiltonian function H):31
$$E \equiv H = \sum_{j=1}^{3} p_j u_j - L = \mathbf{p}\cdot\mathbf{u} - L = \frac{mu^2}{(1 - u^2/c^2)^{1/2}} + mc^2\left(1 - \frac{u^2}{c^2}\right)^{1/2} \equiv \frac{mc^2}{(1 - u^2/c^2)^{1/2}}. \tag{9.72}$$
Thus, we have arrived at the most famous of Einstein’s formulas – and probably of physics as a whole:
$$E = m\gamma c^2 \equiv Mc^2, \tag{9.73}$$
which expresses the relation between the free particle’s mass and its energy.32 In the non-relativistic
limit, it reduces to
$$E = \frac{mc^2}{(1 - u^2/c^2)^{1/2}} \to mc^2\left(1 + \frac{u^2}{2c^2}\right) = mc^2 + \frac{mu^2}{2}, \tag{9.74}$$
the first term mc^2 being called the rest energy of a particle.
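The route from the Lagrangian (68) to the momentum (70) and the energy (73) may be retraced numerically, replacing the analytical differentiation with a finite difference. This is only a sketch for one-dimensional motion, in units with m = c = 1; the function names and the step h are assumptions of the example.

```python
import math

C = 1.0     # units with c = 1, chosen for this sketch
M = 1.0     # rest mass m, also set to 1


def lagrangian(u):
    """Eq. (68) for a free particle moving with speed u."""
    return -M * C**2 * math.sqrt(1.0 - u**2 / C**2)


def momentum_numeric(u, h=1e-6):
    """Central finite-difference estimate of dL/du."""
    return (lagrangian(u + h) - lagrangian(u - h)) / (2.0 * h)


u = 0.6 * C
gamma = 1.0 / math.sqrt(1.0 - u**2 / C**2)

p = momentum_numeric(u)
energy = p * u - lagrangian(u)          # E = p.u - L, as in Eq. (72)

print(p, M * gamma * u)                 # compare with Eq. (70)
print(energy, M * gamma * C**2)         # compare with Eq. (73)

# Low-speed check of Eq. (74): E - mc^2 approaches m u^2 / 2.
u_slow = 1e-3 * C
kinetic = M * C**2 / math.sqrt(1.0 - u_slow**2 / C**2) - M * C**2
print(kinetic, 0.5 * M * u_slow**2)
```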
Now let us consider the following string of 4 scalars:
4-vector of energy-momentum:
$$\left\{\frac{E}{c}, p_1, p_2, p_3\right\} \equiv \left\{\frac{E}{c}, \mathbf{p}\right\}. \tag{9.75}$$
and comparing the result with Eq. (63), we immediately see that, since m is a Lorentz-invariant constant,
this string is a legitimate 4-vector of energy-momentum. As a result, its norm,
$$\left(\frac{E}{c}\right)^2 - p^2, \tag{9.77a}$$
is Lorentz-invariant, and in particular, has to be equal to the norm in the particle-bound frame. But in
that frame, p = 0, and according to Eq. (73), E = mc^2, and the norm is just
$$\left(\frac{E}{c}\right)^2 = \left(\frac{mc^2}{c}\right)^2 \equiv (mc)^2, \tag{9.77b}$$
so in an arbitrary frame
$$\left(\frac{E}{c}\right)^2 - p^2 = (mc)^2. \tag{9.78a}$$
This very important relation33 between the relativistic energy and momentum (valid for free particles
only!) is usually represented in the form34
Free particle, energy:
$$E^2 = (mc^2)^2 + (pc)^2. \tag{9.78b}$$
According to Eq. (70), in the so-called ultra-relativistic limit u → c, p tends to infinity, while
mc^2 stays constant, so pc/mc^2 → ∞. As follows from Eq. (78), in this limit E ≈ pc. Though the above
discussion was for particles with finite m, the 4-vector formalism allows us to consider compact objects
with zero rest mass as ultra-relativistic particles for which the above energy-to-momentum relation,
$$E = pc, \quad \text{for } m = 0, \tag{9.79}$$
is exact. Quantum electrodynamics35 tells us that under certain conditions, the electromagnetic field
quanta (photons) may be also considered as such massless particles with momentum p = ħk. Plugging
(the modulus of) the last relation into Eq. (78), for the photon’s energy we get E = pc = ħkc = ħω. Please
note again that according to Eq. (73), the relativistic mass of a photon is not equal to zero: M = E/c^2 =
ħω/c^2, so the term “massless particle” has a limited meaning: m = 0. For example, the relativistic mass
of an optical photon is of the order of 10^–36 kg. On the human scale, this is not too much, but still, a
noticeable (approximately one-millionth) part of the rest mass me of an electron.
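Both statements of this paragraph, the approach of E to pc at u → c and the order-of-magnitude estimate of the photon’s relativistic mass, may be checked with a short sketch. The constants are rounded, and the photon energy of 2 eV is an assumed typical optical value, not a number taken from the text.

```python
import math

C = 2.998e8                 # speed of light, m/s
EV = 1.602e-19              # one electron-volt in joules
ELECTRON_MASS = 9.109e-31   # kg


def energy_and_momentum(mass, beta):
    """E and p of a free particle from Eqs. (73) and (70), for speed u = beta * c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * mass * C**2, gamma * mass * beta * C


# Eq. (78b) holds at any speed; near u = c the energy approaches pc.
for beta in (0.1, 0.9, 0.999999):
    energy, p = energy_and_momentum(ELECTRON_MASS, beta)
    mismatch = energy**2 - (ELECTRON_MASS * C**2)**2 - (p * C)**2
    print(beta, mismatch / energy**2, energy / (p * C))

# Relativistic mass M = E / c^2 of a photon; the 2 eV energy is an assumed,
# typical optical value, not a number from the text.
photon_mass = 2.0 * EV / C**2
print(photon_mass, photon_mass / ELECTRON_MASS)
```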
The fundamental relations (70) and (73) have been repeatedly verified in numerous particle
collision experiments, in which the total energy and momentum of a system of particles are conserved –
at the same conditions as in non-relativistic dynamics. (For the momentum, this is the absence of
external forces, and for the energy, the elasticity of particle interactions – in other words, the absence of
alternative channels of energy escape.) Of course, generally only the total energy of the system is
conserved, including the potential energy of particle interactions. However, at typical high-energy
33 Please note one more simple and useful relation following from Eqs. (70) and (73): p = (E/c^2)u.
34 It may be tempting to interpret this relation as the perpendicular-vector-like addition of the rest energy mc^2 and
the “kinetic energy” pc, but from the point of view of the total energy conservation (see below), a better definition
of the kinetic energy is T(u) ≡ E(u) – E(0).
35 It is briefly reviewed in QM Chapter 9.
particle collisions, the potential energy vanishes so rapidly with the distance between them that we can
use the momentum and energy conservation laws using Eq. (73).
As an example, let us calculate the minimum energy Emin of a proton (pa), necessary for the well-known
high-energy reaction that generates a new proton-antiproton pair, pa + pb → p + p + p + p̄,
provided that before the collision, proton pb had been at rest in the lab frame. This minimum
corresponds to the vanishing relative velocity of the reaction products, i.e. their motion with virtually
the same velocity (ufin), as seen from the lab frame – see Fig. 10.
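Due to the momentum conservation, this velocity should have the same direction as the initial
velocity (umin) of proton pa. This is why two scalar equations: for energy conservation,
$$\frac{mc^2}{(1 - u_{\min}^2/c^2)^{1/2}} + mc^2 = \frac{4mc^2}{(1 - u_{\mathrm{fin}}^2/c^2)^{1/2}}, \tag{9.80a}$$
and for momentum conservation,
$$\frac{mu_{\min}}{(1 - u_{\min}^2/c^2)^{1/2}} + 0 = \frac{4mu_{\mathrm{fin}}}{(1 - u_{\mathrm{fin}}^2/c^2)^{1/2}}, \tag{9.80b}$$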
are sufficient to find both umin and ufin. After a rather tedious solution of this system of two nonlinear
equations, we get
$$u_{\min} = \frac{4\sqrt{3}}{7}\,c \approx 0.990\,c, \qquad u_{\mathrm{fin}} = \frac{\sqrt{3}}{2}\,c \approx 0.866\,c. \tag{9.81}$$
Finally, we can use Eq. (72) to calculate the required energy; the result is Emin = 7mc^2. (Note that at this
threshold, only a minor 2mc^2 part of the kinetic energy Tmin = Emin – mc^2 = 6mc^2 of the initially moving
particle goes into the “useful” proton-antiproton pair production.) The proton’s rest mass, mp ≈ 1.67×10^-27 kg,
corresponds to mpc^2 ≈ 1.502×10^-10 J ≈ 0.938 GeV, so Emin ≈ 6.57 GeV.
The second, more intelligent way to solve the same problem is to use the center-of-mass (c.o.m.)
reference frame that, in relativity, is defined as the frame in which the total momentum of the system
vanishes.36 In this frame, at E = Emin, the velocity and momenta of all reaction products are vanishing,
while the velocities of the protons pa and pb before the collision are equal and opposite, with an initially
unknown magnitude u’. Hence the energy conservation law becomes
$$\frac{2mc^2}{(1 - u'^2/c^2)^{1/2}} = 4mc^2, \tag{9.82}$$
readily giving u’/c = √3/2. (This is of course the same result as Eq. (81) gives for ufin.) Now we can use
the fact that the velocity of the proton pa in the c.o.m. frame is (–u’), to find its lab-frame speed, using
the velocity transform (25):
$$u_{\min} = \frac{2u'}{1 + u'^2/c^2}. \tag{9.83}$$
With the above result for u’, this relation gives the same result as the first method, umin/c = 4√3/7, but in
a simpler way.
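The arithmetic of this second method takes just a few lines of code. The sketch below follows Eqs. (82) and (83) in units of c and mc^2; the variable names are assumptions of the example.

```python
import math


def gamma_of(beta):
    return 1.0 / math.sqrt(1.0 - beta**2)


# Eq. (82): in the c.o.m. frame, 2 * gamma' * m c^2 = 4 m c^2, so gamma' = 2.
gamma_com = 2.0
beta_com = math.sqrt(1.0 - 1.0 / gamma_com**2)        # u'/c

# Eq. (83): back to the lab frame with the longitudinal velocity addition (25).
beta_min = 2.0 * beta_com / (1.0 + beta_com**2)       # u_min / c
e_min_over_mc2 = gamma_of(beta_min)                   # E_min / (m c^2)

PROTON_REST_ENERGY_GEV = 0.938
print(beta_com, beta_min, e_min_over_mc2 * PROTON_REST_ENERGY_GEV)
```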
$$A^\alpha A_\alpha \equiv A_\alpha A^\alpha = A_0^2 - A^2. \tag{9.86}$$
Note that the first and the second expressions may be understood as sums over four components of the
product, with the summation sign dropped.37 The scalar product (86) is just the norm of the 4-vector in
our former definition, and as we already know, is Lorentz-invariant. Moreover, the scalar product of two
different vectors (also a Lorentz invariant), may be rewritten in any of two similar forms:38
Scalar product’s forms:
$$A_0 B_0 - \mathbf{A}\cdot\mathbf{B} = A^\alpha B_\alpha = A_\alpha B^\alpha; \tag{9.87}$$
again, the only caveat is to take one vector in the covariant, and the other one in the contravariant form.
Now let us return to our sample problem (Fig. 10). Since all components (E/c and p) of the total
4-momentum of our system are conserved at the collision, its norm is conserved as well:
$$(p_a + p_b)^\alpha (p_a + p_b)_\alpha = (4p)^\alpha (4p)_\alpha. \tag{9.88}$$
37 This compact notation may take some time to get accustomed to, but is very convenient and can
hardly lead to any confusion, due to the following rule: the summation is implied when, and only when the same
index is repeated twice, once on the top and another at the bottom. (It is frequently called dummy index, because
its notation may be replaced with any other letter not used in the same formula.) In this course, this shorthand
notation will be used only for 4-vectors, but not for the usual (3D spatial) vectors.
38 Note also that, by definition, for any two 4-vectors, A^α B_α = B^α A_α.
Since now the vector product is the usual math construct, we know that the parentheses on the left-hand
side of this equation may be multiplied as usual. We may also swap the operands and move constant
factors through products as convenient. As a result, we get
$$(p_a)^\alpha (p_a)_\alpha + (p_b)^\alpha (p_b)_\alpha + 2(p_a)^\alpha (p_b)_\alpha = 16\,p^\alpha p_\alpha. \tag{9.89}$$
Thanks to the Lorentz invariance of each of the terms, we may calculate each of them in any reference
frame we like. For the first two terms on the left-hand side, as well as for the right-hand side term, it is
beneficial to use the frames in which that particular proton is at rest; as a result, according to Eq. (77b),
each of the two left-hand-side terms equals (mc)2, while the right-hand side equals 16(mc)2. On the
contrary, the last term on the left-hand side is more easily evaluated in the lab frame, because in it, the
three spatial components of the 4-momentum pb vanish, and the scalar product is just the product of the
scalars E/c for protons a and b. For the latter proton, being at rest, this ratio is just mc, so we get a simple
equation,
$$(mc)^2 + (mc)^2 + 2\,\frac{E_{\min}}{c}\,mc = 16(mc)^2, \tag{9.90}$$
immediately giving the final result Emin = 7mc^2, already obtained earlier in two more complex ways.
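The same invariant-norm argument works for any fixed-target reaction, which makes it natural to wrap it in a small function. The sketch below generalizes Eq. (90) to arbitrary rest masses; this generalization, the function name, and the use of units with c = 1 are assumptions of the example and are not part of the text.

```python
def threshold_energy(m_a, m_b, final_masses):
    """Minimum total energy (in units where c = 1) of projectile a hitting target b at rest.

    Generalizes Eq. (90): m_a^2 + m_b^2 + 2 * E_min * m_b = (sum of final rest masses)^2,
    the right-hand side being the norm of the total 4-momentum at threshold.
    """
    total = sum(final_masses)
    return (total**2 - m_a**2 - m_b**2) / (2.0 * m_b)


M_P = 0.938   # proton rest energy in GeV

# p + p -> p + p + p + anti-p: four final particles, each of the proton's mass.
e_min = threshold_energy(M_P, M_P, [M_P] * 4)
print(e_min / M_P, e_min)
```

For equal masses, the function returns (16 – 2)m/2 = 7m, i.e. the result obtained above.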
Let me hope that this example was a convincing demonstration of the convenience of
representing 4-vectors in the contravariant (84) and covariant (85) forms,39 with Lorentz-invariant
norms (86). To be useful for more complex tasks, this formalism should be developed a little bit further.
In particular, it is crucial to know how the 4-vectors change under the Lorentz transform. For
contravariant vectors, we already know the answer (54); let us rewrite it in our new notation:
Lorentz transform, contravariant vectors:
$$A^\alpha = L^\alpha_{\ \beta} A'^\beta, \tag{9.91}$$
where L^α_β is the matrix (51), generally called the mixed Lorentz tensor:40
$$L^\alpha_{\ \beta} = \begin{pmatrix} \gamma & \beta\gamma & 0 & 0 \\ \beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{9.92}$$
Note that though the position of the indices α and β in the Lorentz tensor notation is not crucial, because
this tensor is symmetric, it is convenient to place them using the general index balance rule: the
difference of the numbers of the upper and lower indices should be the same in both parts of any 4-
vector/tensor equality. (You may check that all the formulas above do satisfy this rule.)
39 These forms are 4-vector extensions of the notions of contravariance and covariance, introduced in the 1850s
by J. Sylvester (who also introduced the term “matrix” in its mathematical sense) for the description of the change
of the usual 3-component spatial vectors at the transfer between different reference frames – e.g., resulting from
the frame rotation. In this case, the contravariance or covariance of a vector is uniquely determined by its nature:
if the Cartesian coordinates of a vector (such as the non-relativistic velocity v = dr/dt) are transformed similarly to
the radius-vector r, it is called contravariant, while the vectors (such as ∇f) that require the reciprocal transform,
are called covariant. In the 4D Minkowski space, both forms may be used for any 4-vector.
40 Just as the 4-vectors, 4-tensors with two top indices are called contravariant, and those with two bottom indices,
are covariant. The tensors with one top and one bottom index are called mixed.
In order to rewrite Eq. (91) in a more general form that would not depend on the particular
orientation of the coordinate axes (Fig. 1), let us use the contravariant and covariant forms of the 4-
vector of the time-space interval (57),
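41 Another way to write this relation is $(ds)^2 = g_{\alpha\beta}\,dx^\alpha dx^\beta = g^{\alpha\beta}\,dx_\alpha dx_\beta$, where double summation over indices $\alpha$ and $\beta$ is implied, and $g$ is the so-called metric tensor,
$$g^{\alpha\beta} \equiv g_{\alpha\beta} \equiv \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix},$$
which may be used, in particular, to transfer a covariant vector into the corresponding contravariant one and back: $A^\alpha = g^{\alpha\beta}A_\beta$, $A_\alpha = g_{\alpha\beta}A^\beta$. The metric tensor plays a key role in general relativity, in which it is affected by gravity – “curved” by particles’ masses.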
42 Note that in the index balance rule, the top index in the denominator of a fraction is counted as a bottom index
in the numerator, and vice versa.
$$\frac{\partial x'^\alpha}{\partial x^\beta} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{9.98}$$
Since according to Eqs. (84)-(85), covariant 4-vectors differ from the contravariant ones by the sign of
their spatial components, their direct transform is given by matrix (98). Hence their direct and reciprocal
transforms may be represented, respectively, as
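Lorentz transform: covariant vectors
$$A_\alpha = \frac{\partial x'^\beta}{\partial x^\alpha}A'_\beta, \qquad A'_\alpha = \frac{\partial x^\beta}{\partial x'^\alpha}A_\beta, \tag{9.99}$$
evidently satisfying the index balance rule. (Note that primed quantities are now multiplied, rather than divided as in the contravariant case.) As a sanity check, let us apply this formalism to the scalar product $A_\alpha A^\alpha$. As Eq. (96) shows, the implicit-sum notation allows us to multiply and divide any equality by the same partial differential of a coordinate, so we can write:
$$A_\alpha A^\alpha = \frac{\partial x'^\beta}{\partial x^\alpha}A'_\beta\,\frac{\partial x^\alpha}{\partial x'^\gamma}A'^\gamma = \frac{\partial x'^\beta}{\partial x'^\gamma}A'_\beta A'^\gamma = \delta^\beta_{\ \gamma}A'_\beta A'^\gamma = A'_\gamma A'^\gamma, \tag{9.100}$$
i.e. the scalar product $A_\alpha A^\alpha$ (as well as $A^\alpha A_\alpha$) is Lorentz-invariant, as it should be.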
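The same sanity check may be repeated numerically. In the sketch below (an illustration with arbitrary component values, β = 0.8, and hypothetical helper names), the contravariant components are transformed with the reciprocal matrix (98), and the covariant ones with the matrix (92), as prescribed by the second of Eqs. (99).

```python
import math

def boost(beta):
    """Symmetric boost matrix along x; beta > 0 gives (9.92), beta < 0 gives (9.98)."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[g, beta * g, 0.0, 0.0], [beta * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def apply(m, vec):
    """Multiply a 4x4 matrix by a 4-component vector."""
    return [sum(m[i][k] * vec[k] for k in range(4)) for i in range(4)]

def lower(vec):
    """Covariant components: flip the sign of the spatial part."""
    return [vec[0], -vec[1], -vec[2], -vec[3]]

beta = 0.8
A_up = [2.0, 0.3, -1.1, 0.7]             # arbitrary contravariant components in frame 0
A_dn = lower(A_up)

A_up_primed = apply(boost(-beta), A_up)  # reciprocal transform, matrix (9.98)
A_dn_primed = apply(boost(+beta), A_dn)  # second of Eqs. (9.99)

s = sum(a * b for a, b in zip(A_dn, A_up))
s_primed = sum(a * b for a, b in zip(A_dn_primed, A_up_primed))
```

The two scalar products `s` and `s_primed` should coincide, and `A_dn_primed` should differ from `A_up_primed` only by the sign of the spatial components.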
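Now, let us consider the 4-vectors of derivatives. Here we should be very careful. Consider, for example, the following 4-vector operator
$$\frac{\partial}{\partial x^\alpha} \equiv \left\{\frac{\partial}{\partial(ct)},\ \nabla\right\}. \tag{9.101}$$
As was discussed above, the operator is not changed by its multiplication and division by another differential, e.g., $\partial x'^\beta$ (with the corresponding implied summation over all four values of $\beta$), so
$$\frac{\partial}{\partial x^\alpha} = \frac{\partial x'^\beta}{\partial x^\alpha}\frac{\partial}{\partial x'^\beta}. \tag{9.102}$$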
But, according to the first of Eqs. (99), this is exactly how the covariant vectors are Lorentz-
transformed! Hence, we have to consider the derivative over a contravariant space-time interval as a
covariant 4-vector, and vice versa.43 (This result might be also expected from the index balance rule.) In
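particular, this means that the scalar product
$$\frac{\partial}{\partial x^\alpha}A^\alpha = \frac{\partial A^0}{\partial(ct)} + \nabla\cdot\mathbf{A} \tag{9.103}$$
should be Lorentz-invariant for any legitimate 4-vector. A convenient shorthand for the covariant derivative, which complies with the index balance rule, is
$$\partial_\alpha \equiv \frac{\partial}{\partial x^\alpha}, \tag{9.104}$$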
43 As was mentioned above, this is also a property of the reference-frame transform of the “usual” 3D vectors.
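so the invariant scalar product may be written just as $\partial_\alpha A^\alpha$. A similar definition of the contravariant derivative,
$$\partial^\alpha \equiv \frac{\partial}{\partial x_\alpha} = \left\{\frac{\partial}{\partial(ct)},\ -\nabla\right\}, \tag{9.105}$$
allows us to write the Lorentz-invariant scalar product (103) in any of the following two forms:
$$\frac{\partial A^0}{\partial(ct)} + \nabla\cdot\mathbf{A} = \partial^\alpha A_\alpha = \partial_\alpha A^\alpha. \tag{9.106}$$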
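Finally, let us see how the general Lorentz transform changes 4-tensors. A second-rank 4×4 matrix is a legitimate 4-tensor if the 4-vectors it relates obey the Lorentz transform. For example, if two legitimate 4-vectors are related as
$$A^\alpha = T^{\alpha\beta}B_\beta, \tag{9.107}$$
we should require that
$$A'^\alpha = T'^{\alpha\beta}B'_\beta, \tag{9.108}$$
where $A^\alpha$ and $A'^\alpha$ are related by Eqs. (97), while $B_\beta$ and $B'_\beta$, by Eqs. (99). This requirement immediately yields
Lorentz transform of 4-tensors
$$T^{\alpha\beta} = \frac{\partial x^\alpha}{\partial x'^\gamma}\frac{\partial x^\beta}{\partial x'^\delta}T'^{\gamma\delta}, \qquad T'^{\alpha\beta} = \frac{\partial x'^\alpha}{\partial x^\gamma}\frac{\partial x'^\beta}{\partial x^\delta}T^{\gamma\delta}, \tag{9.109}$$
with the implied summation over two indices, $\gamma$ and $\delta$. The rules for the covariant and mixed tensors are similar.44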
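4-vector of electric current
$$j^\alpha \equiv \{\rho c,\ \mathbf{j}\}, \qquad j_\alpha \equiv \{\rho c,\ -\mathbf{j}\}, \tag{9.111}$$
then Eq. (110) may be represented in the form
Continuity equation: 4-form
$$\partial_\alpha j^\alpha = \partial^\alpha j_\alpha = 0, \tag{9.112}$$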
showing that the continuity equation is form-invariant45 with respect to the Lorentz transform.
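44 It is straightforward to check that transfer between the contravariant and covariant forms of the same tensor may be readily achieved using the metric tensor $g$: $T_{\alpha\beta} = g_{\alpha\gamma}T^{\gamma\delta}g_{\delta\beta}$, $T^{\alpha\beta} = g^{\alpha\gamma}T_{\gamma\delta}g^{\delta\beta}$.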
45 In some texts, the formulas preserving their form at a transform are called “covariant”, creating a possibility for
confusion with the covariant vectors and tensors. On the other hand, calling such formulas “invariant” would not
distinguish them properly from invariant quantities, such as the scalar products of 4-vectors.
Of course, such a form-invariance of a relation does not mean that all component values of the 4-vectors participating in it are the same in both frames. For example, let us have some static charge density $\rho$ in frame 0; then Eq. (97b), applied to the contravariant form of the 4-vector (111), reads
$$j'^\alpha = \frac{\partial x'^\alpha}{\partial x^\beta}j^\beta, \quad \text{with } j^\beta = \{\rho c,\ 0,\ 0,\ 0\}. \tag{9.113}$$
Using the particular form (98) of the reciprocal Lorentz matrix for the coordinate choice shown in Fig. 1, we see that this relation yields
Lorentz transforms of $\rho$ and $\mathbf{j}$
$$\rho' = \gamma\rho, \qquad j'_x = -\gamma\beta\rho c = -\gamma v\rho, \qquad j'_y = j'_z = 0. \tag{9.114}$$
Since the charge velocity, as observed from frame 0’, is (–v), the non-relativistic results would be $\rho' = \rho$, $\mathbf{j}' = -\mathbf{v}\rho$. The additional $\gamma$ factor in the relativistic results is caused by the length contraction: $dx' = dx/\gamma$, so to keep the total charge $dQ = \rho\,d^3r = \rho\,dx\,dy\,dz$ inside the elementary volume $d^3r = dx\,dy\,dz$ intact, $\rho$ (and hence $j_x$) should increase proportionally.
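Relations (114) may be illustrated by a short calculation. This sketch applies the first two rows of the matrix (98) to the 4-current of a static charge; the numerical values of ρ and v, and the function name `transform_static_charge`, are illustrative assumptions only.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def transform_static_charge(rho, v):
    """Apply matrix (9.98) to j = {rho*c, 0, 0, 0}; returns (rho', j'_x)."""
    beta = v / C
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    j0, j1 = rho * C, 0.0
    j0_primed = gamma * (j0 - beta * j1)   # time-like component, rho' * c
    j1_primed = gamma * (j1 - beta * j0)   # x-component of the current density
    return j0_primed / C, j1_primed

rho = 1.0e-6        # C/m^3, illustrative value
v = 0.6 * C         # relative velocity of the frames
rho_primed, jx_primed = transform_static_charge(rho, v)
gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
print(rho_primed / rho, jx_primed / (v * rho))
```

The two printed ratios should equal γ and −γ, respectively, in agreement with Eq. (114).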
Next, at the end of Chapter 6 we have seen that Maxwell equations for the electromagnetic potentials $\phi$ and $\mathbf{A}$ may be represented in similar forms (6.118), under the Lorenz (again, not “Lorentz”, please!) gauge condition (6.117). For free space, this condition takes the form
$$\nabla\cdot\mathbf{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t} = 0. \tag{9.115}$$
This expression gives us a hint of how to form the 4-vector of electromagnetic potentials:46
4-vector of potentials
$$A^\alpha \equiv \left\{\frac{\phi}{c},\ \mathbf{A}\right\}, \qquad A_\alpha \equiv \left\{\frac{\phi}{c},\ -\mathbf{A}\right\}; \tag{9.116}$$
indeed, this vector satisfies Eq. (115) in its 4-form:
Lorenz gauge: 4-form
$$\partial_\alpha A^\alpha = \partial^\alpha A_\alpha = 0. \tag{9.117}$$
Since this scalar product is Lorentz-invariant, and the derivatives (104)-(105) are legitimate 4-
vectors, this implies that the 4-vector (116) is also legitimate, i.e. obeys the Lorentz transform formulas
(97), (99). Even more convincing evidence of this fact may be obtained from the Maxwell equations
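(6.118) for the potentials. In free space, they may be rewritten as
$$\left[\frac{\partial^2}{\partial(ct)^2} - \nabla^2\right]\frac{\phi}{c} = \frac{\rho c}{\varepsilon_0 c^2} \equiv \mu_0(\rho c), \qquad \left[\frac{\partial^2}{\partial(ct)^2} - \nabla^2\right]\mathbf{A} = \mu_0\mathbf{j}. \tag{9.118}$$
Using the definition (116), these equations may be merged into one:47
Maxwell equation for 4-potential
$$\Box A^\alpha = \mu_0 j^\alpha, \tag{9.119}$$
where $\Box$ is the d’Alembert operator,48 which may be represented as either of two scalar products: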
46 In the Gaussian units, the scalar potential should not be divided by c in this relation.
47 In the Gaussian units, the coefficient $\mu_0$ in Eq. (119) should be replaced, as usual, with $4\pi/c$.
D’Alembert operator
$$\Box \equiv \frac{\partial^2}{\partial(ct)^2} - \nabla^2 = \partial^\beta\partial_\beta = \partial_\beta\partial^\beta, \tag{9.120}$$
and hence is Lorentz-invariant. Because of that, and the fact that the Lorentz transform changes both 4-vectors $A^\alpha$ and $j^\alpha$ in a similar way, Eq. (119) does not depend on the reference frame choice. Thus we
have arrived at a key point of this chapter: we see that the Maxwell equations are indeed form-invariant
with respect to the Lorentz transform. As a by-product, the 4-vector form (119) of these equations (for
potentials) is extremely simple – and beautiful!
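However, as we have seen in Chapter 7, for many applications the Maxwell equations for the field vectors are more convenient; so let us represent them in the 4-form as well. For that, we may express all Cartesian components of the usual (3D) field vectors (6.7),
$$\mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla\times\mathbf{A}, \tag{9.121}$$
via those of the potential 4-vector $A^\alpha$. For example,
$$E_x = -\frac{\partial\phi}{\partial x} - \frac{\partial A_x}{\partial t} = -c\left[\frac{\partial}{\partial x}\frac{\phi}{c} + \frac{\partial A_x}{\partial(ct)}\right] \equiv -c\left(\partial^0 A^1 - \partial^1 A^0\right), \tag{9.122}$$
$$B_x = \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z} \equiv -\left(\partial^2 A^3 - \partial^3 A^2\right). \tag{9.123}$$
Completing similar calculations for other field components (or just generating them by appropriate index shifts), we find that the following antisymmetric, contravariant field-strength tensor,
$$F^{\alpha\beta} \equiv \partial^\alpha A^\beta - \partial^\beta A^\alpha, \tag{9.124}$$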
48 Named after Jean-Baptiste le Rond d’Alembert (1717-1783), who has made several pioneering contributions to
the general theory of waves – see, e.g., CM Chapter 6. (Some older textbooks use notation $\Box^2$ for this operator.)
49 In Gaussian units, this formula, as well as Eq. (131) for $G^{\alpha\beta}$, do not have the factor c in all the denominators.
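If Eq. (124) looks a bit too bulky, please note that as a reward, the pair of inhomogeneous Maxwell equations, i.e. two equations of the system (6.99), which in free space ($\mathbf{D} = \varepsilon_0\mathbf{E}$, $\mathbf{B} = \mu_0\mathbf{H}$) may be rewritten as
$$\nabla\cdot\frac{\mathbf{E}}{c} = \mu_0 c\rho, \qquad \nabla\times\mathbf{B} - \frac{\partial}{\partial(ct)}\frac{\mathbf{E}}{c} = \mu_0\mathbf{j}, \tag{9.126}$$
may now be expressed in a very simple (and manifestly form-invariant) way,
Maxwell equation for tensor F
$$\partial_\alpha F^{\alpha\beta} = \mu_0 j^\beta, \tag{9.127}$$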
which is comparable with Eq. (119) in its simplicity – and beauty. Somewhat counter-intuitively, the
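pair of homogeneous Maxwell equations of the system (6.99),
$$\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0, \qquad \nabla\cdot\mathbf{B} = 0, \tag{9.128}$$
look, in the 4-vector notation, a bit more complicated:50
$$\partial_\alpha F_{\beta\gamma} + \partial_\beta F_{\gamma\alpha} + \partial_\gamma F_{\alpha\beta} = 0. \tag{9.129}$$
Note, however, that Eqs. (128) may be also represented in a much simpler 4-form,
$$\partial_\alpha G^{\alpha\beta} = 0, \tag{9.130}$$
using the so-called dual tensor
$$G^{\alpha\beta} = \begin{pmatrix} 0 & B_x & B_y & B_z \\ -B_x & 0 & -E_z/c & E_y/c \\ -B_y & E_z/c & 0 & -E_x/c \\ -B_z & -E_y/c & E_x/c & 0 \end{pmatrix}, \tag{9.131}$$
which may be obtained from $F^{\alpha\beta}$, given by Eq. (125a), by the following replacements:
$$\frac{\mathbf{E}}{c} \to -\mathbf{B}, \qquad \mathbf{B} \to \frac{\mathbf{E}}{c}. \tag{9.132}$$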
Besides the proof of the form-invariance of the Maxwell equations with respect to the Lorentz
transform, the 4-vector formalism allows us to achieve our initial goal: to find out how the electric and
magnetic field components change at the transfer between two (inertial!) reference frames. For that, let
us apply to the tensor $F^{\alpha\beta}$ the reciprocal Lorentz transform described by the second of Eqs. (109). Generally, it gives, for each field component, a sum of 16 terms, but since (for our choice of coordinates, shown in Fig. 1) there are many zeros in the Lorentz transform matrix, and the diagonal components of $F^{\gamma\delta}$ equal zero as well, the calculations are rather doable. Let us calculate, for example, $E'_x \equiv -cF'^{01}$. The only non-zero terms on the right-hand side are
$$E'_x \equiv -cF'^{01} = -c\left[\frac{\partial x'^0}{\partial x^1}\frac{\partial x'^1}{\partial x^0}F^{10} + \frac{\partial x'^0}{\partial x^0}\frac{\partial x'^1}{\partial x^1}F^{01}\right] = -c\,\gamma^2\left(\beta^2 - 1\right)\frac{E_x}{c} \equiv E_x. \tag{9.133}$$
50 To be fair, note that just as Eq. (127), Eq. (129) is also a set of four scalar equations – in the latter case with the indices $\alpha$, $\beta$, and $\gamma$ taking any three different values of the set {0, 1, 2, 3}.
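Repeating the calculation for the other five components of the fields, we get very important relations
$$\begin{aligned} E'_x &= E_x, & B'_x &= B_x, \\ E'_y &= \gamma\left(E_y - vB_z\right), & B'_y &= \gamma\left(B_y + vE_z/c^2\right), \\ E'_z &= \gamma\left(E_z + vB_y\right), & B'_z &= \gamma\left(B_z - vE_y/c^2\right), \end{aligned} \tag{9.134}$$
whose more compact “semi-vector” form is
Lorentz transform of field components
$$\begin{aligned} \mathbf{E}'_\parallel &= \mathbf{E}_\parallel, & \mathbf{B}'_\parallel &= \mathbf{B}_\parallel, \\ \mathbf{E}'_\perp &= \gamma\left(\mathbf{E} + \mathbf{v}\times\mathbf{B}\right)_\perp, & \mathbf{B}'_\perp &= \gamma\left(\mathbf{B} - \mathbf{v}\times\mathbf{E}/c^2\right)_\perp, \end{aligned} \tag{9.135}$$
where the indices $\parallel$ and $\perp$ stand, respectively, for the field components parallel and normal to the relative velocity $\mathbf{v}$ of the two reference frames. In the non-relativistic limit, the Lorentz factor $\gamma$ tends to 1, and Eqs. (135) acquire an even simpler form
$$\mathbf{E}' = \mathbf{E} + \mathbf{v}\times\mathbf{B}, \qquad \mathbf{B}' = \mathbf{B} - \frac{1}{c^2}\,\mathbf{v}\times\mathbf{E}. \tag{9.136}$$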
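The component relations (134) may be cross-checked against the tensor rule (109) numerically. The sketch below is only an illustration: it assumes units with c = 1, arbitrary field values, and the component layout of $F^{\alpha\beta}$ that follows from Eqs. (122)-(124), namely $F^{01} = -E_x/c$, $F^{23} = -B_x$, and their cyclic counterparts. All function names are hypothetical.

```python
import math

def field_tensor(E, B, c=1.0):
    """Contravariant tensor F with F[0][1] = -E_x/c and F[2][3] = -B_x (assumed layout)."""
    Ex, Ey, Ez = (e / c for e in E)
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx],
            [Ez, -By, Bx, 0.0]]

def transform(F, beta):
    """Second of Eqs. (9.109), using the reciprocal matrix (9.98)."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    M = [[g, -beta * g, 0.0, 0.0], [-beta * g, g, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    return [[sum(M[a][i] * M[b][j] * F[i][j] for i in range(4) for j in range(4))
             for b in range(4)] for a in range(4)]

def fields_from_tensor(F, c=1.0):
    """Read the 3D field components back from the tensor."""
    E = (-c * F[0][1], -c * F[0][2], -c * F[0][3])
    B = (-F[2][3], -F[3][1], -F[1][2])
    return E, B

def eq_9_134(E, B, beta, c=1.0):
    """Direct use of the component relations (9.134)."""
    g, v = 1.0 / math.sqrt(1.0 - beta ** 2), beta * c
    Ep = (E[0], g * (E[1] - v * B[2]), g * (E[2] + v * B[1]))
    Bp = (B[0], g * (B[1] + v * E[2] / c ** 2), g * (B[2] - v * E[1] / c ** 2))
    return Ep, Bp

E, B, beta = (1.0, 2.0, -0.5), (0.3, -0.7, 1.2), 0.6   # arbitrary test values
E_t, B_t = fields_from_tensor(transform(field_tensor(E, B), beta))
E_f, B_f = eq_9_134(E, B, beta)
```

The pairs (`E_t`, `B_t`) and (`E_f`, `B_f`) should agree component by component.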
Thus we see that the electric and magnetic fields are transformed to each other even in the first
order of the v/c ratio. For example, if we fly across the field lines of a uniform, static, purely electric
field E (e.g., the one in a plane capacitor) we will see not only the electric field’s renormalization (in the
second order of the v/c ratio), but also a non-zero dc magnetic field B’ perpendicular to both the vector
E and the vector v, i.e. to the direction of our motion. This is of course what might be expected from the
relativity principle: from the point of view of the moving observer (which is as legitimate as that of a
stationary observer), the surface charges of the capacitor’s plates, that create the field E, move back
creating the dc currents (114), which induce the magnetic field B’. Similarly, motion across a magnetic
field creates, from the point of view of the moving observer, an electric field.
This fact is very important conceptually. One may say there is no such thing in Mother Nature as
an electric field (or a magnetic field) all by itself. Not only can the electric field induce the magnetic
field (and vice versa) in dynamics, but even in an apparently static configuration, what exactly we
measure depends on our speed relative to the field sources – justifying once again the term
electromagnetism for the field of physics we are studying in this course.
Another simple but very important application of Eqs. (134)-(135) is the calculation of the fields
created by a charged particle moving in free space by inertia, i.e. along a straight line with constant
velocity u, at the impact parameter51 (the closest distance) b from the observer. Selecting the reference
frame 0’ to move with the particle in its origin, and the reference frame 0 to reside in the “lab” in which
the fields E and B are measured, we can use the above formulas with v = u. In this case, the fields E’ and B’ may be calculated from, respectively, electro- and magnetostatics:
$$\mathbf{E}' = \frac{q}{4\pi\varepsilon_0}\frac{\mathbf{r}'}{r'^3}, \qquad \mathbf{B}' = 0, \tag{9.137}$$
because in frame 0’, the particle does not move. Selecting the coordinate axes so that at the measurement point, x = 0, y = b, z = 0 (Fig. 11a), for this point we may write $x' = -ut'$, $y' = b$, $z' = 0$, so $r' = (u^2t'^2 + b^2)^{1/2}$, and the Cartesian components of the fields (137) are:
51 This term is very popular in the theory of particle scattering – see, e.g., CM Sec. 3.7.
$$E'_x = -\frac{q}{4\pi\varepsilon_0}\frac{ut'}{\left(u^2t'^2 + b^2\right)^{3/2}}, \qquad E'_y = \frac{q}{4\pi\varepsilon_0}\frac{b}{\left(u^2t'^2 + b^2\right)^{3/2}}, \qquad E'_z = 0, \qquad B'_x = B'_y = B'_z = 0. \tag{9.138}$$
[Fig. 9.11. The field pulses induced by a uniformly moving charge: (a) the geometry of the problem, and (b) the time dependence of the field components $E_x$, $E_y$, and $B_z$.]
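Now using the last of Eqs. (19b) with x = 0, giving $t' = \gamma t$, and the relations reciprocal to Eqs. (134) for the field transform (they are similar to the direct transform but with v replaced with –v = –u), in the lab frame we get
$$E_x = E'_x = -\frac{q}{4\pi\varepsilon_0}\frac{u\gamma t}{\left(u^2\gamma^2t^2 + b^2\right)^{3/2}}, \qquad E_y = \gamma E'_y = \frac{q}{4\pi\varepsilon_0}\frac{\gamma b}{\left(u^2\gamma^2t^2 + b^2\right)^{3/2}}, \qquad E_z = 0, \tag{9.139}$$
$$B_x = 0, \qquad B_y = 0, \qquad B_z = \frac{\gamma u}{c^2}E'_y = \frac{u}{c^2}\frac{q}{4\pi\varepsilon_0}\frac{\gamma b}{\left(u^2\gamma^2t^2 + b^2\right)^{3/2}} \equiv \frac{u}{c^2}E_y. \tag{9.140}$$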
These results,52 plotted in Fig. 11b in the units of $\gamma q/4\pi\varepsilon_0 b^2$, reveal two major effects. First, the
charge passage by the observer generates not only an electric field pulse but also a magnetic field pulse.
This is natural, because, as was repeatedly discussed in Chapter 5, any charge motion is essentially an
electric current.53 Second, Eqs. (139)-(140) show that the pulse duration scale is
$$\Delta t \sim \frac{b}{\gamma u} = \frac{b}{u}\left(1 - \frac{u^2}{c^2}\right)^{1/2}, \tag{9.141}$$
i.e. shrinks to virtually zero as the charge’s velocity u approaches the speed of light. This is of course a direct corollary of the relativistic length contraction. Indeed, in the frame 0’ moving with the charge, the longitudinal spread of its electric field at distance b from the motion line is of the order of $\Delta x' = b$. When observed from the lab frame 0, this interval, in accordance with Eq. (20), shrinks to $\Delta x = \Delta x'/\gamma = b/\gamma$, and hence so does the pulse duration scale $\Delta t = \Delta x/u = b/\gamma u$.
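To get a feeling for these field pulses, Eqs. (139)-(141) may be evaluated directly. In the sketch below, the charge (one elementary charge), the impact parameter (1 nm), and the three values of β are illustrative assumptions, and the function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12   # electric constant, F/m
C = 299_792_458.0         # speed of light, m/s

def lab_fields(q, u, b, t):
    """Eqs. (9.139)-(9.140): returns (E_x, E_y, B_z) at the point x = 0, y = b, z = 0."""
    gamma = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    k = q / (4.0 * math.pi * EPS0)
    denom = (u ** 2 * gamma ** 2 * t ** 2 + b ** 2) ** 1.5
    Ex = -k * u * gamma * t / denom
    Ey = k * gamma * b / denom
    Bz = (u / C ** 2) * Ey
    return Ex, Ey, Bz

def pulse_scale(u, b):
    """Eq. (9.141): the time scale b / (gamma * u)."""
    return (b / u) * math.sqrt(1.0 - (u / C) ** 2)

q, b = 1.602176634e-19, 1.0e-9    # illustrative: elementary charge, 1 nm
ratios = []
for beta in (0.1, 0.9, 0.99):
    u = beta * C
    dt = pulse_scale(u, b)
    _, Ey_peak, _ = lab_fields(q, u, b, 0.0)
    _, Ey_dt, _ = lab_fields(q, u, b, dt)
    ratios.append(Ey_dt / Ey_peak)
    print(f"beta = {beta}: dt = {dt:.3e} s, E_y(dt)/E_y(0) = {Ey_dt / Ey_peak:.4f}")
```

At t = Δt the transverse field drops to the same fraction, 2^(−3/2), of its peak value for any velocity, which confirms that b/γu is indeed the natural time scale of the pulse.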
$$\frac{dp^1}{d\tau} = qF^{1\beta}u_\beta = q\left[\frac{E_x}{c}\,\gamma c + 0\cdot(-\gamma u_x) + (-B_z)(-\gamma u_y) + B_y(-\gamma u_z)\right] = q\gamma\left[\mathbf{E} + \mathbf{u}\times\mathbf{B}\right]_x, \tag{9.146}$$
and similarly for two other spatial components ($\alpha = 2$ and $\alpha = 3$). It may seem that these expressions differ from the 2nd Newton law (144) by an extra factor of $\gamma$. However, plugging into Eq. (146) the definition of the proper time interval, $d\tau = dt/\gamma$, and canceling $\gamma$ in both parts, we recover Eq. (144) exactly – for any velocity of the particle! The only caveat is that if u is comparable with c, the vector p in Eq. (144) has to be understood as the relativistic momentum (70), proportional to the velocity-dependent mass $M = \gamma m \geq m$ rather than to the rest mass m.
The only remaining general task is to examine the meaning of the 0th component of Eq. (145). Let us spell it out:
$$\frac{dp^0}{d\tau} = qF^{0\beta}u_\beta = q\left[0\cdot\gamma c - \frac{E_x}{c}(-\gamma u_x) - \frac{E_y}{c}(-\gamma u_y) - \frac{E_z}{c}(-\gamma u_z)\right] = q\gamma\,\frac{\mathbf{E}}{c}\cdot\mathbf{u}. \tag{9.147}$$
Recalling that $p^0 = \mathscr{E}/c$, and using the basic relation $d\tau = dt/\gamma$ again, we see that Eq. (147) looks exactly like the non-relativistic relation for the kinetic energy change (what is sometimes called the work-energy principle, in our case for the Lorentz force only54):
54 See, e.g., CM Eq. (1.20) divided by dt, and with dp/dt = F = qE. (As a reminder, the magnetic field cannot
affect the particle’s energy, because the magnetic component of the Lorentz force is perpendicular to its velocity.)
Particle’s energy: evolution
$$\frac{d\mathscr{E}}{dt} = q\mathbf{E}\cdot\mathbf{u}, \tag{9.148}$$
besides that in the relativistic case, the energy has to be taken in the general form (73).
Without question, the 4-component equation (145) of the relativistic dynamics is absolutely
beautiful in its simplicity. However, for the solution of particular problems, Eqs. (144) and (148) are
frequently more convenient. As an illustration of this point, let us now use these equations to explore relativistic effects in charged particle motion in uniform, time-independent electric and magnetic fields. In doing that, we will, for the time being, neglect the contributions to the field by the particle itself.55
(i) Uniform magnetic field. Let the magnetic field be constant and uniform in the “lab” reference
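frame 0 that is used for measurements. Then in this frame, Eqs. (144) and (148) yield
$$\frac{d\mathbf{p}}{dt} = q\mathbf{u}\times\mathbf{B}, \qquad \frac{d\mathscr{E}}{dt} = 0. \tag{9.149}$$
From the second equation, $\mathscr{E}$ = const, we get u = const, $\beta \equiv u/c$ = const, $\gamma \equiv (1 - \beta^2)^{-1/2}$ = const, and $M \equiv \gamma m$ = const, so the first of Eqs. (149) may be rewritten as
$$\frac{d\mathbf{u}}{dt} = \mathbf{u}\times\boldsymbol{\omega}_c, \tag{9.150}$$
where $\boldsymbol{\omega}_c$ is the vector directed along the magnetic field B, with the magnitude equal to the following cyclotron frequency (sometimes called “gyrofrequency”):
Cyclotron frequency
$$\omega_c \equiv \frac{qB}{M} = \frac{qB}{\gamma m} = \frac{qc^2B}{\mathscr{E}}. \tag{9.151}$$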
If the particle’s initial velocity u0 is perpendicular to the magnetic field, Eq. (150) describes its circular motion, with a constant speed u = u0, in a plane normal to B, with the angular velocity (151). In the non-relativistic limit u << c, when $\gamma \to 1$, i.e. $M \to m$, the cyclotron frequency $\omega_c$ equals qB/m, i.e. is independent of the speed. However, as the kinetic energy of the particle is increased to become comparable with its rest energy $mc^2$, the frequency decreases, and in the ultra-relativistic limit,
$$\omega_c \approx qc\,\frac{B}{p} \ll \frac{qB}{m}, \quad \text{for } u \approx c. \tag{9.152}$$
The cyclotron motion’s radius may be calculated as $R = u/\omega_c$; in the non-relativistic limit, it is proportional to the particle’s speed, i.e. to the square root of its kinetic energy. However, as Eq. (151) shows, in the general case the radius is proportional to the particle’s relativistic momentum rather than its speed:
Cyclotron radius
$$R = \frac{u}{\omega_c} = \frac{Mu}{qB} = \frac{\gamma mu}{qB} \equiv \frac{1}{q}\frac{p}{B}. \tag{9.153}$$
55 As was emphasized earlier in this course, in statics this contribution is formally infinite and has to be ignored.
In dynamics, this is generally not true; these self-action effects (which are, in most cases, negligible) will be
discussed in the next chapter.
These dependencies of $\omega_c$ and R on energy are the major factors in the design of circular accelerators of charged particles. In the simplest of these machines (the cyclotron, invented in 1929 by Ernest Orlando Lawrence), the frequency of the accelerating ac electric field is constant, so even if it is tuned to the $\omega_c$ of the initially injected particles, the drop of the cyclotron frequency with energy eventually violates this tuning. For this reason, the largest achievable particle’s speed is limited to just ~0.1 c (for protons, corresponding to the kinetic energy of just ~15 MeV). This problem may be addressed in several ways. In particular, in synchrotrons (such as Fermilab’s Tevatron and CERN’s Large Hadron Collider, LHC56) the magnetic field is gradually increased in time to compensate for the momentum increase ($B \propto p$), so both R (153) and $\omega_c$ (151) stay constant, enabling proton acceleration to energies as high as ~7 TeV, i.e. ~7,500 $mc^2$.57
Returning to our initial problem, if the particle’s initial velocity has a component $u_\parallel$ along the magnetic field, then it is conserved in time, so the trajectory is a spiral around the magnetic field lines. As Eqs. (149) show, in this case, Eq. (150) remains valid but in Eqs. (151) and (153) the full speed and momentum have to be replaced with magnitudes of their (also time-conserved) components, $u_\perp$ and $p_\perp$, normal to B, while the Lorentz factor $\gamma$ in those formulas still includes the full speed of the particle.
Finally, in the special case when the particle’s initial velocity is directed exactly along the
magnetic field’s direction, it continues to move straight along the vector B. In this case, the cyclotron
frequency still has the non-zero value (151) but does not correspond to any real motion, because R = 0.
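The energy dependence of the cyclotron frequency (151) and radius (153) is easy to tabulate. The following sketch assumes a proton moving normally to a field of 1 T, with a few illustrative kinetic energies; the function name `cyclotron` and these parameter values are not taken from the notes.

```python
import math

C = 299_792_458.0          # speed of light, m/s
Q_E = 1.602176634e-19      # elementary charge, C
M_P = 1.67262192369e-27    # proton mass, kg
MEV = 1.0e6 * Q_E          # one MeV in joules

def cyclotron(kinetic_energy, B, m=M_P, q=Q_E):
    """Return (omega_c, R, u, gamma) from Eqs. (9.151) and (9.153), for motion normal to B."""
    energy = m * C ** 2 + kinetic_energy                 # total energy
    gamma = energy / (m * C ** 2)
    p = math.sqrt(energy ** 2 - (m * C ** 2) ** 2) / C   # relativistic momentum
    u = p * C ** 2 / energy                              # speed
    omega_c = q * C ** 2 * B / energy
    R = p / (q * B)
    return omega_c, R, u, gamma

for T in (1.0, 15.0, 1000.0):   # kinetic energies in MeV, illustrative
    w, R, u, g = cyclotron(T * MEV, B=1.0)
    print(f"T = {T} MeV: omega_c/(qB/m) = {w * M_P / Q_E:.4f}, "
          f"R = {R:.4f} m, u/c = {u / C:.3f}")
```

The first printed ratio is just 1/γ, so its deviation from 1 shows directly how far a fixed-frequency accelerating field gets detuned from the particle's rotation as the energy grows.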
(ii) Uniform electric field. This problem is (technically) more complex than the previous one
because in the electric field, the particle’s energy changes. Directing the z-axis along the field E, from
Eq. (144) we get
$$\frac{dp_z}{dt} = qE, \qquad \frac{d\mathbf{p}_\perp}{dt} = 0. \tag{9.154}$$
If E does not change in time, the first integration of these equations is elementary,
$$p_z(t) = p_z(0) + qEt, \qquad \mathbf{p}_\perp(t) = \text{const} \equiv \mathbf{p}_\perp(0), \tag{9.155}$$
but the further integration requires care because the effective mass $M = \gamma m$ of the particle depends on its full speed u, with
$$u^2 = u_z^2 + u_\perp^2, \tag{9.156}$$
making the two motions, along and across the field, mutually dependent.
If the initial velocity is perpendicular to the field E, i.e. if $p_z(0) = 0$, $p(0) = p_\perp(0) \equiv p_0$, the easiest way to proceed is to calculate the kinetic energy first:
On the other hand, we can calculate the same energy by integrating Eq. (148),
56 See https://home.cern/topics/large-hadron-collider.
57 I am sorry I have no more time/space to discuss particle accelerator physics, and have to refer the
interested reader to special literature, for example, either S. Lee, Accelerator Physics, 2nd ed., World
Scientific, 2004, or E. Wilson, An Introduction to Particle Accelerators, Oxford U. Press, 2001.
$$\frac{d\mathscr{E}}{dt} = q\mathbf{E}\cdot\mathbf{u} \equiv qE\,\frac{dz}{dt}, \tag{9.158}$$
over time, with a simple result:
$$\mathscr{E} = \mathscr{E}_0 + qEz(t), \tag{9.159}$$
where (just for the notation simplicity) I took z(0) = 0. Requiring Eq. (159) to give the same $\mathscr{E}^2$ as Eq. (157), we get a quadratic equation for the function z(t),
whose solution (with the sign before the square root corresponding to E > 0, i.e. to z ≥ 0) is
$$z(t) = \frac{\mathscr{E}_0}{qE}\left\{\left[1 + \left(\frac{cqEt}{\mathscr{E}_0}\right)^2\right]^{1/2} - 1\right\}. \tag{9.161}$$
Now let us find the particle’s trajectory. Directing the x-axis so that the initial velocity vector
(and hence the velocity vector at any further instant) is within the [x, z] plane, i.e. that y(t) = 0
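identically, we may use Eqs. (155) to calculate the trajectory’s slope, at its arbitrary point, as
$$\frac{dz}{dx} \equiv \frac{dz/dt}{dx/dt} \equiv \frac{Mu_z}{Mu_x} \equiv \frac{p_z}{p_x} = \frac{qEt}{p_0}. \tag{9.162}$$
Now let us use Eq. (160) to express the numerator of this fraction, qEt, as a function of z:
$$qEt = \frac{1}{c}\left[\left(\mathscr{E}_0 + qEz\right)^2 - \mathscr{E}_0^2\right]^{1/2}. \tag{9.163}$$
Plugging this expression into Eq. (162), we get
$$\frac{dz}{dx} = \frac{1}{cp_0}\left[\left(\mathscr{E}_0 + qEz\right)^2 - \mathscr{E}_0^2\right]^{1/2}. \tag{9.164}$$
This differential equation may be readily integrated separating the variables z and x, and using the substitution $\xi \equiv \cosh^{-1}(qEz/\mathscr{E}_0 + 1)$. Selecting the origin of axis x at the initial point, so x(0) = 0, we finally get the trajectory:
$$z = \frac{\mathscr{E}_0}{qE}\left(\cosh\frac{qEx}{cp_0} - 1\right). \tag{9.165}$$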
This curve is usually called the catenary, but sometimes the “chainette” – because it (with the
proper constant replacement) describes, in particular, the stationary shape of a heavy uniform chain in a
uniform gravity field directed along the z-axis. At the initial part of the trajectory, where $qEx \ll cp_0$, this expression may be approximated with the first non-zero term of its Taylor expansion in small x, giving the following parabola:
$$z \approx \frac{\mathscr{E}_0\,qE}{2}\left(\frac{x}{cp_0}\right)^2, \tag{9.166}$$
so if the initial velocity of the particle is much lower than c (i.e. $p_0 \approx mu_0$, $\mathscr{E}_0 \approx mc^2$), we get the very familiar non-relativistic formula:
$$z = \frac{qE}{2mu_0^2}\,x^2 = \frac{a}{2}\,t^2, \quad \text{with } a = \frac{F}{m} = \frac{qE}{m}. \tag{9.167}$$
The generalization of this solution to the case of an arbitrary direction of the particle’s initial
velocity is left for the reader’s exercise.
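The consistency of the time dependence (161) with the trajectory (165) may be checked numerically. The sketch below assumes units with m = q = c = 1 and illustrative values of the field and the initial momentum. The closed form used for x(t) is not given in the text above: it is my own elementary integral of dx/dt = c²p0/ℰ(t), so please treat it as an assumption of this example.

```python
import math

def trajectory_point(t, E_field, p0, m=1.0, q=1.0, c=1.0):
    """Position (x, z) at time t for a particle launched normally to a uniform field."""
    E0 = math.sqrt((m * c ** 2) ** 2 + (c * p0) ** 2)   # initial total energy
    s = c * q * E_field * t / E0                        # dimensionless time
    z = (E0 / (q * E_field)) * (math.sqrt(1.0 + s ** 2) - 1.0)   # Eq. (9.161)
    # x(t) from dx/dt = c^2 p0 / energy(t), with energy(t)^2 = E0^2 + (c q E t)^2
    x = (c * p0 / (q * E_field)) * math.asinh(s)
    return x, z, E0

def catenary(x, E_field, p0, E0, q=1.0, c=1.0):
    """Trajectory of Eq. (9.165)."""
    return (E0 / (q * E_field)) * (math.cosh(q * E_field * x / (c * p0)) - 1.0)

E_field, p0 = 0.5, 2.0    # illustrative values in the units m = q = c = 1
for t in (0.1, 1.0, 5.0):
    x, z, E0 = trajectory_point(t, E_field, p0)
    print(t, x, z, catenary(x, E_field, p0, E0))
```

For each time, the last two printed numbers should coincide, i.e. the points (x(t), z(t)) should lie on the catenary.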
(iii) Crossed uniform magnetic and electric fields ($\mathbf{E} \perp \mathbf{B}$). In view of the somewhat bulky solution of the previous problem (i.e. the particular case of the current problem for B = 0), one might think that this problem, with $B \neq 0$, should be forbiddingly complex for an analytical solution. Counter-intuitively, this is not the case, due to the help from the field transform relations (135). Let us consider two possible cases.
Case 1: E/c < B. Let us consider an inertial reference frame 0’ moving (relative to the “lab” reference frame 0 in which the fields E and B are measured) with the following velocity:
$$\mathbf{v} = \frac{\mathbf{E}\times\mathbf{B}}{B^2}, \tag{9.168}$$
and hence the speed v = c(E/c)/B < c. Selecting the coordinate axes as shown in Fig. 12, so
$$E_x = 0, \quad E_y = E, \quad E_z = 0; \qquad B_x = 0, \quad B_y = 0, \quad B_z = B, \tag{9.169}$$
$$\omega'_c = \frac{qB'}{\mathscr{E}'/c^2}, \tag{9.172}$$
and the radius (153):
$$R' = \frac{p'_\perp}{qB'}. \tag{9.173}$$
Hence in the lab frame, the particle performs this orbital/spiral motion plus a “drift” with the
constant velocity v (Fig. 12). As a result, the lab-frame trajectory of the particle (or rather its projection
onto the plane normal to the magnetic field) is a trochoid-like curve58 that, depending on the initial
velocity, may be either prolate (self-crossing), as in Fig. 12, or curtate (drift-stretched so much that it is
not self-crossing).
Such looped motion of electrons is used, in particular, in magnetrons – very popular generators
of microwave radiation. In such a device (Fig. 13), the magnetic field, usually created by specially-
shaped permanent magnets, is nearly uniform (in the region of electron motion) and directed along the
magnetron’s axis (in Fig. 13, normal to the plane of the drawing), while the electric field of magnitude E
<< cB, created by the dc voltage applied between the anode and the cathode, is virtually radial.
As a result, the above simple theory is only approximately valid, and the electron trajectories are
close to epicycloids rather than trochoids. The applied electric field is adjusted so that these looped
trajectories pass close to the anode’s surface, and hence to the gap openings of the cylindrical
microwave cavities drilled in the anode’s bulk. The fundamental mode of such a cavity is quasi-lumped,
with the cylindrical walls working mostly as inductances, and the gap openings as capacitances, with the
microwave electric field concentrated in these openings. This is why the mode is strongly coupled to the
electrons “licking” the anode’s surface, and their interaction creates large positive feedback (equivalent
to negative damping), which results in intensive microwave self-oscillations at the cavities’ own
frequency.59 The oscillation energy, of course, is taken from the dc-field-accelerated electrons; due to
this energy loss, the looped trajectory of each electron gradually moves closer to the anode and finally
58 As a reminder, a trochoid may be described as the trajectory of a point on a rigid disk rolled along a straight line. Its canonical parametric representation is $x = \theta + a\cos\theta$, $y = a\sin\theta$. (For a > 1, the trochoid is prolate, if a < 1, it is curtate, and if a = 1, it is called the cycloid.) Note, however, that for our problem, the trajectory in the lab frame is exactly trochoidal only in the non-relativistic limit v << c (i.e. E/c << B).
59 See, e.g., CM Sec. 5.4.
lands on its surface. The wide use of such generators (in particular, in microwave ovens, which operate
in a narrow frequency band around 2.45 GHz, allocated for these devices to avoid their interference with
wireless communication systems) is due to their simplicity and high (up to 65%) efficiency of the dc-to-
rf energy transfer.
Case 2: E/c > B. In this case, the speed given by Eq. (168) would be above the speed of light, so let us introduce a reference frame moving with a different velocity,
$$\mathbf{v} = \frac{\mathbf{E}\times\mathbf{B}}{(E/c)^2}, \tag{9.174}$$
whose direction is the same as before (Fig. 12), and magnitude v = cB/(E/c) is again below c. A calculation quite similar to the one performed above for Case 1 yields
$$E'_x = 0, \qquad E'_y = \gamma\left(E - vB\right) = \gamma E\left(1 - \frac{vB}{E}\right) = \gamma E\left(1 - \frac{v^2}{c^2}\right) = \frac{E}{\gamma}, \qquad E'_z = 0, \tag{9.175}$$
$$B'_x = 0, \qquad B'_y = 0, \qquad B'_z = \gamma\left(B - \frac{vE}{c^2}\right) = \gamma\left(B - \frac{EB}{E}\right) = 0, \tag{9.176}$$
so in the moving frame the particle “sees” only the electric field $\mathbf{E}' \parallel \mathbf{E}$. According to the solution of our previous problem (ii), the trajectory of the particle in the moving frame is the catenary (165), so in the lab frame it has an “open”, hyperbolic character as well.
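Both cases may be illustrated by one short function. The sketch assumes units with c = 1, the field geometry (169), and the component relations (134). The function name `drift_frame` and the numerical field values are hypothetical, and the marginal case E/c = B (for which no such frame exists) is excluded explicitly.

```python
import math

def drift_frame(E, B, c=1.0):
    """For E along y and B along z, return (v, E'_y, B'_z) in the moving frame
    defined by Eq. (9.168) for E/c < B, or by Eq. (9.174) for E/c > B."""
    if E / c == B:
        raise ValueError("E/c = B: the required frame speed would equal c")
    if E / c < B:
        v = E / B                 # magnitude of Eq. (9.168)
    else:
        v = c ** 2 * B / E        # magnitude of Eq. (9.174)
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    Ey_primed = gamma * (E - v * B)            # from Eqs. (9.134)
    Bz_primed = gamma * (B - v * E / c ** 2)
    return v, Ey_primed, Bz_primed

case_1 = drift_frame(E=0.6, B=1.0)   # E/c < B: the electric field should vanish
case_2 = drift_frame(E=1.0, B=0.6)   # E/c > B: the magnetic field should vanish
print(case_1, case_2)
```

In the first case the remaining magnetic field should equal B/γ, and in the second case the remaining electric field should equal E/γ, as in Eq. (175).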
To conclude this section, let me note that if the electric and magnetic fields are nonuniform, the
particle motion may be much more complex, and in most cases, the integration of the system of
equations (144) and (148) may be carried out only numerically. However, if the field’s nonuniformity is
small, approximate analytical methods may be very effective. For example, if E = 0, and the magnetic
field has a small transverse gradient $\nabla B$ in a direction normal to the vector B itself, such that
$$\frac{|\nabla B|}{B} \ll \frac{1}{R}, \tag{9.177}$$
where R is the cyclotron radius (153), then it is straightforward to use Eq. (150) to show60 that the cyclotron orbit drifts perpendicular to both B and $\nabla B$, with the drift speed
1 2 2
vd u u u . (9.178)
c 2
The physics of this drift is rather simple: according to Eq. (153), the instant curvature of the
cyclotron orbit is proportional to the local value of the field. Hence if the field is nonuniform, the
trajectory bends slightly more on its parts passing through a stronger field, thus acquiring a shape close
to a curtate trochoid.
For experimental physics and engineering practice, the effects of longitudinal gradients of magnetic field on the charged particle motion are much more important, but it is more convenient for me to postpone their discussion until we have developed a few more analytical tools in the next section.
60 See, e.g., Sec. 12.4 in J. Jackson, Classical Electrodynamics, 3rd ed., Wiley, 1999.
where $u^\alpha$ is the 4-velocity (63). To comply with Eq. (180) at u << c, the constant factor should be equal to (–q), so Eq. (181) becomes
$$\gamma\mathscr{L} = -mc^2 - qu^\alpha A_\alpha, \tag{9.182}$$
and with the account of Eqs. (63) and the second of Eqs. (116), we get a very important equality
Particle’s Lagrangian function
$$\mathscr{L} = -\frac{mc^2}{\gamma} - q\phi + q\mathbf{u}\cdot\mathbf{A}, \tag{9.183}$$
$$\frac{dp_x}{dt} = q\left[E_x + u_yB_z - u_zB_y\right] \equiv q\left[\mathbf{E} + \mathbf{u}\times\mathbf{B}\right]_x, \tag{9.190}$$
i.e. the x-component of Eq. (144). Since other Cartesian coordinates participate in Eq. (184) similarly, it
is evident that the Lagrangian equations of motion along other coordinates yield other components of
the same vector equation of motion.
So, Eq. (183) does indeed give the correct Lagrangian function, and we can use it for further
analysis, in particular to discuss the first of Eqs. (186). This relation shows that in the electromagnetic
field, the generalized momentum corresponding to the particle’s coordinate x is not $p_x = m\gamma u_x$, but63
$$P_x \equiv \frac{\partial\mathscr{L}}{\partial u_x} = p_x + qA_x. \tag{9.191}$$
Thus, as was already discussed (at that point, without proof) in Sec. 6.4, the particle’s motion in a magnetic field may be described by two different linear momentum vectors: the kinetic momentum p defined by Eq. (70), and the canonical (or “conjugate”) momentum64
Particle’s canonical momentum
$$\mathbf{P} \equiv \mathbf{p} + q\mathbf{A}. \tag{9.192}$$
In order to facilitate discussion of this notion, let us generalize Eq. (72) for the Hamiltonian
function H of a free particle to the case of a particle in the field:
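$$H \equiv \mathbf{P}\cdot\mathbf{u} - L = (\mathbf{p} + q\mathbf{A})\cdot\mathbf{u} - \left(-\frac{mc^2}{\gamma} + q\,\mathbf{u}\cdot\mathbf{A} - q\phi\right) = \mathbf{p}\cdot\mathbf{u} + \frac{mc^2}{\gamma} + q\phi. \tag{9.193}$$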
62 Alternatively called the “Lagrangian derivative”; for its (rather simple) derivation see, e.g., CM Sec. 8.3.
63 With regrets, I have to use for the generalized momentum the same (very common) notation as was used earlier
in the course for the electric polarization – which will not be discussed here and in the balance of these notes.
64 In the Gaussian units, Eq. (192) has the form P = p + qA/c.
Merging the first two terms of the last expression exactly as it was done in Eq. (72), we get an extremely
simple result,
$$H = \gamma mc^2 + q\phi, \tag{9.194a}$$
which may be spelled out as
$$H = mc^2\left[1 + \left(\frac{p}{mc}\right)^2\right]^{1/2} + q\phi, \quad \text{i.e. } (H - q\phi)^2 = (mc^2)^2 + c^2p^2. \tag{9.194b}$$
These expressions may leave the reader wondering: where is the vector potential A here – and
the magnetic field effects it has to describe? The resolution of this puzzle is easy: as we know from
analytical mechanics,65 for most applications, for example for an alternative derivation of the equations
of motion, H has to be represented as a function of the particle’s generalized coordinates (in the case of
unconstrained motion, these may be the Cartesian components of the vector r that serves as an argument
for the potentials A and φ), and the generalized momenta, i.e. the components of the vector P –
generally, plus time. For that, the kinetic momentum p in Eq. (194b) has to be expressed via these
variables. This may be done using Eq. (192), giving us the following generalization of Eq. (78):66
[Particle’s Hamiltonian function]
$$(H - q\phi)^2 = (mc^2)^2 + c^2(\mathbf{P} - q\mathbf{A})^2. \tag{9.195}$$
It is straightforward to verify that the Hamilton equations of motion for three Cartesian
coordinates of the particle, obtained in a regular way from this H, may be merged into the same vector
equation (144). In the non-relativistic limit, performing the expansion of Eq. (194b) into the Taylor
series in p², and limiting it to two leading terms, we get the following generalization of Eq. (74):
$$H \approx mc^2 + \frac{p^2}{2m} + q\phi, \quad \text{i.e. } H - mc^2 \approx \frac{1}{2m}(\mathbf{P} - q\mathbf{A})^2 + U, \quad \text{with } U = q\phi. \tag{9.196}$$
These expressions for H, and Eq. (183) for L, give a clear view of the electromagnetic field
effects’ description in analytical mechanics. The electric part qE of the total Lorentz force can perform
mechanical work on the particle, i.e. change its kinetic energy – see Eq. (148) and its discussion. As a
result, the scalar potential φ, whose gradient gives a contribution to E, may be directly associated with
the potential energy U = qφ of the particle. On the contrary, the magnetic component qu×B of the
Lorentz force is always perpendicular to the particle’s velocity u, and cannot perform any work
on it, and as a result, cannot be described by a contribution to U. However, if A did not participate in the
functions L and/or H at all, the analytical mechanics would be unable to describe effects of the
magnetic field B = ∇×A on the particle’s motion. The relations (183) and (195)-(196) show the
wonderful way in which physics (with some help from Mother Nature herself :-) solves this problem:
the vector potential gives such contributions to the functions L and H that cannot be uniquely
attributed to either kinetic or potential energy, but ensure both the Lagrange and Hamilton formalisms
yield the correct equation of motion (144) – including the magnetic field effects.
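The statement that the Hamilton equations reproduce the Lorentz-force dynamics may be illustrated by a minimal numerical sketch for the non-relativistic Hamiltonian (196). Everything specific in it is an assumption made only for this illustration: a uniform field B along the z-axis, the symmetric gauge A = (B/2){–y, x, 0}, φ = 0, the unit values q = m = B = u0 = 1, the Runge-Kutta integrator, and all function names.

```python
import numpy as np

def hamilton_rhs(state, q, m, B):
    """Right-hand sides of the Hamilton equations for H = (P - qA)^2 / 2m.

    Assumed setup (not from the text): uniform field B along z,
    symmetric gauge A = (B/2) * (-y, x, 0), phi = 0, planar motion.
    """
    x, y, Px, Py = state
    Ax, Ay = -0.5 * B * y, 0.5 * B * x
    # dr/dt = dH/dP = (P - qA)/m, i.e. the usual velocity u
    ux, uy = (Px - q * Ax) / m, (Py - q * Ay) / m
    # dP_j/dt = -dH/dr_j = q * sum_k u_k * dA_k/dr_j
    dPx = q * uy * (0.5 * B)
    dPy = q * ux * (-0.5 * B)
    return np.array([ux, uy, dPx, dPy])

def integrate(state0, t_end, n_steps, q=1.0, m=1.0, B=1.0):
    """Classical 4th-order Runge-Kutta integration; returns the whole trajectory."""
    dt = t_end / n_steps
    traj = np.empty((n_steps + 1, 4))
    traj[0] = state0
    for i in range(n_steps):
        s = traj[i]
        k1 = hamilton_rhs(s, q, m, B)
        k2 = hamilton_rhs(s + 0.5 * dt * k1, q, m, B)
        k3 = hamilton_rhs(s + 0.5 * dt * k2, q, m, B)
        k4 = hamilton_rhs(s + dt * k3, q, m, B)
        traj[i + 1] = s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return traj

# Start at the origin with velocity u0 along x; there A = 0, so P = m*u.
q, m, B, u0 = 1.0, 1.0, 1.0, 1.0
period = 2 * np.pi * m / (q * B)   # non-relativistic cyclotron period
traj = integrate(np.array([0.0, 0.0, m * u0, 0.0]), period, 2000, q, m, B)

radius = 0.5 * (traj[:, 1].max() - traj[:, 1].min())
print("orbit radius:", radius, "expected m*u0/(q*B):", m * u0 / (q * B))
print("return-to-start error:", np.hypot(traj[-1, 0], traj[-1, 1]))
```

In this gauge the canonical momentum P changes along the orbit, while the kinetic momentum p = P – qA keeps its magnitude, in line with the discussion above: the vector potential enters H without contributing to the potential energy.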
I believe I still owe the reader some discussion of the physical sense of the canonical momentum
P. For that, let us consider a charged particle moving near a region of localized magnetic field B(r,t), but
not entering this region (see Fig. 14), so on its trajectory ∇×A ≡ B = 0.
[Fig. 9.14. A particle moving near a region of localized magnetic field B(r, t).]
If there is no electrostatic field affecting the particle (i.e. no other electric charges nearby), we
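may select such a local gauge that φ(r, t) = 0 and A = A(t), so Eq. (144) is reduced to
$$\frac{d\mathbf{p}}{dt} = q\mathbf{E} = -q\frac{d\mathbf{A}}{dt}, \tag{9.197}$$
and Eq. (192) immediately gives
$$\frac{d\mathbf{P}}{dt} \equiv \frac{d\mathbf{p}}{dt} + q\frac{d\mathbf{A}}{dt} = 0. \tag{9.198}$$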
Hence, even if the magnetic field is changed in time, so that the induced electric field E does accelerate
the particle, its canonical momentum does not change. Hence P is a variable more stable to magnetic
field changes than its kinetic counterpart p. This conclusion may be criticized because it relies on a
specific gauge, and generally P ≡ p + qA is not gauge-invariant, because the vector potential A is not.67
However, as was already discussed in Sec. 5.3, the integral ∮A·dr over a closed contour is gauge-invariant
and is equal to the magnetic flux Φ through the area limited by the contour – see Eq. (5.65).
So, integrating Eq. (197) over a closed trajectory of a particle (Fig. 14), and over the time of one orbit,
we get
$$\Delta\oint_C \mathbf{p}\cdot d\mathbf{r} = -q\,\Delta\Phi, \quad \text{so that } \Delta\oint_C \mathbf{P}\cdot d\mathbf{r} = 0, \tag{9.199}$$
where ΔΦ is the change of flux during that time. This gauge-invariant result confirms the above
conclusion about the stability of the canonical momentum to magnetic field variations.
Generally, Eq. (199) is invalid if a particle moves inside a magnetic field and/or changes its
trajectory at the field variation. However, if the field is almost uniform, i.e. its gradient is small in the
sense of Eq. (177), this result is (approximately) applicable. Indeed, analytical mechanics68 tells us that
for any canonical coordinate-momentum pair {q_j, p_j}, the corresponding action variable,
$$J_j \equiv \frac{1}{2\pi}\oint p_j\,dq_j, \tag{9.200}$$
remains virtually constant at slow variations of motion conditions. According to Eq. (191), for a particle
in a magnetic field, the generalized momentum corresponding to the Cartesian coordinate rj is Pj rather
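than p_j. Thus forming the net action variable J ≡ J_x + J_y + J_z, we may write
$$2\pi J = \oint_C \mathbf{P}\cdot d\mathbf{r} = \oint_C \mathbf{p}\cdot d\mathbf{r} + q\Phi. \tag{9.201}$$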
67 In contrast, the kinetic momentum p = Mu is evidently gauge- (though not Lorentz-) invariant.
68 See, e.g., CM Sec. 10.2.
Let us apply this relation to the motion of a non-relativistic particle in an almost uniform
magnetic field, with a relatively small longitudinal velocity, u_∥/u_⊥ → 0 – see Fig. 15.
[Fig. 9.15. Particle in a magnetic field with a small longitudinal gradient ∇B ∥ B.]
In this case, Φ in Eq. (201) is the flux encircled by the particle’s cyclotron orbit, Φ = –πR²B,
where R is its radius given by Eq. (153), and the negative sign accounts for the fact that in our case, the
“correct” direction of the normal vector n in the definition of flux, Φ = ∫B_n d²r, is antiparallel to the
vector B. At u << c, the kinetic momentum is just p = mu, while Eq. (153) yields
$$mu_\perp = qBR. \tag{9.202}$$
Plugging these relations into Eq. (201), we get
$$2\pi J = mu_\perp\,2\pi R - q\pi R^2B = m\frac{qRB}{m}\,2\pi R - q\pi R^2B = (2 - 1)\,q\pi R^2B \equiv -q\Phi. \tag{9.203}$$
This means that even if the circular orbit slowly moves through the magnetic field, the flux encircled by
the cyclotron orbit should remain virtually constant. One manifestation of this effect is the result already
mentioned at the end of Sec. 6: if a small gradient of the magnetic field is perpendicular to the field
itself, then the particle orbit’s drift direction is perpendicular to ∇B, so Φ stays constant.
Now let us analyze the case of a small longitudinal gradient, ∇B ∥ B (Fig. 15). If a small initial
longitudinal velocity u_∥ is directed toward the higher field region, the cyclotron orbit has to gradually
shrink to keep Φ constant. Rewriting Eq. (202) as
$$mu_\perp = q\frac{\pi R^2B}{\pi R} = q\frac{|\Phi|}{\pi R}, \tag{9.204}$$
we see that this reduction of R (at constant Φ) increases the orbiting speed u_⊥. But since the magnetic
field cannot perform any work on the particle, its kinetic energy,
$$\mathcal{E} = \frac{m}{2}\left(u_\parallel^2 + u_\perp^2\right), \tag{9.205}$$
should stay constant, so the longitudinal velocity u_∥ has to decrease. Hence eventually the orbit’s drift
has to stop, and then it has to start moving back toward the region of lower fields, being essentially
repelled from the high-field region. This effect is very important, in particular, for plasma confinement
systems. In the simplest of such systems, two coaxial magnetic coils, inducing magnetic fields of the
same direction (Fig. 16), naturally form a “magnetic bottle”, which traps charged particles injected, with
sufficiently low longitudinal velocities, into the region between the coils. More complex systems of this
type, but working on the same basic principle, are the most essential components of the persisting large-
scale efforts to achieve controllable nuclear fusion.69
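[Fig. 9.16. A “magnetic bottle” formed by two coaxial coils inducing magnetic fields B of the same direction.]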
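The reflection argument above may be condensed into a short numerical sketch. It assumes, as the text does, a non-relativistic particle whose encircled flux Φ ∝ R²B stays constant; with R = mu_⊥/qB, this means that the ratio u_⊥²/B is conserved, while the total speed u is fixed by the energy conservation (205). The “pitch angle” between the velocity and the field at the starting point, the sample field values, and all function names are introduced here only for illustration.

```python
import math

def perpendicular_speed(u, pitch0, B, B0):
    """u_perp at field B, for a particle with pitch angle pitch0 at field B0.

    Follows from the constancy of the encircled flux, Phi ~ R^2 * B with
    R = m * u_perp / (q * B), i.e. u_perp^2 / B = const.
    """
    return u * math.sin(pitch0) * math.sqrt(B / B0)

def longitudinal_speed(u, pitch0, B, B0):
    """u_par at field B from the kinetic energy conservation (total speed u).

    Returns 0.0 beyond the reflection point, where no real solution exists.
    """
    u_par_sq = u**2 - perpendicular_speed(u, pitch0, B, B0)**2
    return math.sqrt(max(u_par_sq, 0.0))

def reflection_field(pitch0, B0):
    """Field at which u_par vanishes and the orbit's drift turns back."""
    return B0 / math.sin(pitch0)**2

def is_trapped(pitch0, B0, B_max):
    """True if the particle is reflected before reaching the largest field B_max."""
    return reflection_field(pitch0, B0) <= B_max

# Illustrative bottle: field B0 in the middle, B_max near the coils.
B0, B_max = 1.0, 4.0
for degrees in (20, 45, 80):
    pitch0 = math.radians(degrees)
    print(degrees, reflection_field(pitch0, B0), is_trapped(pitch0, B0, B_max))
```

In this simple picture, particles with a small initial pitch angle (i.e. with a relatively large longitudinal velocity) are not reflected and escape through the ends of the bottle.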
Returning to the constancy of the magnetic flux encircled by free particles, it reminds us of the
Meissner-Ochsenfeld effect, which was discussed in Sec. 6.4, and gives a motivation for a brief revisit
of the electrodynamics of superconductivity. As was emphasized in that section, superconductivity is a
substantially quantum phenomenon; nevertheless, the classical notion of the conjugate momentum P
helps to understand its theoretical description. Indeed, the general rule of quantization of physical
systems70 is that each canonical pair {q_j, p_j} of a generalized coordinate q_j and the corresponding
generalized momentum p_j is described by quantum-mechanical operators that obey the following
commutation relation:
$$\left[\hat{q}_j, \hat{p}_{j'}\right] = i\hbar\,\delta_{jj'}. \tag{9.206}$$
According to Eq. (191), for the Cartesian coordinates r_j of a particle in the magnetic field, the
corresponding generalized momenta are P_j, so their operators should obey the similar commutation
relations:
$$\left[\hat{r}_j, \hat{P}_{j'}\right] = i\hbar\,\delta_{jj'}. \tag{9.207}$$
In the coordinate representation of quantum mechanics, the canonical operators of the Cartesian
components of the linear momentum are described by the corresponding components of the vector
operator –iħ∇. As a result, ignoring the rest energy mc² (which gives an inconsequential phase factor
exp{–imc²t/ħ} in the wavefunction), we can use Eq. (196) to rewrite the usual non-relativistic
Schrödinger equation,
$$i\hbar\frac{\partial\psi}{\partial t} = \hat{H}\psi, \tag{9.208}$$
as follows:
$$i\hbar\frac{\partial\psi}{\partial t} = \left(\frac{\hat{p}^2}{2m} + U\right)\psi \equiv \left[\frac{1}{2m}\left(-i\hbar\nabla - q\mathbf{A}\right)^2 + q\phi\right]\psi. \tag{9.209}$$
Thus, I believe I have finally delivered on my promise to justify the replacement (6.50), which
had been used in Secs. 6.4 and 6.5 to discuss the electrodynamics of superconductors, including the
Meissner-Ochsenfeld effect. The Schrödinger equation (209) may be also used as the basis for the
quantum-mechanical description of other magnetic field phenomena, including the so-called Aharonov-
Bohm and quantum Hall effects – see, e.g., QM Secs. 3.1-3.2.
69 For further reading on this technology, the reader may be referred, for example, to the simple monograph by F.
Chen, Introduction to Plasma Physics and Controlled Fusion, vol. 1, 2nd ed., Springer, 1984, and/or the
graduate-level theoretical treatment by R. Hazeltine and J. Meiss, Plasma Confinement, Dover, 2003.
70 See, e.g., CM Sec. 10.1.
Let us start, as usual, from the Lagrange formalism. Some clues on the possible structure of the
Lagrangian function density l may be obtained from that of the particle-field interaction description in
this formalism, discussed in the last section. As we have seen, for the case of a single particle, the
interaction is described by the last two terms of Eq. (183):
$$L_{\rm int} = -q\phi + q\,\mathbf{u}\cdot\mathbf{A}. \tag{9.211}$$
Obviously, if the charge q is continuously distributed over some volume, we may represent this L_int as a
volume integral of the following Lagrangian function density:
[Interaction Lagrangian density]
$$l_{\rm int} = -\rho\phi + \mathbf{j}\cdot\mathbf{A} \equiv -j_\alpha A^\alpha. \tag{9.212}$$
Notice that this density (in contrast to Lint itself!) is Lorentz-invariant. (This is due to the
contraction of the longitudinal coordinate, and hence volume, at the Lorentz transform.) Hence we may
expect the density of the field’s part of the Lagrangian to be Lorentz-invariant as well. Moreover, given
the local structure of the Maxwell equations (containing only the first spatial and temporal derivatives of
the fields), l_field should be a function of the potential’s 4-vector and its 4-derivative:
$$l_{\rm field} = l_{\rm field}\left(A^\alpha, \partial_\alpha A^\beta\right). \tag{9.213}$$
Also, the density should be selected in such a way that the 4-vector analog of the Lagrangian equation of
motion,
$$\partial_\alpha\frac{\partial l_{\rm field}}{\partial\left(\partial_\alpha A^\beta\right)} - \frac{\partial l_{\rm field}}{\partial A^\beta} = 0, \tag{9.214}$$
gives us the correct inhomogeneous Maxwell equations (127).71 The field part l_field of the total
Lagrangian density l should be a scalar and a quadratic form of the field strengths, i.e. of the tensor F^{αβ},
so the natural choice is
$$l_{\rm field} = {\rm const}\times F_{\alpha\beta}F^{\alpha\beta}, \tag{9.215}$$
with the implied summation over both indices. Indeed, adding to this expression the interaction
Lagrangian (212),
$$l = l_{\rm field} + l_{\rm int} = {\rm const}\times F_{\alpha\beta}F^{\alpha\beta} - j_\alpha A^\alpha, \tag{9.216}$$
71 Here the implicit summation over the index α plays a role similar to the convective derivative (188) in
replacing the full derivative over time, in a way that reflects the symmetry of time and space in special relativity. I
do not want to spend more time justifying Eq. (214), because of the reasons that will be clear imminently.
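and performing the differentiations, we see that Eqs. (214)-(215) indeed yield Eqs. (127), provided that
the constant factor equals (–1/4μ0).72 So, the field’s Lagrangian density is
[Field’s Lagrangian density]
$$l_{\rm field} = -\frac{1}{4\mu_0}F_{\alpha\beta}F^{\alpha\beta} = \frac{1}{2\mu_0}\left(\frac{E^2}{c^2} - B^2\right) \equiv \frac{\varepsilon_0}{2}E^2 - \frac{B^2}{2\mu_0} = u_e - u_m, \tag{9.217}$$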
where ue is the electric field energy density (1.65), and um is the magnetic field energy density (5.57).
Let me hope the reader agrees that Eq. (217) is a wonderful result because the Lagrangian function has a
structure absolutely similar to the well-known expression L = T – U of classical mechanics. So, for the
field alone, the “potential” and “kinetic” energies are separable again.73
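The equality (217) is easy to check numerically. The sketch below builds the contravariant tensor F^{αβ} in its standard SI form (assumed here to coincide with Eq. (125), which is not reproduced in this section), lowers both indices with the metric of the assumed signature (+, –, –, –), and compares –F_{αβ}F^{αβ}/4μ0 with u_e – u_m. The sample field values and all names are arbitrary illustrative choices.

```python
import numpy as np

EPS0 = 8.8541878128e-12            # F/m
MU0 = 1.25663706212e-6             # H/m
C = 1.0 / np.sqrt(EPS0 * MU0)      # m/s
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])   # assumed signature (+, -, -, -)

def field_tensor(E, B):
    """Contravariant F^{alpha beta} in SI units (assumed standard form)."""
    Ex, Ey, Ez = np.asarray(E, float) / C
    Bx, By, Bz = np.asarray(B, float)
    return np.array([[0.0, -Ex, -Ey, -Ez],
                     [Ex,  0.0, -Bz,  By],
                     [Ey,  Bz,  0.0, -Bx],
                     [Ez, -By,  Bx,  0.0]])

def lagrangian_density(E, B):
    """l_field = -(1 / 4 mu0) * F_{alpha beta} F^{alpha beta}."""
    F_up = field_tensor(E, B)
    F_down = METRIC @ F_up @ METRIC          # lower both indices
    return -np.sum(F_down * F_up) / (4.0 * MU0)

def energy_densities(E, B):
    """Electric and magnetic energy densities u_e and u_m in free space."""
    E, B = np.asarray(E, float), np.asarray(B, float)
    return 0.5 * EPS0 * np.dot(E, E), np.dot(B, B) / (2.0 * MU0)

E_demo = [3e5, -1e5, 2e5]      # V/m, arbitrary
B_demo = [1e-3, 2e-3, -5e-4]   # T, arbitrary
u_e, u_m = energy_densities(E_demo, B_demo)
print(lagrangian_density(E_demo, B_demo), u_e - u_m)
```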
Now let us explore whether we can calculate the 4-form of the field’s Hamiltonian function H.
In the generic analytical mechanics,
$$H = \sum_j \frac{\partial L}{\partial \dot{q}_j}\dot{q}_j - L. \tag{9.218}$$
However, just as for the Lagrangian function, for a field we should find the spatial density h of the
Hamiltonian, defined by the second of Eqs. (210), for which the natural 4-form of Eq. (218) is
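$$h^{\alpha\beta} \equiv \frac{\partial l}{\partial\left(\partial_\alpha A^\gamma\right)}\,\partial^\beta A^\gamma - g^{\alpha\beta}\,l. \tag{9.219}$$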
Calculated for the field alone, i.e. using Eq. (217) for l, this definition yields
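$$h^{\alpha\beta}_{\rm field} = \theta^{\alpha\beta} - D^{\alpha\beta}, \tag{9.220}$$
where the tensor
[Symmetric energy-momentum tensor]
$$\theta^{\alpha\beta} \equiv \frac{1}{\mu_0}\left(g^{\alpha\gamma}F_{\gamma\delta}F^{\delta\beta} + \frac{1}{4}\,g^{\alpha\beta}F_{\gamma\delta}F^{\gamma\delta}\right), \tag{9.221}$$
is gauge-invariant, while the remaining term,
$$D^{\alpha\beta} \equiv \frac{1}{\mu_0}\,g^{\alpha\gamma}F_{\gamma\delta}\,\partial^\delta A^\beta, \tag{9.222}$$
is not, so it cannot correspond to any measurable variables. Fortunately, it is straightforward to verify
that (for the source-free field) the last tensor may be represented in the form
$$D^{\alpha\beta} = \frac{1}{\mu_0}\,\partial_\gamma\left(F^{\alpha\gamma}A^\beta\right), \tag{9.223}$$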
so it does not interfere with the conservation properties of the gauge-invariant, symmetric energy-
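momentum tensor (also called the symmetric stress tensor) θ^{αβ}, to be discussed below.
Let us use Eqs. (125) to express the elements of the latter tensor via the electric and magnetic
fields. For α = β = 0, we get
$$\theta^{00} = \frac{\varepsilon_0}{2}E^2 + \frac{B^2}{2\mu_0} = u_e + u_m \equiv u, \tag{9.225}$$
i.e. the expression for the total energy density u – see Eq. (6.113). The other 3 elements of the same
row/column turn out to be just the Cartesian components of the Poynting vector (6.114), divided by c:
$$\theta^{j0} = \frac{1}{\mu_0}\left(\frac{\mathbf{E}}{c}\times\mathbf{B}\right)_j = \left(\frac{\mathbf{E}}{c}\times\mathbf{H}\right)_j \equiv \frac{S_j}{c}, \quad \text{for } j = 1, 2, 3. \tag{9.226}$$
The remaining 9 elements θ^{jj’} of the tensor, with j, j’ = 1, 2, 3, are usually represented as
$$\theta^{jj'} = -\tau^{(M)}_{jj'}, \tag{9.227}$$
where τ^(M) is the so-called Maxwell stress tensor:
[Maxwell stress tensor]
$$\tau^{(M)}_{jj'} = \varepsilon_0\left(E_jE_{j'} - \frac{\delta_{jj'}}{2}E^2\right) + \frac{1}{\mu_0}\left(B_jB_{j'} - \frac{\delta_{jj'}}{2}B^2\right), \tag{9.228}$$
so the whole symmetric energy-momentum tensor (221) may be conveniently represented in the
following symbolic way:
$$\theta^{\alpha\beta} = \begin{pmatrix} u & \mathbf{S}/c \\ \mathbf{S}/c & -\tau^{(M)}_{jj'} \end{pmatrix}. \tag{9.229}$$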
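The block structure (229) may be verified numerically by evaluating the definition (221) directly. The sketch below is a hedged illustration: it assumes the standard SI form of F^{αβ} (taken here to coincide with Eq. (125), which is not reproduced in this section) and the metric signature (+, –, –, –); the sample fields and all function names are arbitrary.

```python
import numpy as np

EPS0 = 8.8541878128e-12            # F/m
MU0 = 1.25663706212e-6             # H/m
C = 1.0 / np.sqrt(EPS0 * MU0)      # m/s
G = np.diag([1.0, -1.0, -1.0, -1.0])   # assumed metric, signature (+, -, -, -)

def field_tensor(E, B):
    """Contravariant F^{alpha beta} in SI units (assumed standard form)."""
    Ex, Ey, Ez = np.asarray(E, float) / C
    Bx, By, Bz = np.asarray(B, float)
    return np.array([[0.0, -Ex, -Ey, -Ez],
                     [Ex,  0.0, -Bz,  By],
                     [Ey,  Bz,  0.0, -Bx],
                     [Ez, -By,  Bx,  0.0]])

def energy_momentum_tensor(E, B):
    """theta^{ab} = (1/mu0) * (g^{ac} F_{cd} F^{db} + (1/4) g^{ab} F_{cd} F^{cd})."""
    F_up = field_tensor(E, B)
    F_down = G @ F_up @ G                    # lower both indices
    invariant = np.sum(F_down * F_up)        # F_{cd} F^{cd}
    return (G @ F_down @ F_up + 0.25 * G * invariant) / MU0

def maxwell_stress(E, B):
    """3x3 Maxwell stress tensor of Eq. (228) for free space."""
    E, B = np.asarray(E, float), np.asarray(B, float)
    unit = np.eye(3)
    return (EPS0 * (np.outer(E, E) - 0.5 * unit * np.dot(E, E))
            + (np.outer(B, B) - 0.5 * unit * np.dot(B, B)) / MU0)

E_demo = np.array([3e5, -1e5, 2e5])      # V/m, arbitrary
B_demo = np.array([1e-3, 2e-3, -5e-4])   # T, arbitrary
theta = energy_momentum_tensor(E_demo, B_demo)
u = 0.5 * EPS0 * E_demo @ E_demo + B_demo @ B_demo / (2.0 * MU0)
S = np.cross(E_demo, B_demo) / MU0       # Poynting vector in free space

print("theta^00 - u:          ", theta[0, 0] - u)
print("theta^j0 - S_j/c:      ", theta[1:, 0] - S / C)
print("theta^jj' + tau_jj':   ", np.abs(theta[1:, 1:] + maxwell_stress(E_demo, B_demo)).max())
```

All three printed differences should vanish to within the rounding errors, for any choice of the sample fields.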
The physical meaning of this tensor may be revealed in the following way. Considering Eq.
(221) as the definition of the tensor θ^{αβ},74 and using the 4-vector form of Maxwell equations given by
Eqs. (127) and (129), it is straightforward to verify an extremely simple result for the 4-derivative of the
symmetric tensor:
$$\partial_\alpha\theta^{\alpha\beta} = -F^{\beta\gamma}j_\gamma. \tag{9.230}$$
This expression is valid in the presence of electromagnetic field sources, e.g., for any system of charged
particles and the fields they have created. Of these four equations (for four values of the index β), the
temporal one (with β = 0) may be simply expressed via the energy density (225) and the Poynting vector
(226):
$$\frac{\partial u}{\partial t} + \nabla\cdot\mathbf{S} = -\mathbf{j}\cdot\mathbf{E}, \tag{9.231}$$
while three spatial equations (with β = j = 1, 2, 3) may be represented in the form
74 In this way, we are using Eq. (219) just as a useful guess, which has led us to the definition of θ^{αβ}, and may
leave its strict justification for more in-depth field theory courses.
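$$\frac{\partial}{\partial t}\frac{S_j}{c^2} - \sum_{j'=1}^{3}\frac{\partial}{\partial r_{j'}}\tau^{(M)}_{jj'} = -\left(\rho\mathbf{E} + \mathbf{j}\times\mathbf{B}\right)_j. \tag{9.232}$$
If integrated over a volume V limited by surface S, with the account of the divergence theorem,
Eq. (231) returns us to the Poynting theorem (6.111):
$$\int_V\left(\frac{\partial u}{\partial t} + \mathbf{j}\cdot\mathbf{E}\right)d^3r + \oint_S S_n\,d^2r = 0, \tag{9.233}$$
while the similar integration of Eq. (232) yields
$$\int_V\left[\frac{\partial}{\partial t}\frac{\mathbf{S}}{c^2} + \mathbf{f}\right]_j d^3r = \sum_{j'=1}^{3}\oint_S\tau^{(M)}_{jj'}\,dA_{j'}, \quad \text{with } \mathbf{f} \equiv \rho\mathbf{E} + \mathbf{j}\times\mathbf{B}, \tag{9.234}$$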
where dA_j = n_j dA = n_j d²r is the jth component of the elementary area vector dA = ndA = nd²r that is
normal to the volume’s surface, and directed out of the volume – see Fig. 17.76
[Fig. 9.17. The force dF exerted on a boundary element dA of the volume V occupied by the field.]
Since, according to Eq. (5.10), the vector f in Eq. (234) is nothing other than the density of
volume-distributed Lorentz forces exerted by the field on the charged particles, we can use the 2nd
Newton law, in its relativistic form (144), to rewrite Eq. (234), for a stationary volume V, as
[Field momentum’s dynamics]
$$\frac{d}{dt}\left[\int_V\frac{\mathbf{S}}{c^2}\,d^3r + \mathbf{p}_{\rm part}\right] = \mathbf{F}, \tag{9.235}$$
where p_part is the total mechanical (relativistic) momentum of all particles in the volume V, and the
vector F is defined by its Cartesian components:
[Force via the Maxwell tensor]
$$F_j = \sum_{j'=1}^{3}\oint_S\tau^{(M)}_{jj'}\,dA_{j'}. \tag{9.236}$$
Relations (235)-(236) are our main new results. The first of them shows that the vector
75 Just like the Poynting theorem (233), Eq. (234) may be obtained directly from the Maxwell equations, without
resorting to the 4-vector formalism – see, e.g., Sec. 8.2.2 in D. Griffiths, Introduction to Electrodynamics, 3rd ed.,
Prentice-Hall, 1999. However, the derivation discussed above is superior because it shows the wonderful unity
between the laws of conservation of energy and momentum.
76 The same notions are used in the mechanical stress theory – see, e.g., CM Sec. 7.2.
$$\mathbf{g} \equiv \frac{\mathbf{S}}{c^2}, \tag{9.237}$$
already discussed in Sec. 6.8 without derivation, may be indeed interpreted as the density of momentum
of the electromagnetic field (per unit volume). This classical relation is consistent with the quantum-
mechanical picture of photons as ultra-relativistic particles, with a momentum of magnitude E/c, because
then the flux of the momentum carried by photons through a unit normal area per unit time may be
represented either as S_n/c or as g_n c. It also allows us to revisit the Poynting vector paradox that was
discussed in Sec. 6.8 – see Fig. 6.11 and its discussion. As was emphasized in that discussion, in this
case, the vector S = E×H does not correspond to any measurable energy flow. However, the
corresponding momentum of the field, equal to the integral of the density (237) over a volume of
interest,77 is not only real but may be measured by the recoil impulse it gives to the field sources – say,
to the magnetic coil inducing the field H, or to the capacitor plates creating the field E.
Now let us turn to our second result, Eq. (236). It tells us that the 3×3-element Maxwell stress
tensor complies with the general definition of the stress tensor78 characterizing the forces exerted on the
boundaries of a volume, in our current case the volume occupied by the electromagnetic field (Fig. 17).
Let us use this important result to analyze two simple examples of static fields.
(i) Electrostatic field’s effect on a perfect conductor. Since Eq. (235) has been derived for a free
space region, we have to select volume V outside the conductor, but we may align one of its faces with
the conductor’s surface (Fig. 18).
[Fig. 9.18. The electrostatic field near a conductor’s surface.]
From Chapter 2, we know that the electrostatic field just outside the conductor’s surface has to
be normal to it. Selecting the z-axis in this direction, we have E_x = E_y = 0, E_z = E, so only diagonal
elements of the tensor (228) are not equal to zero:
$$\tau^{(M)}_{xx} = \tau^{(M)}_{yy} = -\frac{\varepsilon_0}{2}E^2, \quad \tau^{(M)}_{zz} = \frac{\varepsilon_0}{2}E^2. \tag{9.238}$$
Since the elementary surface area vector has just one non-zero component, dAz, according to Eq. (236),
only the last component (that is positive regardless of the sign of E) gives a contribution to the surface
force F. We see that the force exerted by the conductor (and eventually by the external forces that hold
the conductor in its equilibrium position) on the field is normal to the conductor and directed out of the
field volume: dFz 0. Hence, by the 3rd Newton law, the force exerted by the field on the conductor’s
surface is directed toward the field-filled space:
[Electric field’s pull]
$$dF_{\rm surface} = -dF_z = -\frac{\varepsilon_0}{2}E^2\,dA. \tag{9.239}$$
This important result could be obtained by simpler means as well. (Actually, this was the task of
one of the exercise problems assigned in Chapter 2.) For example, one could argue, quite convincingly,
that the local relation between the force and the field should not depend on the global configuration
creating the field, and thus consider the simplest configuration, a planar capacitor (see, e.g. Fig. 2.3)
with surfaces of both plates charged by equal and opposite charges of density σ = ±ε0E. According to
the Coulomb law, the charges should attract each other, pulling each plate toward the field region, so the
Maxwell-tensor result gives the correct direction of the force. Now the force’s magnitude given by Eq.
(239) may be verified either by the direct integration of the Coulomb law or by the following simple
reasoning. In the plane capacitor, the inner field E_z = σ/ε0 is equally contributed by two surface charges;
hence the field created by the negative charge of the counterpart plate (not shown in Fig. 18) is E_– = –σ/2ε0,
and the force it exerts on the elementary surface charge dQ = σdA of the positively charged plate
is dF_surface = E_–dQ = –σ²dA/2ε0 = –ε0E²dA/2, in accordance with Eq. (239).79
Quantitatively, even for such a high electric field as E = 10^5 V/m (close to the electric
breakdown’s threshold in the air at a frequency of 10 GHz80), the “negative pressure” (dF/dA) given by
Eq. (239) is of the order of 0.05 Pa (N/m²), i.e. many orders of magnitude below the ambient atmospheric
pressure of 1 bar ≈ 10^5 Pa. Still, this negative pressure may be substantial (well above 1 bar) in some cases, for
example in good dielectrics (such as the high-quality SiO2 grown at high temperature, which is broadly
used in integrated circuits), which can withstand electric fields up to ~10^9 V/m.
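These estimates may be reproduced by a few lines of code. The sketch below evaluates the magnitude of the “negative pressure” of Eq. (239), optionally with ε0 replaced by ε = ε_r ε0 for a linear dielectric. The relative permittivity ε_r ≈ 3.9 used for SiO2 is an assumed typical value that is not given in the text, and the function name is hypothetical.

```python
EPS0 = 8.8541878128e-12   # F/m
BAR = 1e5                 # Pa

def electric_pull_pressure(E, eps_r=1.0):
    """Magnitude of the 'negative pressure' eps_r * eps0 * E^2 / 2, in Pa."""
    return 0.5 * eps_r * EPS0 * E**2

# Air near the microwave breakdown threshold, E = 1e5 V/m.
p_air = electric_pull_pressure(1e5)
# SiO2 near its largest sustainable field, E ~ 1e9 V/m; eps_r = 3.9 is an assumption.
p_sio2 = electric_pull_pressure(1e9, eps_r=3.9)

print(f"air:  {p_air:.3g} Pa = {p_air / BAR:.3g} bar")
print(f"SiO2: {p_sio2:.3g} Pa = {p_sio2 / BAR:.3g} bar")
```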
(ii) Static magnetic field’s effect on its source81 – say a solenoid’s wall or a superconductor’s
surface (Fig. 19). With the Cartesian coordinates’ choice shown in that figure, we have B_x = B, B_y = B_z = 0,
so the Maxwell stress tensor (228) is diagonal again:
$$\tau^{(M)}_{xx} = \frac{1}{2\mu_0}B^2, \quad \tau^{(M)}_{yy} = \tau^{(M)}_{zz} = -\frac{1}{2\mu_0}B^2. \tag{9.240}$$
However, since for this geometry, only dA_z differs from 0 in Eq. (236), the sign of the resulting force is
opposite to that in electrostatics: dF_z ≤ 0, and the force exerted by the magnetic field upon the
conductor’s surface,
[Magnetic field’s push]
$$dF_{\rm surface} = -dF_z = \frac{1}{2\mu_0}B^2\,dA, \tag{9.241}$$
79 By the way, repeating these arguments for a plane capacitor filled with a linear dielectric, we may
readily see that Eq. (239) may be generalized for this case by replacing ε0 with ε. A similar replacement
(μ0 → μ) is valid for Eq. (241) in a linear magnetic medium.
80 Note that the breakdown field E_t is a strong function of frequency. In the ambient air, it drops from its dc
value of ~3×10^6 V/m to ~1.5×10^5 V/m at microwave frequencies and then rises to as much as ~6×10^9 V/m at
optical frequencies. The reason for the rise is that at very high frequencies, the amplitude of the field-induced
oscillations of the rare free electrons becomes much smaller than their mean free path, inhibiting the bulk impact-
ionization of neutral atoms. (For this reason, E_t also depends on the air’s pressure.)
81 The causal relation is not important here. Especially in the case of a superconductor, the magnetic field may be
induced by another source, with the surface supercurrent j just shielding the superconductor’s bulk from its
penetration – see Sec. 6.
corresponds to positive pressure. For good laboratory magnets (B ~ 10 T), this pressure is of the order of
4×10^7 Pa ≈ 400 bars, i.e. is very substantial, so the magnets require solid mechanical design.
[Fig. 9.19. The magnetostatic field near a current-carrying surface.]
The direction of the force (241) could be also readily predicted using elementary magnetostatics
arguments. Indeed, we can imagine the magnetic field volume limited by another, parallel wall with the
opposite direction of surface current. According to the starting point of magnetostatics, Eq. (5.1), such
surface currents of opposite directions have to repel each other – doing that via the magnetic field.
Another explanation of the fundamental sign difference between the electric and magnetic field
pressures may be provided using the electric circuit language. As we know from Chapter 2, the potential
energy of the electric field stored in a capacitor may be represented in two equivalent forms,
$$U_e = \frac{CV^2}{2} = \frac{Q^2}{2C}. \tag{9.242}$$
Similarly, the magnetic field energy of an inductive coil is
$$U_m = \frac{LI^2}{2} = \frac{\Phi^2}{2L}. \tag{9.243}$$
If we do not want to consider the work of external sources at a virtual change of the system dimensions,
we should use the last forms of these relations, i.e. consider a galvanically detached capacitor (Q =
const) and an externally-shorted inductance (Φ = const).82 Now if we let the electric field forces (239)
drag the capacitor’s plates in the direction they “want”, i.e. toward each other, this would lead to a
reduction of the capacitor thickness, and hence to an increase of its capacitance C, and hence to a
decrease of Ue. Similarly, for a solenoid, allowing the positive pressure (241) to move its walls from
each other would lead to an increase of the solenoid’s volume, and hence of its inductance L, so the
potential energy Um would be also reduced – as it should be. It is remarkable (actually, beautiful!) how
the local field formulas (239) and (241) “know” about these global circumstances.
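This circuit-language argument may be checked by a simple virtual-work estimate. The sketch below assumes a plane capacitor with C = ε0A/d at fixed charge Q, and a long solenoid with L = μ0N²A/l at fixed total flux (flux linkage) NBA. Both of these textbook formulas, the finite-difference derivative, and all names and numbers are assumptions of this illustration. The pressure is calculated as –∂U/∂V, where V is the volume occupied by the field, so a negative value means a pull and a positive value means a push.

```python
EPS0 = 8.8541878128e-12   # F/m
MU0 = 1.25663706212e-6    # H/m

def capacitor_energy(d, Q, area):
    """U_e = Q^2 / 2C for a plane capacitor with C = eps0 * area / d."""
    return Q**2 * d / (2.0 * EPS0 * area)

def solenoid_energy(area, flux_linkage, n_turns, length):
    """U_m = Phi^2 / 2L for a long solenoid with L = mu0 * N^2 * area / length."""
    inductance = MU0 * n_turns**2 * area / length
    return flux_linkage**2 / (2.0 * inductance)

def derivative(f, x, rel_step=1e-6):
    """Central finite-difference estimate of df/dx."""
    h = rel_step * x
    return (f(x + h) - f(x - h)) / (2.0 * h)

def capacitor_pressure(E, area=1e-2, d=1e-3):
    """-dU_e/dV at fixed Q, with the field volume V = area * d."""
    Q = EPS0 * E * area
    dU_dd = derivative(lambda x: capacitor_energy(x, Q, area), d)
    return -dU_dd / area

def solenoid_pressure(B, area=1e-2, n_turns=1000, length=1.0):
    """-dU_m/dV at fixed flux linkage, with the field volume V = area * length."""
    flux_linkage = n_turns * B * area
    dU_dA = derivative(lambda a: solenoid_energy(a, flux_linkage, n_turns, length), area)
    return -dU_dA / length

print("capacitor, E = 1e5 V/m:", capacitor_pressure(1e5), "Pa")
print("solenoid,  B = 10 T:   ", solenoid_pressure(10.0), "Pa")
```

The two results should agree, respectively, with –ε0E²/2 of Eq. (239) and +B²/2μ0 of Eq. (241), regardless of the (arbitrary) geometric parameters.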
Finally, let us see whether the major results (237) and (241) obtained in this section, match each
other. For that, let us return to the normal incidence of a plane, monochromatic wave from the free space
upon the plane surface of a perfect conductor (see, e.g., Fig. 7.8 and its discussion), and use those results
to calculate the time average of the pressure dFsurface/dA imposed by the wave on the surface. At elastic
reflection from the conductor’s surface, the electromagnetic field’s momentum retains its amplitude but
reverses its sign, so the average momentum transferred to a unit area of the surface in a unit time (i.e.
the average pressure) is
82 Of course, this condition may hold “forever” only for solenoids with superconducting wiring, but even in
normal-metal solenoids with practicable inductances, the flux relaxation constants L/R may be rather large
(practically, up to a few minutes), quite sufficient to carry out the force measurement.
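$$\overline{\frac{dF_{\rm surface}}{dA}} = 2c\,\overline{g}_{\rm incident} = 2c\,\frac{\overline{S}_{\rm incident}}{c^2} = \frac{2c}{c^2}\,\frac{E_\omega H_\omega^*}{2} = \frac{E_\omega H_\omega^*}{c}, \tag{9.244}$$
where E_ω and H_ω are complex amplitudes of the incident wave. Using the relation (7.7) between these
amplitudes (for ε = ε0 and μ = μ0 giving E_ω = cB_ω), we get
$$\overline{\frac{dF_{\rm surface}}{dA}} = \frac{1}{c}\,cB_\omega\frac{B_\omega^*}{\mu_0} = \frac{\left|B_\omega\right|^2}{\mu_0}. \tag{9.245}$$
On the other hand, as was discussed in Sec. 7.3, at the surface of a perfect mirror the electric
field vanishes while the magnetic field doubles, so we can use Eq. (241) with B → B(t) = 2Re[B_ω exp{–iωt}].
Averaging the pressure given by Eq. (241) over time, we get
$$\overline{\frac{dF_{\rm surface}}{dA}} = \frac{1}{2\mu_0}\overline{\left(2\,{\rm Re}\left[B_\omega e^{-i\omega t}\right]\right)^2} = \frac{\left|B_\omega\right|^2}{\mu_0}, \tag{9.246}$$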
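The agreement of Eqs. (245) and (246) may also be confirmed by direct time averaging. In the following sketch, the first function evaluates the momentum-transfer result |B_ω|²/μ0, while the second one averages the magnetic pressure (241) of the doubled surface field over one period of the wave. The complex amplitude, the number of time samples, and the function names are arbitrary choices made for this illustration.

```python
import numpy as np

MU0 = 1.25663706212e-6   # H/m

def pressure_from_momentum(B_amp):
    """Average pressure from the field momentum transfer, |B_w|^2 / mu0."""
    return abs(B_amp)**2 / MU0

def pressure_from_stress(B_amp, n_samples=10000):
    """Time average of B(t)^2 / (2 mu0), with B(t) = 2 Re[B_w exp(-i w t)]."""
    phase = 2.0 * np.pi * np.arange(n_samples) / n_samples   # w*t over one period
    B_surface = 2.0 * np.real(B_amp * np.exp(-1j * phase))
    return np.mean(B_surface**2) / (2.0 * MU0)

B_amp = (3.0 - 4.0j) * 1e-6   # T, arbitrary complex amplitude of the incident wave
print(pressure_from_momentum(B_amp), pressure_from_stress(B_amp))
```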
9.1. Use the pre-relativistic picture of light propagation with velocity c in a Sun-bound aether to
derive Eq. (4).
9.2. Show that two successive Lorentz space/time transforms, with velocities u’ and v in the
same direction, are equivalent to the single transform with the velocity u given by Eq. (25).
9.3. N + 1 reference frames numbered by index n (taking values 0, 1, …, N) move in the same
direction as a particle. Express the particle’s velocity in the frame number 0 via its velocity uN in the
frame number N and the velocities vn of the frame number n relative to the frame number (n – 1).
9.4. A spaceship moving with a constant velocity v directly from the Earth sends back brief
flashes of light with a period t_s – as measured by the spaceship's clock. Calculate the period with which
an Earth-based observer may receive these signals – as measured by their clock.
9.5. From the point of view of observers in a reference frame 0', a straight thin rod, parallel to
the x'-axis, is moving without rotation with a constant velocity u' directed along the y'-axis. The
reference frame 0' is itself moving relative to another ("lab") reference frame 0 with a constant velocity
v along the x-axis, also without rotation – see the figure on the right.
[Figure: the lab frame 0 with axes x and y, and the frame 0' with axes x' and y' moving with velocity v; the rod's velocity u' is shown.]
Calculate:
(i) the direction of the rod's velocity, and
(ii) the orientation of the rod on the [x, y] plane,
– both as observed from the lab reference frame. Is the velocity, in this frame, perpendicular to the rod?
9.6. Starting from rest at t = 0, a spaceship moves directly from the Earth, with a constant
acceleration as measured in its instantaneous rest frame. Find its displacement x(t) from the Earth, as
measured from the Earth’s reference frame, and interpret the result.
Hint: The instantaneous rest frame of a moving particle is the inertial reference frame that, at the
considered moment of time, has the same velocity as the particle.
9.7. Analyze the twin paradox for the simplest case of 1D travel with a piecewise-constant
acceleration.
Hint: You may use an intermediate result of the solution of the previous problem.
9.8. Suggest a natural definition of the 4-vector of acceleration (commonly called the 4-acceleration)
of a point and calculate its components for a relativistic point moving with velocity u =
u(t).
9.9. Calculate the first relativistic correction to the frequency of a harmonic oscillator as a
function of its amplitude.
9.10. An atom with an initial rest mass m has been excited to an internal state with an additional
energy E, while still being at rest. Next, it returns to its initial state, emitting a photon. Calculate the
photon’s frequency, taking into account the relativistic recoil of the atom.
Hint: In this problem, and also in Problems 13-15 below, you may treat photons as classical
ultra-relativistic point particles with zero rest mass, energy E = ħω, and momentum p = ħk.
9.11. A particle of mass m, initially at rest, decays into two particles with rest masses m1 and m2.
Calculate the total energy of the first product particle, in the c.o.m. reference frame.
9.12. A relativistic particle with a rest mass m, moving with velocity u, decays into two particles
with zero rest mass.
(i) Calculate the smallest possible angle between the decay product velocities (in the lab frame,
in which the velocity u is measured).
(ii) What is the largest possible energy of one product particle?
9.13. A relativistic particle flying in free space with velocity u decays into two photons.83
Calculate the angular dependence of the photon detection probability, as measured in the lab frame.
9.14. A photon with wavelength λ is scattered by an electron, initially at rest. Calculate the
wavelength λ’ of the scattered photon as a function of the scattering angle – see the figure on the
right.84
[Figure: photon scattering by an electron; labels me, p, λ’.]
9.15. Calculate the threshold energy of a γ-photon for the reaction
$$\gamma + p \rightarrow p + \pi^0,$$
if the proton was initially at rest.
Hint: For protons, $m_p c^2 \approx 938$ MeV, while for neutral pions, $m_\pi c^2 \approx 135$ MeV.
9.16. Calculate the largest possible velocity of the electrons emitted by (initially, resting)
neutrons at their β-decays:
$$n \rightarrow p + e^{-} + \bar{\nu}_e.$$
Hint: Electron neutrinos $\nu_e$ and antineutrinos $\bar{\nu}_e$ are virtually massless (on the energy scale of
this problem); the rest energies E = mc² of the other involved particles are as follows: 939.565 MeV for
the neutron, 938.272 MeV for the proton, 0.511 MeV for the electron.
9.17. A relativistic particle with a rest mass m and an energy E collides with a similar particle,
initially at rest in the laboratory reference frame. Calculate:
(i) the final velocity of the center of mass of the system, in the lab frame,
(ii) the total energy of the system, in the center-of-mass frame, and
(iii) the final velocities of both particles (in the lab frame), provided that they move along the
same direction.
9.18. A “primed” reference frame moves, relative to the “lab” frame, with a reduced velocity
β ≡ v/c = βnx. Use Eq. (109) to express the elements T’00 and T’0j (with j = 1, 2, 3) of an arbitrary
contravariant 4-tensor T via its elements in the lab frame.
9.20. Consider the situation when static fields E and B are uniform but arbitrary (both in
magnitude and in direction). What should be the velocity of an inertial reference frame to have the
vectors E’ and B’, observed from that frame, parallel? Is this solution unique?
9.21. Two charged particles moving with equal constant velocities u are offset by a constant
vector R = {a, b} (see the figure on the right), as measured in the lab frame. Calculate the force of
interaction between the particles – also in the lab frame.
[Figure: charges q1 and q2, each moving with velocity u; the labels a and b mark the components of their offset.]
84 This is the famous Compton scattering effect, whose discovery in 1923 was one of the major motivations for
the development of quantum mechanics – see, e.g., QM Sec. 1.1.
9.22. Each of two thin, long, parallel particle beams of the same velocity u, separated by distance
d, carries electric charges with a constant density per unit length, as measured in the reference frame
moving with the particles.
(i) Calculate the distribution of the electric and magnetic fields in the system (outside the
beams), as measured in the lab reference frame.
(ii) Calculate the interaction force between the beams (per particle) and the resulting
acceleration, both in the lab reference frame and in the frame moving with the particles.
(iii) Compare the results and give a brief discussion of their relation.
9.23.
(i) Spell out the Lorentz transform of the Cartesian components of the scalar potential and the
vector potential of an arbitrary electromagnetic field.
(ii) Use this general result to calculate the potentials of the field created by a point charge q
moving with a constant velocity u, as measured in the lab reference frame.
9.24. Calculate the scalar and vector potentials created by a time-independent electric dipole p,
as measured in a reference frame that moves relative to the dipole with a constant velocity v, with the
shortest distance (“impact parameter”) equal to b.
9.25. Solve the previous problem, in the limit v << c, for a time-independent magnetic dipole m.
9.26. Review the solution of Problem 23 (on the hypothetical magnetic monopole passing
through a superconducting ring) for the case when this particle moves with an arbitrary constant
velocity.
9.27. Re-derive Eq. (161) for the simplest case p(0) = 0, by using the 4-vector form (145) of the
equation of motion and the notion of rapidity, tanh⁻¹β, that was briefly discussed in Sec. 2.
9.28.* Calculate the trajectory of a relativistic particle in a uniform electrostatic field E, for an
arbitrary direction of its initial velocity u(0), by using two different ways – at least one of them different
from the approach described in Sec. 6 for the case u(0) ⊥ E.
9.29. A charged relativistic particle with the rest mass m performs planar cyclotron rotation, with
velocity u, in a uniform external magnetic field of magnitude B. How much would the velocity and the
orbit’s radius change at a slow change of the field to a new magnitude B' ?
9.30.* Analyze the motion of a relativistic particle in uniform, mutually perpendicular fields E
and B, for the particular case when E is exactly equal to cB.
9.31. Find the law of motion of a relativistic particle in uniform static fields E and B parallel to
each other.
9.32. An external Lorentz force F is exerted on a relativistic particle with an electric charge q
and a rest mass m, moving with velocity u, as observed from some inertial “lab” frame. Calculate its
acceleration as observed from that frame.
9.33. Neglecting relativistic kinetic effects, calculate the lowest voltage V that has to be applied
between the anode and cathode of a magnetron (see Fig. 13 and its discussion) to enable electrons to
reach the anode, at negligible electron-electron interactions (including the space-charge effects) and
collisions with the residual gas molecules. You may:
(i) model the cathode and anode as two coaxial round cylinders, of radii R1 and R2, respectively;
(ii) assume that the magnetic field B is uniform and directed along their common axis; and
(iii) neglect the initial velocity of the electrons emitted by the cathode.
After the solution, estimate the validity of the last assumption and of the non-relativistic approximation,
for reasonable values of parameters.
9.34. A charged relativistic particle has been injected into a region with a uniform electric field
whose magnitude oscillates in time with frequency ω. Calculate the time dependence of the particle’s
velocity, as observed from the lab reference frame.
9.36. Analyze the motion of a non-relativistic particle in a region where the electric and
magnetic fields are both uniform and constant in time, but not necessarily parallel or perpendicular to
each other.
9.37. A static distribution of electric charge in otherwise free space has created a time-
independent distribution E(r) of the electric field. Use two different approaches to express the field
energy density u’ and the Poynting vector S’, as observed from a reference frame moving with a
constant velocity v, via the Cartesian components of the vector E. In particular, is S’ equal to (–vu’)?
9.38. A traveling plane wave of frequency ω and intensity S is normally incident on a perfect
mirror moving with velocity v in the same direction as the wave.
(i) Calculate the reflected wave’s frequency, and
(ii) use the Lorentz transform of the fields to calculate the reflected wave’s intensity
– both as observed from the lab reference frame.
9.39. Perform the second task of the previous problem by using general relations between the
wave’s energy, power, and momentum.
Hint: As a byproduct, this approach should also give you the pressure exerted by the wave on the
moving mirror.
9.41. Consider an electromagnetic plane wave packet propagating in free space, with its electric
field represented as the Fourier integral
$$\mathbf{E}(\mathbf{r}, t) = \mathrm{Re}\int_{-\infty}^{+\infty} \mathbf{E}_k\, e^{i\psi_k}\, dk, \qquad \text{with } \psi_k \equiv kz - \omega_k t, \text{ and } \omega_k \equiv c|k|.$$
Express the full linear momentum (per unit area of wave’s front) of the packet via the complex
amplitudes Ek of its Fourier components. Does the momentum depend on time? (In contrast with
Problem 7.8, the wave packet is not necessarily narrow.)
9.42. Calculate the forces exerted on well-conducting walls of a waveguide with a rectangular
(a×b) cross-section, by a wave propagating along it in the fundamental (H10) mode. Give an
interpretation of the results.
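$$\phi(\mathbf{r}, t) = \frac{1}{4\pi\varepsilon_0}\int \frac{\rho(\mathbf{r}'', t - R/c)}{R}\, d^3r'', \qquad \mathbf{A}(\mathbf{r}, t) = \frac{\mu_0}{4\pi}\int \frac{\mathbf{j}(\mathbf{r}'', t - R/c)}{R}\, d^3r'', \qquad \text{with } \mathbf{R} \equiv \mathbf{r} - \mathbf{r}''. \tag{10.1a}$$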
As a reminder, Eqs. (1a) were derived from the Maxwell equations without any restrictions, and are very
natural for situations with continuous distributions of the electric charge and/or current. However, for a
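single charged particle, whose charge and current distributions may be described as
$$\rho(\mathbf{r}'', t) = q\,\delta(\mathbf{r}'' - \mathbf{r}'), \qquad \mathbf{j}(\mathbf{r}'', t) = q\,\mathbf{u}\,\delta(\mathbf{r}'' - \mathbf{r}'), \qquad \text{with } \mathbf{u} = \frac{d\mathbf{r}'}{dt}, \tag{10.1b}$$
where r’ = r’(t) is the instantaneous position of the charge, it is more convenient to recast Eqs. (1a) into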
an explicit form that would not require integration in each particular case. Indeed, as Eqs. (1) show, the
potentials at a given observation point {r, t} are contributed by only one specific point {r’(tret), tret} of
the particle’s 4D trajectory (called its world line), which satisfies the following condition:
$$t_{\rm ret} = t - \frac{R_{\rm ret}}{c}, \tag{10.2}$$
where tret is called the retarded time, and Rret is the length of the following distance vector
$$\mathbf{R}_{\rm ret} \equiv \mathbf{r}(t) - \mathbf{r}'(t_{\rm ret}) \tag{10.3}$$
– physically, the distance covered by the electromagnetic wave from its emission to observation.
The reduction of Eqs. (1a) to such a simpler form, however, requires some care. Their naïve
integration over r” would yield the following apparent but wrong results:
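$$\phi(\mathbf{r}, t) = \frac{1}{4\pi\varepsilon_0}\frac{q}{R_{\rm ret}}, \quad \text{i.e. } \frac{\phi(\mathbf{r}, t)}{c} = \frac{\mu_0}{4\pi}\frac{qc}{R_{\rm ret}}; \qquad \mathbf{A}(\mathbf{r}, t) = \frac{\mu_0}{4\pi}\frac{q\mathbf{u}_{\rm ret}}{R_{\rm ret}}, \qquad \text{(WRONG!)} \tag{10.4}$$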
where uret is the particle’s velocity at the retarded point r’(tret). Eqs. (4) are a good example of how the
relativity theory (even the special one :-) cannot be taken too lightly. Indeed, the strings (9.84)-(9.85),
formed from the apparent potentials (4), would not obey the Lorentz transform rule (9.91), because
according to Eqs. (2)-(3), the distance Rret also depends on the reference frame it is measured in.
In order to correct the error, we need, first of all, to discuss the conditions (2)-(3). Combining
them by eliminating Rret, we get the following equation for tret:
Retarded time:
$$c(t - t_{\rm ret}) = \left|\mathbf{r}(t) - \mathbf{r}'(t_{\rm ret})\right|. \tag{10.5}$$
Figure 1 depicts the graphical solution of this self-consistency equation as the only1 point of intersection
of the light cone of the observation point (see Fig. 9.9 and its discussion) and the particle’s world line.
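As a numerical illustration of this self-consistency condition, here is a minimal Python sketch that solves Eq. (5) for tret by bisection. The function name `retarded_time`, the default units with c = 1, and the fixed iteration count are choices made for this example only; the sketch assumes that the particle's speed always stays below c, so that the retarded solution is unique.

```python
import math

def retarded_time(r, t, trajectory, c=1.0, n_iter=100):
    """Solve c*(t - t_ret) = |r - r'(t_ret)| for t_ret <= t by bisection.

    `trajectory(s)` must return the particle's position (x, y, z) at time s.
    The particle's speed is assumed to be below c at all times, so that
    g(s) below decreases monotonically and has exactly one root at s <= t.
    """
    def g(s):
        return c * (t - s) - math.dist(r, trajectory(s))

    # g(t) <= 0; step back into the past until g becomes non-negative.
    span = math.dist(r, trajectory(t)) / c or 1.0
    while g(t - span) < 0.0:
        span *= 2.0
    lo, hi = t - span, t
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:      # mid is earlier than the root
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `retarded_time((0.0, 3.0, 0.0), 2.0, lambda s: (0.6 * s, 0.0, 0.0))` treats a charge moving along the x-axis with u = 0.6c, observed at distance b = 3 from that axis.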
In Eq. (5), just as in Eqs. (1)-(3), all variables have to be measured in the inertial (“lab”)
reference frame in which the observation point r rests. Now let us write Eqs. (1) for a point charge in
another inertial frame, the frame 0’, whose velocity (as measured in the lab frame) coincides, at the
moment t’ = tret, with the velocity uret of the charge.2 In that frame, the charge rests, so, as we know
from the electro- and magnetostatics,
$$\phi' = \frac{q}{4\pi\varepsilon_0 R'}, \qquad \mathbf{A}' = 0. \tag{10.6a}$$
(Remember that this R’ may not be equal to Rret, because the latter distance is measured in the “lab”
reference frame.) Let us use the identity 1/ε0 ≡ μ0c² again to rewrite Eqs. (6a) in the form of components
of a 4-vector similar in structure to the last two of Eqs. (4):
$$\frac{\phi'}{c} = \frac{\mu_0}{4\pi}\frac{qc}{R'}, \qquad \mathbf{A}' = 0. \tag{10.6b}$$
Now it is easy to guess the correct answer for the 4-potential for an arbitrary reference frame:
1 As Fig. 1 shows, there is always another, “advanced” point {r’(tadv), tadv} of the particle’s world line, with tadv >
t, which is also a solution of Eq. (5), but it does not fit Eqs. (1), because the observation, at the point {r, t < tadv},
of the field induced at the advanced point, would violate the causality principle.
2 This is just a particular case of the instantaneous reference frame – the notion that was encountered in several
exercise problems of the previous chapter, and indeed was implied (though admittedly not sufficiently advertised)
at the derivation of the key Eq. (9.60).
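$$A^\alpha = \frac{\mu_0}{4\pi}\frac{qc\,u^\alpha}{u_\beta R^\beta}, \tag{10.7}$$
where (as a reminder) $A^\alpha \equiv \{\phi/c, \mathbf{A}\}$, $u^\alpha \equiv \gamma\{c, \mathbf{u}\}$, and $R^\alpha$ is the 4-vector of the inter-event distance,
formed similarly to that of a single event – cf. Eq. (9.48):
$$R^\alpha \equiv \{c(t - t'), \mathbf{R}\} \equiv \{c(t - t'), \mathbf{r} - \mathbf{r}'\}, \tag{10.8}$$
so the scalar product in the denominator of Eq. (7) is
$$u_\beta R^\beta = \gamma\{c, -\mathbf{u}\}\cdot\{c(t - t'), \mathbf{R}\} \equiv \gamma\left[c^2(t - t') - \mathbf{u}\cdot\mathbf{R}\right] = \gamma c\,(R - \boldsymbol{\beta}\cdot\mathbf{R}) \equiv \gamma cR\,(1 - \boldsymbol{\beta}\cdot\mathbf{n}), \tag{10.9}$$
where n ≡ R/R is a unit vector in the observer’s direction, β ≡ u/c is the normalized velocity of the
particle, and γ ≡ 1/(1 – u²/c²)^(1/2). In the instantaneous reference frame of the charge (in which β = 0 and
γ = 1), the expression (9) is reduced to cR, so Eq. (7) is correctly reduced to Eq. (6b). Now let us spell out
the components of Eq. (7) for the lab frame (in which t’ = tret and R = Rret):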
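Liénard-Wiechert potentials:
$$\phi(\mathbf{r}, t) = \frac{1}{4\pi\varepsilon_0}\, q\left[\frac{1}{R - \boldsymbol{\beta}\cdot\mathbf{R}}\right]_{\rm ret} = \frac{1}{4\pi\varepsilon_0}\, q\left[\frac{1}{R(1 - \boldsymbol{\beta}\cdot\mathbf{n})}\right]_{\rm ret}, \tag{10.10a}$$
$$\mathbf{A}(\mathbf{r}, t) = \frac{\mu_0}{4\pi}\, q\left[\frac{\mathbf{u}}{R - \boldsymbol{\beta}\cdot\mathbf{R}}\right]_{\rm ret} = \frac{\mu_0}{4\pi}\, qc\left[\frac{\boldsymbol{\beta}}{R(1 - \boldsymbol{\beta}\cdot\mathbf{n})}\right]_{\rm ret} \equiv \phi(\mathbf{r}, t)\,\frac{\mathbf{u}_{\rm ret}}{c^2}. \tag{10.10b}$$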
These formulas are called the Liénard-Wiechert potentials.3 In the non-relativistic limit, they
coincide with the naïve guess (4), but in the general case include the additional factor 1/(1 – β·n)ret. Its
physical origin may be illuminated by one more formal calculation – whose result we will need anyway.
Let us differentiate the geometric relation (5), rewritten as
$$R_{\rm ret} = c(t - t_{\rm ret}), \tag{10.11}$$
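over tret and then, independently, over t, assuming that r is fixed. For that, let us first differentiate, over
tret, both sides of the identity $R_{\rm ret}^2 = \mathbf{R}_{\rm ret}\cdot\mathbf{R}_{\rm ret}$:
$$2R_{\rm ret}\frac{\partial R_{\rm ret}}{\partial t_{\rm ret}} = 2\mathbf{R}_{\rm ret}\cdot\frac{\partial \mathbf{R}_{\rm ret}}{\partial t_{\rm ret}}. \tag{10.12}$$
If r is fixed, then $\partial\mathbf{R}_{\rm ret}/\partial t_{\rm ret} \equiv \partial(\mathbf{r} - \mathbf{r}')/\partial t_{\rm ret} = -\partial\mathbf{r}'/\partial t_{\rm ret} \equiv -\mathbf{u}_{\rm ret}$, and Eq. (12) yields
$$\frac{\partial R_{\rm ret}}{\partial t_{\rm ret}} = \frac{\mathbf{R}_{\rm ret}}{R_{\rm ret}}\cdot\frac{\partial \mathbf{R}_{\rm ret}}{\partial t_{\rm ret}} = -(\mathbf{n}\cdot\mathbf{u})_{\rm ret}. \tag{10.13}$$
Now let us differentiate the same Rret over t. On one hand, Eq. (11) yields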
3 They were derived in 1898 by Alfred-Marie Liénard and (independently) in 1900 by Emil Wiechert.
$$\frac{\partial R_{\rm ret}}{\partial t} = c - c\,\frac{\partial t_{\rm ret}}{\partial t}. \tag{10.14}$$
On the other hand, according to Eq. (5), at the partial differentiation over time, i.e. if r is fixed, tret is a
function of t alone, so (using Eq. (13) at the second step), we may write
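$$\frac{\partial R_{\rm ret}}{\partial t} = \frac{\partial R_{\rm ret}}{\partial t_{\rm ret}}\frac{\partial t_{\rm ret}}{\partial t} = -(\mathbf{n}\cdot\mathbf{u})_{\rm ret}\frac{\partial t_{\rm ret}}{\partial t}. \tag{10.15}$$
Now requiring Eqs. (14) and (15) to give the same result, we get:4
$$\frac{\partial t_{\rm ret}}{\partial t} = \left[\frac{c}{c - \mathbf{n}\cdot\mathbf{u}}\right]_{\rm ret} = \left[\frac{1}{1 - \boldsymbol{\beta}\cdot\mathbf{n}}\right]_{\rm ret}. \tag{10.16}$$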
This important relation may be readily re-derived (and more clearly understood) for the
particular case when the charge’s velocity is directed straight toward the observation point. In this case,
its vector u resides in the same space-time plane as the observation point’s world line r = const – say,
the plane [x, t] shown in Fig. 2.
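Fig. 10.2. Deriving Eq. (16) for the case when the vector β is directed along n. (On the [x, t] plane, the
figure marks the intervals dtret and dxret = uretdtret, the light-propagation delay dxret/c, and the resulting
observation interval dt = dtret – dxret/c.)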
Let us consider an elementary time interval dtret ≡ dt’, during which the particle would travel the
space interval dxret = uretdtret. In Fig. 2, the corresponding segment of its world line is shown with a solid
vector. The dotted vectors in this figure show the world lines of the radiation emitted by the particle in
the beginning and at the end of this interval, and propagating with the speed of light c. As it follows
from the drawing, the time interval dt between the instants of the arrival of the radiation from these two
points to any time-independent spatial point of observation is
$$dt = dt_{\rm ret} - \frac{dx_{\rm ret}}{c} \equiv dt_{\rm ret} - \frac{u_{\rm ret}}{c}\,dt_{\rm ret}, \qquad \text{so that } \frac{dt_{\rm ret}}{dt} = \frac{1}{1 - u_{\rm ret}/c} \equiv \frac{1}{1 - \beta_{\rm ret}}. \tag{10.17}$$
This expression coincides with Eq. (16) for our particular case when the directions of the vectors β ≡ u/c
and n ≡ R/R (both taken at time tret) coincide, and hence (β·n)ret = βret. The difference between Eqs. (16)
and (17) may be interpreted by saying that the particle’s velocity in the transverse directions (normal to
the vector n) is not important for this kinematic effect5 – the fact almost evident from Fig. 1.
4 This relation may be used for an alternative derivation of Eqs. (10) directly from Eqs (1) – the calculation left
for the reader’s exercise.
5 Note that this effect (linear in β) has nothing to do with the Lorentz time dilation (9.21), which is quadratic in β.
(Indeed, all our arguments above referred to the same, lab frame.) Rather, it is close in nature to the Doppler
effect.
So, the additional factor in the Liénard-Wiechert potentials is just the derivative ∂tret/∂t. The
reason for its appearance in Eqs. (10) is usually interpreted along the following lines. Let the charge q
be spread along the direction of the vector Rret (in Fig. 2, along the x-axis) by an infinitesimal
speed-independent interval δxret, so the linear density λ of its charge is proportional to 1/δxret. Then the time
rate of charge’s arrival at some spatial point is λuret = λdxret/dtret, i.e. scales as 1/dtret. However, the rate
of radiation’s arrival at the observation point scales as 1/dt, so due to the non-zero velocity uret of the
particle, this rate differs from the charge arrival rate by the factor of dtret/dt, given by Eq. (16). (If the
particle moves toward the observation point, (β·n)ret > 0, as shown in Fig. 2, this factor is larger than 1.)
This radiation compression effect leads to the field change (at (β·n)ret > 0, its enhancement) by the same
factor (16) – as described by Eqs. (10).
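Equation (16) is easy to check numerically for a motion that is not directed toward the observer. The sketch below is only an illustration under assumed parameters: a charge on a circular orbit with β = 0.8, units with c = 1, and helper names (`t_ret`, `compression_factor`, `numerical_derivative`) invented for this example. It compares a finite-difference estimate of ∂tret/∂t with the factor 1/(1 – β·n)ret.

```python
import math

A, OMEGA = 1.0, 0.8   # assumed orbit radius and angular frequency: beta = 0.8 at c = 1

def traj(s):
    """Position of the charge on its circular orbit at time s."""
    return (A * math.cos(OMEGA * s), A * math.sin(OMEGA * s), 0.0)

def vel(s):
    """Velocity of the charge at time s."""
    return (-A * OMEGA * math.sin(OMEGA * s), A * OMEGA * math.cos(OMEGA * s), 0.0)

def t_ret(r, t, c=1.0):
    """Bisection solution of Eq. (5), c*(t - t_ret) = |r - r'(t_ret)|."""
    g = lambda s: c * (t - s) - math.dist(r, traj(s))
    span = 1.0
    while g(t - span) < 0.0:          # widen the bracket into the past
        span *= 2.0
    lo, hi = t - span, t
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

def compression_factor(r, t, c=1.0):
    """Right-hand side of Eq. (16): 1/(1 - beta.n), taken at the retarded time."""
    s = t_ret(r, t, c)
    rp, u = traj(s), vel(s)
    R = [ri - pi for ri, pi in zip(r, rp)]
    R_abs = math.sqrt(sum(x * x for x in R))
    beta_dot_n = sum(ui * Ri for ui, Ri in zip(u, R)) / (c * R_abs)
    return 1.0 / (1.0 - beta_dot_n)

def numerical_derivative(r, t, h=1e-5, c=1.0):
    """Central-difference estimate of d(t_ret)/dt at a fixed observation point r."""
    return (t_ret(r, t + h, c) - t_ret(r, t - h, c)) / (2.0 * h)
```

Calling both functions for the same observation point, e.g. `r = (5.0, 0.0, 0.0)`, at several instants t should give values that agree to within the finite-difference error.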
So, the 4-vector formalism was very instrumental for the calculation of field potentials. It may be
also used to calculate the fields E and B – by plugging Eq. (7) into Eq. (9.124) to calculate the field
strength tensor. This calculation yields
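$$F^{\alpha\beta} = \frac{\mu_0 q}{4\pi}\,\frac{1}{u_\gamma R^\gamma}\,\frac{d}{d\tau}\left[\frac{R^\alpha u^\beta - R^\beta u^\alpha}{u_\delta R^\delta}\right]. \tag{10.18}$$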
Now using Eq. (9.125) to identify the elements of this tensor with the field components, we may bring
the result to the following vector form: 6
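Relativistic particle’s fields:
$$\mathbf{E} = \frac{q}{4\pi\varepsilon_0}\left[\frac{\mathbf{n} - \boldsymbol{\beta}}{\gamma^2(1 - \boldsymbol{\beta}\cdot\mathbf{n})^3 R^2} + \frac{\mathbf{n}\times\{(\mathbf{n} - \boldsymbol{\beta})\times\dot{\boldsymbol{\beta}}\}}{(1 - \boldsymbol{\beta}\cdot\mathbf{n})^3 cR}\right]_{\rm ret}, \tag{10.19}$$
$$\mathbf{B} = \frac{\mathbf{n}_{\rm ret}\times\mathbf{E}}{c}, \qquad \text{i.e. } \mathbf{H} = \frac{\mathbf{n}_{\rm ret}\times\mathbf{E}}{Z_0}. \tag{10.20}$$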
Thus the magnetic and electric fields of a relativistic particle are always proportional and
perpendicular to each other, and related just as in a plane wave – cf. Eq. (7.6), with the difference that
now the vector nret may be a function of time. Superficially, this result contradicts the electro- and
magnetostatics, because, for a particle at rest, B should vanish while E stays finite. However, note that
according to the Coulomb law for a point charge, in this case, E = Enret, so B ∝ nret×E ∝ nret×nret = 0.
(Actually, in these relations, the subscript “ret” is unnecessary.)
As a sanity check, let us use Eq. (19) as an alternative way to find the electric field of a charge
moving without acceleration, i.e. uniformly, along a straight line – see Fig. 9.11a reproduced, with
minor changes, in Fig. 3. (This calculation will also illustrate the technical challenges of practical
applications of the Liénard-Wiechert formulas for even simple cases.) In this case, the vector β does not
change in time, so the second term in Eq. (19) vanishes, and all we need to do is to spell out the
Cartesian components of the first term.
6 An alternative way of deriving these formulas (highly recommended to the reader as an exercise) is to plug Eqs.
(10) into the general relations (9.121), and carry out the required temporal and spatial differentiations directly,
using Eq. (16) and its spatial counterpart (which may be derived absolutely similarly):
$$\nabla t_{\rm ret} = -\left[\frac{\mathbf{n}}{c\,(1 - \boldsymbol{\beta}\cdot\mathbf{n})}\right]_{\rm ret}.$$
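Fig. 10.3. The linearly moving charge problem. (Shown: the observation point r at distance b from the
x-axis; the retarded position r’(tret) at x = utret and the current position r’(t) at x = ut, separated by
u(t – tret); the distance Rret = c(t – tret); the unit vector nret; and the velocity β along the x-axis.)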
Let us select the coordinate axes and the time origin as shown in Fig. 3, and make a clear
distinction between the actual position, r’ (t) = {ut, 0, 0} of the charged particle at the instant t we are
considering, and its position r’(tret) at the retarded instant defined by Eq. (5), i.e. the moment when the
particle’s field had to be radiated to reach the observation point r at the given time t, propagating with
the speed of light. In these coordinates
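$$\boldsymbol{\beta} = \{\beta, 0, 0\}, \quad \mathbf{r} = \{0, b, 0\}, \quad \mathbf{r}'(t_{\rm ret}) = \{ut_{\rm ret}, 0, 0\}, \quad \mathbf{n}_{\rm ret} = \{\cos\theta, \sin\theta, 0\}, \tag{10.21}$$
with cos θ = –utret/Rret, so [(n – β)x]ret = –utret/Rret – β, and Eq. (19) yields, in particular:
$$E_x = \frac{q}{4\pi\varepsilon_0}\left[\frac{-ut_{\rm ret}/R_{\rm ret} - \beta}{\gamma^2(1 - \boldsymbol{\beta}\cdot\mathbf{n})^3 R^2}\right]_{\rm ret} \equiv \frac{q}{4\pi\varepsilon_0}\left[\frac{-ut_{\rm ret} - \beta R_{\rm ret}}{\gamma^2(1 - \boldsymbol{\beta}\cdot\mathbf{n})^3 R^3}\right]_{\rm ret}. \tag{10.22}$$
But according to Eq. (5), the product βRret may be represented as βc(t – tret) ≡ u(t – tret). Plugging
this expression into Eq. (22), we may eliminate the explicit dependence of Ex on time tret:
$$E_x = \frac{q}{4\pi\varepsilon_0}\,\frac{-ut}{\gamma^2\left[(1 - \boldsymbol{\beta}\cdot\mathbf{n})R\right]^3_{\rm ret}}. \tag{10.23}$$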
The only non-zero transverse component of the field also has a similar form:
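$$E_y = \frac{q}{4\pi\varepsilon_0}\left[\frac{\sin\theta}{\gamma^2(1 - \boldsymbol{\beta}\cdot\mathbf{n})^3 R^2}\right]_{\rm ret} = \frac{q}{4\pi\varepsilon_0}\,\frac{b}{\gamma^2\left[(1 - \boldsymbol{\beta}\cdot\mathbf{n})R\right]^3_{\rm ret}}, \tag{10.24}$$
while Ez = 0. From Fig. 3, β·nret = β cos θ = –βutret/Rret, so (1 – β·n)Rret ≡ Rret + βutret, and we may again
use Eq. (5) to get (1 – β·n)Rret = c(t – tret) + βutret ≡ ct – ctret/γ². What remains is to calculate tret from the
self-consistency equation (5), whose square in our current case (Fig. 3) takes the form
$$R_{\rm ret}^2 \equiv b^2 + (ut_{\rm ret})^2 = c^2(t - t_{\rm ret})^2. \tag{10.25}$$
This is a simple quadratic equation for tret, which (with the appropriate negative sign before the square
root, to get tret < t) yields:
$$t_{\rm ret} = \gamma^2 t - \left[(\gamma^2 t)^2 - \gamma^2\left(t^2 - b^2/c^2\right)\right]^{1/2} \equiv \gamma^2 t - \frac{\gamma}{c}\left(u^2\gamma^2t^2 + b^2\right)^{1/2}, \tag{10.26}$$
so that
$$\left[(1 - \boldsymbol{\beta}\cdot\mathbf{n})R\right]_{\rm ret} \equiv ct - \frac{ct_{\rm ret}}{\gamma^2} = \frac{1}{\gamma}\left(u^2\gamma^2t^2 + b^2\right)^{1/2}, \tag{10.27}$$
and Eqs. (23)-(24) give
$$E_x = -\frac{q}{4\pi\varepsilon_0}\,\frac{\gamma ut}{\left(b^2 + \gamma^2u^2t^2\right)^{3/2}}, \qquad E_y = \frac{q}{4\pi\varepsilon_0}\,\frac{\gamma b}{\left(b^2 + \gamma^2u^2t^2\right)^{3/2}}, \qquad E_z = 0. \tag{10.28}$$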
But these are exactly Eqs. (9.139),7 which had been obtained in Sec. 9.5 by much simpler means,
without the necessity to solve the self-consistency equation (5). However, that alternative approach was
essentially based on the inertial motion of the particle, and cannot be used in problems in which it
moves with acceleration. In such problems, the second term in Eq. (19), dropping with distance more
slowly, as 1/Rret, and hence describing wave radiation, is frequently the most important one.
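The chain of Eqs. (19)-(28) for the uniformly moving charge may be verified numerically. The following Python sketch is an illustration only, in units with q/4πε0 = 1 and c = 1 by default; the function names are made up for this example. It evaluates the velocity term of Eq. (19) at the retarded time given by Eq. (26) and compares the result with the closed form (28).

```python
import math

def field_lienard_wiechert(t, b, beta, c=1.0):
    """(E_x, E_y) from the first term of Eq. (19), in units of q/(4 pi eps0)."""
    u = beta * c
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    # Retarded time, Eq. (26)
    t_ret = gamma**2 * t - (gamma / c) * math.sqrt((u * gamma * t)**2 + b**2)
    R = c * (t - t_ret)                   # Eq. (11)
    n = (-u * t_ret / R, b / R)           # unit vector from r'(t_ret) to r = {0, b, 0}
    k = 1.0 - beta * n[0]                 # the factor (1 - beta.n) at the retarded point
    pref = 1.0 / (gamma**2 * k**3 * R**2)
    return pref * (n[0] - beta), pref * n[1]

def field_closed_form(t, b, beta, c=1.0):
    """(E_x, E_y) from Eq. (28), in the same units."""
    u = beta * c
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    denom = (b**2 + (gamma * u * t)**2) ** 1.5
    return -gamma * u * t / denom, gamma * b / denom
```

For any t and b, and any 0 ≤ β < 1, the two functions should return the same pair of numbers up to rounding errors.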
At sufficiently large distances from the particle, i.e. in the limit Rret → ∞ (in the radiation zone), the
contribution of the first (essentially, the Coulomb-field) term in the square brackets of Eq. (19) vanishes
as 1/R², and the substitution of the remaining term into Eqs. (20) and then Eq. (29) yields the following
formula, which is valid for an arbitrary law of the particle’s motion:9
Radiation power density:
$$\frac{dP}{d\Omega} = \frac{Z_0 q^2}{(4\pi)^2}\,\frac{\left|\mathbf{n}\times\left[(\mathbf{n} - \boldsymbol{\beta})\times\dot{\boldsymbol{\beta}}\right]\right|^2}{(1 - \mathbf{n}\cdot\boldsymbol{\beta})^5}. \tag{10.30}$$
Now, let us apply this important result to some simple cases. First of all, Eq. (30) says that a
charge moving with a constant velocity does not radiate at all. This might be expected from our
analysis of this case in Sec. 9.5 because in the reference frame moving with the charge it produces only
the Coulomb electrostatic field, i.e. no radiation.
Next, let us consider a linear motion of a point charge with a non-zero acceleration directed
along the straight line of the motion. In this case, with the coordinate axes selected as shown in Fig. 4a,
each of the vectors involved in Eq. (30) has at most two non-zero Cartesian components:
7 A similar calculation of magnetic field components from Eq. (20) gives results identical to Eqs. (9.140).
8 This tradition may be reasonably justified. Indeed, we may say that the radiation field “detaches” from the
particle at times close to tret, while the observation time t depends on the detector’s position, and hence is less
relevant for the radiation process as such.
9 If the direction of radiation, n, does not change in time, this formula does not depend on the observer’s position
R. Hence, from this point on, the index “ret” may be safely dropped for brevity, though we should always
remember that β in Eq. (30) is the reduced velocity of the particle at the instant of the radiation’s emission, not of
its observation.
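$$\mathbf{n} = \{\sin\theta, 0, \cos\theta\}, \qquad \boldsymbol{\beta} = \{0, 0, \beta\}, \qquad \dot{\boldsymbol{\beta}} = \{0, 0, \dot\beta\}, \tag{10.31}$$
where θ is the angle between the directions of the particle’s motion and of the radiation’s propagation.
Plugging these expressions into Eq. (30) and performing the vector multiplications, we readily get
$$\frac{dP}{d\Omega} = \frac{Z_0 q^2}{(4\pi)^2}\,\dot\beta^2\,\frac{\sin^2\theta}{(1 - \beta\cos\theta)^5}. \tag{10.32}$$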
Figure 4b shows the angular distribution of such radiation, for three values of the particle’s speed u.
[Fig. 10.4: (a) the coordinate axes and the vector n; (b) the angular distribution of the radiation.]
If the speed is relatively low (u << c, i.e. β << 1), the denominator in Eq. (32) is very close to 1
for all observation angles θ, so the angular distribution of the radiation power is close to sin²θ – just as
it follows from the general non-relativistic Larmor formula (8.26), for our current case with Θ = θ.
However, as the velocity is increased, the denominator becomes less than 1 for θ < π/2, i.e. for the
forward-looking directions, and larger than 1 for back directions. As a result, the radiation in the
direction of the particle’s motion is increased (somewhat counter-intuitively, regardless of the
acceleration’s sign!), while that in the back direction is suppressed. For ultra-relativistic particles
(β → 1), this trend is strongly exacerbated, and radiation to very small forward angles dominates. To describe
this main part of the angular distribution, we may expand the trigonometric functions of θ participating
in Eq. (32) in the Taylor series in small θ, and keep only their leading terms: sin θ ≈ θ, cos θ ≈ 1 – θ²/2,
so (1 – β cos θ) ≈ (1 + γ²θ²)/2γ². The resulting expression,
$$\frac{dP}{d\Omega} \approx \frac{2Z_0 q^2}{\pi^2}\,\dot\beta^2\gamma^8\,\frac{(\gamma\theta)^2}{(1 + \gamma^2\theta^2)^5}, \qquad \text{for } \gamma \gg 1, \tag{10.33}$$
describes a narrow “hollow cone” distribution of radiation, with its maximum at the angle
$$\theta_0 = \frac{1}{2\gamma} \ll 1. \tag{10.34}$$
Another important aspect of Eq. (33) is how extremely fast (as γ⁸) the radiation density grows with the
Lorentz factor γ, i.e. with the particle’s energy ℰ = γmc².
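The position of the maximum given by Eq. (34) may be checked against the exact angular factor of Eq. (32). The sketch below is just a brute-force illustration; the function names and the grid size are arbitrary choices for this example.

```python
import math

def angular_density(theta, beta):
    """Angular factor of Eq. (32): sin^2(theta) / (1 - beta*cos(theta))^5."""
    return math.sin(theta) ** 2 / (1.0 - beta * math.cos(theta)) ** 5

def peak_angle(beta, n=200_000):
    """Angle of the maximum of Eq. (32), from a scan of n midpoints on (0, pi)."""
    thetas = (math.pi * (i + 0.5) / n for i in range(n))
    return max(thetas, key=lambda th: angular_density(th, beta))
```

For γ = 100, i.e. `beta = math.sqrt(1 - 1e-4)`, the product `2 * 100 * peak_angle(beta)` should be close to 1, while for β << 1 the maximum should stay near θ = π/2, as in the Larmor limit.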
Still, the total radiated power P (into all observation angles) at linear acceleration is not too high
for any practicable values of parameters. To show this, let us first calculate P for an arbitrary motion of
the particle. To start, let me demonstrate how P may be found (or rather guessed) from the general
relativistic arguments. In Sec. 8.2, we have derived Eq. (8.27) for the power of the electric dipole
radiation for a non-relativistic particle motion. That result is valid, in particular, for one charged particle,
whose electric dipole moment’s derivative over time may be expressed as d(qr)/dt = (q/m)p, where p is
the particle’s linear mechanical momentum (not its electric dipole moment). As a result, the Larmor
formula (8.27) in free space, i.e. with v = c (but u << c) reduces to
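$$P = \frac{Z_0 q^2}{6\pi c^2}\left(\frac{1}{m}\frac{d\mathbf{p}}{dt}\right)^2 \equiv \frac{Z_0 q^2}{6\pi m^2c^2}\left(\frac{d\mathbf{p}}{dt}\cdot\frac{d\mathbf{p}}{dt}\right), \qquad \text{for } u \ll c. \tag{10.35}$$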
This is evidently not a Lorentz-invariant result, but it gives a clear hint of how such an invariant, which
would be reduced to Eq. (35) in the non-relativistic limit, may be formed:
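$$P = -\frac{Z_0 q^2}{6\pi m^2c^2}\left(\frac{dp_\alpha}{d\tau}\frac{dp^\alpha}{d\tau}\right) \equiv \frac{Z_0 q^2}{6\pi m^2c^2}\left[\left(\frac{d\mathbf{p}}{d\tau}\right)^2 - \frac{1}{c^2}\left(\frac{d\mathcal{E}}{d\tau}\right)^2\right]. \tag{10.36}$$
Using the relativistic expressions p = γmcβ, ℰ = γmc², and dτ = dt/γ, the last formula may be recast into
the so-called Liénard extension of the Larmor formula:10
Total radiation power via β:
$$P = \frac{Z_0 q^2}{6\pi}\,\gamma^6\left[\dot{\boldsymbol{\beta}}^2 - (\boldsymbol{\beta}\times\dot{\boldsymbol{\beta}})^2\right] \equiv \frac{Z_0 q^2}{6\pi}\,\gamma^4\left[\dot{\boldsymbol{\beta}}^2 + \gamma^2(\boldsymbol{\beta}\cdot\dot{\boldsymbol{\beta}})^2\right]. \tag{10.37}$$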
It may be also obtained by direct integration of Eq. (30) over the full solid angle, thus confirming our
guess.
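The direct integration mentioned above may also be carried out numerically. The following Python sketch is an illustration under simple assumptions: a plain midpoint rule on a θ-φ grid, units in which Z0q²/(4π)² = 1, and helper names invented for this example. It integrates Eq. (30) over the full solid angle and compares the result with Eq. (37).

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def power_density(n, beta, beta_dot):
    """Eq. (30), in units of Z0 q^2 / (4 pi)^2."""
    n_minus_beta = tuple(ni - bi for ni, bi in zip(n, beta))
    v = cross(n, cross(n_minus_beta, beta_dot))
    return dot(v, v) / (1.0 - dot(n, beta)) ** 5

def total_power_numeric(beta, beta_dot, n_theta=200, n_phi=200):
    """Midpoint-rule integral of Eq. (30) over the full solid angle."""
    d_th, d_ph = math.pi / n_theta, 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        th = (i + 0.5) * d_th
        s, c = math.sin(th), math.cos(th)
        for j in range(n_phi):
            ph = (j + 0.5) * d_ph
            n = (s * math.cos(ph), s * math.sin(ph), c)
            total += power_density(n, beta, beta_dot) * s   # dOmega = sin(th) dth dph
    return total * d_th * d_ph

def total_power_lienard(beta, beta_dot):
    """Eq. (37) in the same units, where Z0 q^2 / (6 pi) equals 8 pi / 3."""
    gamma_sq = 1.0 / (1.0 - dot(beta, beta))
    bxbd = cross(beta, beta_dot)
    return (8.0 * math.pi / 3.0) * gamma_sq**3 * (dot(beta_dot, beta_dot) - dot(bxbd, bxbd))
```

For moderate speeds, such as `beta = (0.3, 0.0, 0.4)` with an arbitrary `beta_dot`, the two results should agree to within the discretization error of the grid; for β close to 1 the forward peak becomes very narrow and a much finer grid would be needed.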
However, for some applications, it is beneficial to express P via the time evolution of the
particle’s momentum alone. For that, we may differentiate the fundamental relativistic relation (9.78),
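ℰ² = (mc²)² + (pc)², over the proper time τ to get
$$2\mathcal{E}\frac{d\mathcal{E}}{d\tau} = 2c^2p\,\frac{dp}{d\tau}, \qquad \text{i.e. } \frac{d\mathcal{E}}{d\tau} = \frac{c^2p}{\mathcal{E}}\frac{dp}{d\tau} \equiv u\,\frac{dp}{d\tau}, \tag{10.38}$$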
where the last step used the relativistic relation c2p/E = u mentioned in Sec. 9.3. Plugging Eq. (38) into
Eq. (36), we may rewrite it as
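Total radiation power via p:
$$P = \frac{Z_0 q^2}{6\pi m^2c^2}\left[\left(\frac{d\mathbf{p}}{d\tau}\right)^2 - \beta^2\left(\frac{dp}{d\tau}\right)^2\right]. \tag{10.39}$$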
Please note the difference between the squared derivatives in this expression: in the first of them we
have to differentiate the momentum’s vector p first, and only then form a scalar by squaring the
resulting vector derivative, while in the second case, only the magnitude of the vector has to be
differentiated. For example, for circular motion with a constant speed (to be analyzed in detail in the
next section), the second term vanishes, while the first one does not.
However, if we return to the simplest case of linear acceleration (Fig. 4), then $(d\mathbf{p}/d\tau)^2 = (dp/d\tau)^2$,
and Eq. (39) is reduced to
10 The second form of Eq. (10.37), which is frequently more convenient for applications, may be readily obtained
from the first one by applying MA Eq. (7.7a) to the vector product.
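$$P = \frac{Z_0 q^2}{6\pi m^2c^2}\left(\frac{dp}{d\tau}\right)^2\left(1 - \beta^2\right) \equiv \frac{Z_0 q^2}{6\pi m^2c^2}\left(\frac{dp}{d\tau}\right)^2\frac{1}{\gamma^2} = \frac{Z_0 q^2}{6\pi m^2c^2}\left(\frac{dp}{dt_{\rm ret}}\right)^2, \tag{10.40}$$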
i.e. formally coincides with the non-relativistic relation (35). To get a better feeling of the magnitude of
this radiation, we may combine Eq. (9.144) with B = 0, and Eq. (9.148) with E ∥ u to get dp/dtret =
dℰ/dz’, where z’ is the particle’s coordinate at the moment tret. The last relation allows us to rewrite Eq.
(40) in the following form:
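$$P = \frac{Z_0 q^2}{6\pi m^2c^2}\left(\frac{d\mathcal{E}}{dz'}\right)^2 \equiv \frac{Z_0 q^2}{6\pi m^2c^2}\,\frac{d\mathcal{E}}{dz'}\frac{d\mathcal{E}}{dt_{\rm ret}}\frac{dt_{\rm ret}}{dz'} \equiv \frac{Z_0 q^2}{6\pi m^2c^2u}\,\frac{d\mathcal{E}}{dz'}\frac{d\mathcal{E}}{dt_{\rm ret}}. \tag{10.41}$$
For the most important case of ultra-relativistic motion (u → c), this result reduces to
$$\frac{P}{d\mathcal{E}/dt_{\rm ret}} = \frac{2}{3}\,\frac{d(\mathcal{E}/mc^2)}{d(z'/r_c)}, \tag{10.42}$$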
where rc is the classical radius of the particle, defined by Eq. (8.41). This formula shows that the
radiated power, i.e. the change of the particle’s energy due to radiation, is much smaller than that due to
the accelerating field unless energy as large as ~mc2 is gained on the classical radius of the particle. For
example, for an electron, with rc ≈ 3×10⁻¹⁵ m and mc² = mec² ≈ 0.5 MeV, such an acceleration would
require the accelerating electric field of the order of (0.5 MV)/(3×10⁻¹⁵ m) ~ 10¹⁴ MV/m, while
practicable accelerating fields are below 10² MV/m – limited by the electric breakdown effects. (As
described by the factor m² in the denominator of Eq. (41), for heavier particles such as protons, the
relative losses are even lower.) Such negligible radiative losses of energy are actually a large advantage
of linear accelerators – such as the famous two-mile-long SLAC,11 which can accelerate electrons or
positrons to energies up to 50 GeV, i.e. to γ ~ 10⁵. If obtaining radiation from the accelerated particles is
the goal, it may be readily achieved by bending their trajectories using additional magnetic fields – see
the next section.
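The estimate above is simple arithmetic, and may be reproduced with a few lines of Python. In this sketch, the classical radius is taken in its standard form rc = q²/4πε0mc² (assumed here to be the content of Eq. (8.41)), the constants are rounded CODATA values, and the function names are illustrative only.

```python
import math

E_CHARGE = 1.602176634e-19   # C
EPS0 = 8.8541878128e-12      # F/m
M_E = 9.1093837015e-31       # kg
C = 299_792_458.0            # m/s

def classical_radius(q=E_CHARGE, m=M_E):
    """r_c = q^2 / (4 pi eps0 m c^2), in meters (assumed form of Eq. (8.41))."""
    return q**2 / (4.0 * math.pi * EPS0 * m * C**2)

def critical_field_MV_per_m(q=E_CHARGE, m=M_E):
    """Field at which the particle would gain ~mc^2 over the length r_c."""
    rest_energy_MV = m * C**2 / q / 1e6      # mc^2/q, in megavolts
    return rest_energy_MV / classical_radius(q, m)

def loss_ratio(gradient_MV_per_m, q=E_CHARGE, m=M_E):
    """Eq. (42): ratio of the radiated power to the energy gain rate."""
    return (2.0 / 3.0) * gradient_MV_per_m / critical_field_MV_per_m(q, m)
```

For electrons, `critical_field_MV_per_m()` is of the order of 10¹⁴ MV/m, so that `loss_ratio(100.0)`, for a gradient of 10² MV/m, is a very small number.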
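$$\left|\frac{d\mathbf{p}}{dt_{\rm ret}}\right| = \omega_c\,p = \frac{u}{R}\,p = \beta^2\gamma\,\frac{mc^2}{R}, \tag{10.44}$$
(where R is the orbit’s radius), so for the power of this synchrotron radiation, Eq. (43) yields
Synchrotron radiation: total power
$$P = \frac{Z_0 q^2}{6\pi}\,\beta^4\gamma^4\,\frac{c^2}{R^2} = \frac{1}{4\pi\varepsilon_0}\,\frac{2}{3}\,\frac{q^4B^2}{m^2c}\,\beta^2\gamma^2. \tag{10.45}$$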
Note that for ultrarelativistic particles (γ >> 1), the power grows (at a fixed field B) as γ², i.e. as the
square of the particle’s energy ℰ. For example, for typical parameters of the first electron cyclotrons (such
as the General Electric’s machine in which the synchrotron radiation was first noticed in 1947), R ~ 1 m,
ℰ ~ 0.3 GeV (γ ~ 600), Eq. (45) gives a very modest electron energy loss per one revolution: PT ≡ P(2πR/u)
≈ 2πPR/c ~ 1 keV. However, already by the mid-1970s, electron accelerators, with R ~ 100 m, could
give each particle energy ℰ ~ 10 GeV, and the energy loss per revolution grew to ~ 10 MeV, becoming
the major energy loss mechanism. For proton accelerators, such energy loss is much less of a problem,
because the γ of an ultra-relativistic particle (at fixed ℰ) is proportional to 1/m, so the estimates, at the
same R, should be scaled back by (mp/me)⁴ ~ 10¹³. Nevertheless, in the giant modern accelerators such as
the LHC (with R ≈ 4.3 km and ℰ up to 7 TeV), the synchrotron radiation loss per revolution is rather
noticeable (PT ~ 6 keV), leading not as much to particle deceleration as to a substantial photoelectron
emission from the beam tube’s walls, creating harmful defocusing effects.
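These order-of-magnitude estimates follow from the first form of Eq. (45), multiplied by the revolution period 2πR/u. A minimal Python sketch of this arithmetic is given below; the constants are rounded CODATA values, and the function name and argument conventions are choices made for this example.

```python
import math

Z0 = 376.730313668            # ohm, the free-space wave impedance
E_CHARGE = 1.602176634e-19    # C
C = 299_792_458.0             # m/s
MC2_ELECTRON_EV = 0.51099895e6
MC2_PROTON_EV = 938.27208816e6

def loss_per_turn_eV(energy_eV, radius_m, rest_energy_eV, q=E_CHARGE):
    """Radiated energy per revolution, P * (2 pi R / u), with P from Eq. (45)."""
    gamma = energy_eV / rest_energy_eV
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    power = Z0 * q**2 / (6.0 * math.pi) * beta**4 * gamma**4 * C**2 / radius_m**2  # W
    period = 2.0 * math.pi * radius_m / (beta * C)                                 # s
    return power * period / E_CHARGE      # J -> eV

# The three cases discussed in the text:
early_machine = loss_per_turn_eV(0.3e9, 1.0, MC2_ELECTRON_EV)     # R ~ 1 m, 0.3 GeV electrons
ring_1970s = loss_per_turn_eV(10e9, 100.0, MC2_ELECTRON_EV)       # R ~ 100 m, 10 GeV electrons
lhc_protons = loss_per_turn_eV(7e12, 4.3e3, MC2_PROTON_EV)        # R ~ 4.3 km, 7 TeV protons
```

With these inputs, the three numbers come out at the keV, 10-MeV, and keV scales, respectively, in line with the estimates quoted above; the exact values depend on the radius substituted for R.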
However, what is bad for particle accelerators and storage rings is good for the so-called
synchrotron light sources – the electron accelerators designed for the generation of intense
synchrotron radiation – with the spectrum extending well beyond the visible light range. Let us analyze
the angular and spectral distributions of such radiation. To calculate the angular distribution, let us select
the coordinate axes as shown in Fig. 5, with the origin at the current location of the orbiting particle, the
z-axis directed along its instant velocity (i.e. the vector β), and the x-axis, toward the orbit’s center.
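[Fig. 10.5: axes x, y, z with the origin 0 at the particle; the vector β along the z-axis and the vector β̇ along the x-axis; the unit vector n and its projection point P on the [x, y]-plane.]
Fig. 10.5. The synchrotron radiation problem’s geometry.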
In the general case, when the unit vector n toward the radiation’s observer is not within any of
the coordinate planes, it has to be described by two angles – the polar angle θ, and the azimuthal angle φ
between the x-axis and the projection 0P of the vector n onto the [x, y]-plane. Since the length of the
segment 0P is sinθ, the Cartesian components of the relevant vectors are as follows:
$$\mathbf{n} = \{\sin\theta\cos\varphi,\ \sin\theta\sin\varphi,\ \cos\theta\},\qquad \boldsymbol{\beta} = \{0,\ 0,\ \beta\},\qquad\text{and}\quad \dot{\boldsymbol{\beta}} = \{\dot\beta,\ 0,\ 0\}. \tag{10.46}$$
Plugging these expressions into the general Eq. (30), we get
Synchrotron radiation: angular distribution
$$\frac{dP}{d\Omega} = \frac{Z_0 q^2}{2\pi^2}\,|\dot{\boldsymbol{\beta}}|^2\gamma^6 f(\theta,\varphi),\qquad\text{where}\quad f(\theta,\varphi) \equiv \frac{1}{8\gamma^6(1-\beta\cos\theta)^3}\left[1-\frac{\sin^2\theta\cos^2\varphi}{\gamma^2(1-\beta\cos\theta)^2}\right]. \tag{10.47}$$
According to this result, just as at the linear acceleration, in the ultra-relativistic limit, most
radiation goes into a narrow cone (of a width Δθ ~ γ⁻¹ << 1) around the vector β, i.e. around the instant
direction of the particle’s propagation. For such small angles, and γ >> 1,
$$f(\theta,\varphi) \approx \frac{1}{(1+\gamma^2\theta^2)^3}\left[1-\frac{4\gamma^2\theta^2\cos^2\varphi}{(1+\gamma^2\theta^2)^2}\right]. \tag{10.48}$$
The left panel of Fig. 6 shows a color-coded contour map of this angular distribution f(θ, φ), as observed
on a distant plane normal to the particle’s instant velocity (in Fig. 5, parallel to the [x, y]-plane), while
its right panel shows the factor f as a function of θ in two perpendicular directions: within the particle’s
rotation plane (in the direction parallel to the x-axis, i.e. at φ = 0) and perpendicular to this plane (along
the y-axis, i.e. at φ = π/2). The result shows, first of all, that, in contrast to the case of linear
acceleration, the narrow radiation cone is now not hollow: the intensity maximum is reached at θ = 0,
i.e. exactly in the direction of the particle’s motion. Second, the radiation cone is not axially
symmetric: within the particle rotation plane, the intensity drops faster (and even has nodes at θ = ±1/γ).
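[Fig. 10.6: left panel, contour map of f(θ, φ) on the plane of γθ cosφ and γθ sinφ; right panel, f versus γθ (from 0 to 2) for the in-plane (φ = 0) and off-plane (φ = π/2) directions.]
Fig. 10.6. The angular distribution of the synchrotron radiation at γ >> 1.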
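The two features just noted (the maximum at θ = 0 and the in-plane nodes at γθ = 1) are easy to check numerically. The sketch below evaluates the function f of Eq. (47) and its small-angle form (48); the function names and the sample value γ = 100 are illustrative choices, not taken from the text.

```python
import math


def f_exact(theta, phi, gamma):
    """Angular factor f(theta, phi) as defined in Eq. (47)."""
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    d = 1.0 - beta * math.cos(theta)
    bracket = 1.0 - (math.sin(theta) * math.cos(phi))**2 / (gamma**2 * d**2)
    return bracket / (8.0 * gamma**6 * d**3)


def f_small_angle(theta, phi, gamma):
    """Ultra-relativistic, small-angle approximation, Eq. (48)."""
    x2 = (gamma * theta)**2
    return (1.0 - 4.0 * x2 * math.cos(phi)**2 / (1.0 + x2)**2) / (1.0 + x2)**3


if __name__ == "__main__":
    gamma = 100.0  # illustrative value
    for x in (0.0, 0.5, 1.0, 1.5, 2.0):  # x = gamma * theta
        theta = x / gamma
        print(f"gamma*theta = {x:3.1f}: "
              f"in-plane f = {f_small_angle(theta, 0.0, gamma):.4f}, "
              f"off-plane f = {f_small_angle(theta, math.pi / 2, gamma):.4f}")
```

In the in-plane direction the bracket in Eq. (48) reduces to (1 – γ²θ²)²/(1 + γ²θ²)², which makes the node at γθ = 1 explicit.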
The angular distribution (47) of the synchrotron radiation was calculated for the (inertial)
reference frame whose origin coincides with the particle’s position at this particular instant, i.e. its
radiation pattern is time-independent in the frame moving with the particle. This pattern enables a semi-quantitative
description of the radiation by an ultra-relativistic particle from the point of view of a
stationary observer: if the observation point is on (or very close to) the rotation plane,¹² it is being
“struck” by the narrow radiation cone once each rotation period T ≈ 2πR/c, each “strike” giving a field
pulse of a short duration Δtret << 1/ωc – see Fig. 7.¹³
12 It is easy (and hence is left for the reader’s exercise) to show that if the observation point is much off-plane
(say, is located on the particle orbit’s axis), the radiation is virtually monochromatic, with frequency ωc. (As we
know from Sec. 8.2, in the non-relativistic limit u << c, this is true for any observation point.)
[Fig. 10.7: (a) the orbit with the vectors r’(t1), r’(t2), β(t1), β(t2) and the direction n toward the observer; (b) periodic field pulses of duration Δt ~ T/γ³, repeated with the period T = 2π/ωc.]
Fig. 10.7. (a) The synchrotron radiation cones (at γ >> 1) for two close values of tret, and (b) the in-plane
component of the electric field observed in the rotation plane, as a function of time t – schematically.
The evaluation of the time duration Δt of each pulse requires some care: its estimate Δtret ~ 1/γωc
is correct for the duration of the retarded time interval during which its cone is aimed at the observer.
However, due to the time compression effect discussed in detail in Sec. 1 and described by Eq. (16), the
pulse duration as seen by the observer is a factor of 1/(1 – β) shorter, so
$$\Delta t = (1-\beta)\,\Delta t_{\rm ret} \sim \frac{1-\beta}{\gamma\omega_c} \sim \frac{1}{\gamma^3\omega_c} \sim \gamma^{-3}T,\qquad\text{for } \gamma \gg 1. \tag{10.49}$$
From the Fourier theorem, we can expect the frequency spectrum of such radiation to consist of
numerous (N ~ γ³ >> 1) harmonics of the particle rotation frequency ωc, with comparable amplitudes.
However, if the orbital frequency fluctuates even slightly (δωc/ωc > 1/N ~ 1/γ³), as it happens in most
practical systems, the radiation pulses are not coherent, so the average radiation power spectrum may be
calculated as that of one pulse, multiplied by the number of pulses per second. In this case, the spectrum
is continuous, extending from low frequencies all the way to approximately
$$\omega_{\max} \sim 1/\Delta t \sim \gamma^3\omega_c. \tag{10.50}$$
In order to verify and quantify this result, let us calculate the spectrum of radiation due to a
single pulse. For that, we should first make the general notion of the radiation spectrum quantitative. Let
us represent an arbitrary electric field (say that of the synchrotron radiation we are studying now)
observed at a fixed point r, as a function of the observation time t, as a Fourier integral:¹⁴
$$\mathbf{E}(t) = \int_{-\infty}^{+\infty}\mathbf{E}_\omega e^{-i\omega t}\,d\omega. \tag{10.51}$$
13 The fact that the in-plane component of each electric field’s pulse E(t) is antisymmetric with respect to its
central point, and hence vanishes at that point (as Fig. 7b shows), readily follows from Eq. (19).
14 In contrast to the single-frequency case (i.e. a monochromatic wave), we may avoid taking the real part of the
complex function (E_ω exp{–iωt}) by requiring that in Eq. (51), E_{–ω} = E_ω*. However, it is important to remember the
factor ½ required for the transition to a monochromatic wave of frequency ω0 and with real amplitude E0: E_ω = E0
[δ(ω – ω0) + δ(ω + ω0)]/2.
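This expression may be plugged into the formula for the total energy of the radiation pulse (i.e. of the
loss of particle’s energy ℰ) per unit solid angle:¹⁵
$$-\frac{d\mathscr{E}}{d\Omega} = \int_{-\infty}^{+\infty} S_n(t)R^2\,dt = \frac{R^2}{Z_0}\int_{-\infty}^{+\infty}|\mathbf{E}(t)|^2\,dt. \tag{10.52}$$
Substituting Eq. (51) and changing the order of integration, we get
$$-\frac{d\mathscr{E}}{d\Omega} = \frac{R^2}{Z_0}\int_{-\infty}^{+\infty}d\omega\int_{-\infty}^{+\infty}d\omega'\ \mathbf{E}_\omega\cdot\mathbf{E}_{\omega'}\int_{-\infty}^{+\infty}e^{-i(\omega+\omega')t}\,dt. \tag{10.53}$$
But the inner integral (over t) is just 2πδ(ω + ω’).¹⁶ This delta function kills one of the frequency
integrals (say, one over ω’), and Eq. (53) gives us a result that may be recast as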
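$$-\frac{d\mathscr{E}}{d\Omega} = \int_0^{+\infty} I(\omega)\,d\omega,\qquad\text{with}\quad I(\omega) \equiv \frac{4\pi R^2}{Z_0}\,\mathbf{E}_\omega\cdot\mathbf{E}_{-\omega} = \frac{4\pi R^2}{Z_0}\,\mathbf{E}_\omega\cdot\mathbf{E}_\omega^*, \tag{10.54}$$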
where the evident frequency symmetry of the scalar product E_ω·E_{–ω} has been utilized to fold the integral
of I(ω) to positive frequencies only. The first of Eqs. (54) makes the physical sense of the function I(ω)
very clear: this is the so-called spectral density of the electromagnetic radiation (per unit solid angle).¹⁷
To calculate the spectral density, we can express the function E_ω via E(t) using the Fourier
transform reciprocal to Eq. (51):
$$\mathbf{E}_\omega = \frac{1}{2\pi}\int_{-\infty}^{+\infty}\mathbf{E}(t)\,e^{i\omega t}\,dt. \tag{10.55}$$
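In the particular case of radiation by a single point charge, we may use here the second (radiative) term
of Eq. (19):
$$\mathbf{E}_\omega = \frac{1}{2\pi}\,\frac{q}{4\pi\varepsilon_0}\,\frac{1}{cR}\int_{-\infty}^{+\infty}\left[\frac{\mathbf{n}\times\{(\mathbf{n}-\boldsymbol{\beta})\times\dot{\boldsymbol{\beta}}\}}{(1-\boldsymbol{\beta}\cdot\mathbf{n})^3}\right]_{\rm ret} e^{i\omega t}\,dt. \tag{10.56}$$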
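Since the vectors n and β are more natural functions of the radiation’s emission (retarded) time tret, let us
use Eqs. (5) and (16) to exclude the observation time t from this integral:
$$\mathbf{E}_\omega = \frac{q}{4\pi\varepsilon_0}\,\frac{1}{2\pi}\,\frac{1}{cR}\int_{-\infty}^{+\infty}\left[\frac{\mathbf{n}\times\{(\mathbf{n}-\boldsymbol{\beta})\times\dot{\boldsymbol{\beta}}\}}{(1-\boldsymbol{\beta}\cdot\mathbf{n})^2}\right]_{\rm ret}\exp\left\{i\omega\left(t_{\rm ret}+\frac{R_{\rm ret}}{c}\right)\right\}dt_{\rm ret}. \tag{10.57}$$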
Assuming that the observer is sufficiently far from the particle,¹⁸ we may treat the unit vector n as a
constant and also use the approximation (8.19) to reduce Eq. (57) to
$$\mathbf{E}_\omega = \frac{q}{4\pi\varepsilon_0}\,\frac{1}{2\pi}\,\frac{1}{cR}\exp\left\{\frac{i\omega r}{c}\right\}\int_{-\infty}^{+\infty}\left[\frac{\mathbf{n}\times\{(\mathbf{n}-\boldsymbol{\beta})\times\dot{\boldsymbol{\beta}}\}}{(1-\boldsymbol{\beta}\cdot\mathbf{n})^2}\right]_{\rm ret}\exp\left\{i\omega\left(t-\frac{\mathbf{n}\cdot\mathbf{r}'}{c}\right)\right\}_{\rm ret}dt_{\rm ret}. \tag{10.58}$$
15 Note that the expression under this integral differs from dP/dΩ defined by Eq. (29) by the absence of the term
(1 – β·n) = ∂t/∂tret – see Eq. (16). This is natural because now we are calculating the wave energy arriving at the
observation point r during the time interval dt rather than dtret.
16 See, e.g. MA Eq. (14.4).
17 The notion of spectral density may be readily generalized to random processes – see, e.g., SM Sec. 5.4.
18 According to the estimate (49), for a synchrotron radiation pulse, this restriction requires the observer to be
much farther than Δr’ ~ cΔt ~ R/γ³ from the particle. With the values R ~ 10⁴ m and γ ~ 10⁵ mentioned above, Δr’
~ 10⁻¹¹ m, so this requirement is satisfied for any realistic radiation detector.
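Plugging this expression into Eq. (54), and then using the definitions c ≡ 1/(ε0μ0)^{1/2} and Z0 ≡ (μ0/ε0)^{1/2},
we get¹⁹
$$I(\omega) = \frac{Z_0 q^2}{16\pi^3}\left|\int_{-\infty}^{+\infty}\left[\frac{\mathbf{n}\times\{(\mathbf{n}-\boldsymbol{\beta})\times\dot{\boldsymbol{\beta}}\}}{(1-\boldsymbol{\beta}\cdot\mathbf{n})^2}\right]_{\rm ret}\exp\left\{i\omega\left(t-\frac{\mathbf{n}\cdot\mathbf{r}'}{c}\right)\right\}_{\rm ret}dt_{\rm ret}\right|^2. \tag{10.59}$$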
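This result may be further simplified by noticing that the fraction before the exponent may be
represented as a full derivative over tret,
$$\left[\frac{\mathbf{n}\times\{(\mathbf{n}-\boldsymbol{\beta})\times\dot{\boldsymbol{\beta}}\}}{(1-\boldsymbol{\beta}\cdot\mathbf{n})^2}\right]_{\rm ret} \equiv \left[\frac{\mathbf{n}\times\{(\mathbf{n}-\boldsymbol{\beta})\times d\boldsymbol{\beta}/dt\}}{(1-\boldsymbol{\beta}\cdot\mathbf{n})^2}\right]_{\rm ret} = \frac{d}{dt_{\rm ret}}\left[\frac{\mathbf{n}\times(\mathbf{n}\times\boldsymbol{\beta})}{1-\boldsymbol{\beta}\cdot\mathbf{n}}\right]_{\rm ret}, \tag{10.60}$$
and working out the resulting integral by parts. At this operation, the time differentiation of the
parentheses in the exponent gives d[tret – n·r’(tret)/c]/dtret = (1 – n·u/c)ret ≡ (1 – β·n)ret, leading to the
cancellation of the remaining factor in the denominator and hence to a very simple general result:²⁰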
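Relativistic radiation: spectral density
$$I(\omega) = \frac{Z_0 q^2\omega^2}{16\pi^3}\left|\int_{-\infty}^{+\infty}\left[\mathbf{n}\times(\mathbf{n}\times\boldsymbol{\beta})\exp\left\{i\omega\left(t-\frac{\mathbf{n}\cdot\mathbf{r}'}{c}\right)\right\}\right]_{\rm ret}dt_{\rm ret}\right|^2. \tag{10.61}$$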
Now returning to the particular case of the synchrotron radiation, it is beneficial to choose the
origin of time tret so that at tret = 0, the angle θ between the vectors n and β takes its smallest value θ0,
i.e., in terms of Fig. 5, the vector n is within the [y, z]-plane. Fixing this direction of the axes so that they
do not move, we can redraw that figure as shown in Fig. 8.
[Fig. 10.8: axes x, y, z with the origin 0; the unit vector n at the angle θ0 to the z-axis; the vectors r’(tret) and βret turned by the angle α = ωctret; the orbit of radius R.]
Fig. 10.8. Deriving the synchrotron radiation’s spectral density. The vector n is static within
the [y, z]-plane, while the vectors r’(tret) and βret rotate, within the [x, z]-plane, with the
angular velocity ωc of the particle.
In this “lab” reference frame, the vector n does not depend on time, while the vectors r’(tret) and
βret do depend on it via the angle α ≡ ωctret:
$$\mathbf{n} = \{0,\ \sin\theta_0,\ \cos\theta_0\},\qquad \mathbf{r}'(t_{\rm ret}) = \{R(1-\cos\alpha),\ 0,\ R\sin\alpha\},\qquad \boldsymbol{\beta}_{\rm ret} = \{\beta\sin\alpha,\ 0,\ \beta\cos\alpha\}. \tag{10.62}$$
19 Note that for our current purposes of calculation of the spectral density of radiation by a single particle, the
factor exp{iωr/c} got canceled. However, as we have seen in Chapter 8, this factor plays a central role in the
interference of radiation from several (many) sources. Such interference is important, in particular, in undulators
and free-electron lasers – the devices to be (qualitatively) discussed below.
20 Actually, this simplification is not accidental. According to Eq. (10b), the expression under the derivative in
the last form of Eq. (60) is just the transverse component of the vector potential A (give or take a constant factor),
and from the discussion in Sec. 8.2 we know that this component determines the electric dipole radiation of a
system, which dominates the radiation in our current case of a single particle with a non-zero electric charge.
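Now an easy multiplication yields
$$\left[\mathbf{n}\times(\mathbf{n}\times\boldsymbol{\beta})\right]_{\rm ret} = \beta\,\{-\sin\alpha,\ \sin\theta_0\cos\theta_0\cos\alpha,\ -\sin^2\theta_0\cos\alpha\}, \tag{10.63}$$
$$\exp\left\{i\omega\left(t-\frac{\mathbf{n}\cdot\mathbf{r}'}{c}\right)\right\}_{\rm ret} = \exp\left\{i\omega\left(t_{\rm ret}-\frac{R}{c}\cos\theta_0\sin\alpha\right)\right\}. \tag{10.64}$$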
As we already know, in the (most interesting) ultra-relativistic limit γ >> 1, most radiation is confined to
short pulses, so only small angles α ~ ωcΔtret ~ γ⁻¹ may contribute to the integral in Eq. (61). Moreover,
since most radiation goes to small angles θ ~ θ0 ~ γ⁻¹, it makes sense to consider only such small
angles. Expanding both trigonometric functions of these small angles, participating in parentheses of Eq.
(64), into the Taylor series, and keeping only the leading terms, we get
$$t_{\rm ret}-\frac{R}{c}\cos\theta_0\sin\alpha \approx t_{\rm ret}-\frac{R}{c}\,\omega_c t_{\rm ret}+\frac{R}{c}\,\frac{\theta_0^2}{2}\,\omega_c t_{\rm ret}+\frac{R}{c}\,\frac{\omega_c^3}{6}\,t_{\rm ret}^3. \tag{10.65}$$
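Since (R/c)ωc = u/c = β ≈ 1, in the two last terms, we may approximate this parameter by 1. However, it
is crucial to distinguish the difference between the two first terms, proportional to (1 – β)tret, from zero;
as we have done before, we may approximate it with tret/2γ². On the right-hand side of Eq. (63), which
does not have such a critical difference, we may be bolder, taking²¹
$$\beta\,\{-\sin\alpha,\ \sin\theta_0\cos\theta_0\cos\alpha,\ -\sin^2\theta_0\cos\alpha\} \approx \{-\alpha,\ \theta_0,\ 0\} \equiv \{-\omega_c t_{\rm ret},\ \theta_0,\ 0\}. \tag{10.66}$$
As a result, Eq. (61) is reduced to
$$I(\omega) = \frac{Z_0 q^2}{16\pi^3}\left|a_x\mathbf{n}_x + a_y\mathbf{n}_y\right|^2 \equiv \frac{Z_0 q^2}{16\pi^3}\left(|a_x|^2 + |a_y|^2\right), \tag{10.67}$$
where n_x and n_y are the unit vectors along the corresponding axes, and a_x and a_y are the dimensionless amplitudes
$$a_x \equiv -\omega\int_{-\infty}^{+\infty}\omega_c t_{\rm ret}\exp\left\{\frac{i\omega}{2}\left[(\theta_0^2+\gamma^{-2})\,t_{\rm ret}+\frac{\omega_c^2}{3}\,t_{\rm ret}^3\right]\right\}dt_{\rm ret},\qquad a_y \equiv \omega\int_{-\infty}^{+\infty}\theta_0\exp\left\{\frac{i\omega}{2}\left[(\theta_0^2+\gamma^{-2})\,t_{\rm ret}+\frac{\omega_c^2}{3}\,t_{\rm ret}^3\right]\right\}dt_{\rm ret}, \tag{10.68}$$
which describe the two mutually perpendicular polarization components of the radiation. Introducing the dimensionless parameter
$$\nu \equiv \frac{\omega}{3\omega_c}\left(\theta_0^2+\gamma^{-2}\right)^{3/2}, \tag{10.69}$$
which is proportional to the observation frequency, and changing the integration variable to ξ ≡
ωctret/(θ0² + γ⁻²)^{1/2}, the integrals (68) may be reduced to the modified Bessel functions of the second kind,
but with fractional indices:
21 This expression confirms that the in-plane (x) component of the electric field is an odd function of tret and hence
of t – t0 (see its sketch in Fig. 7b), while the normal (y) component is an even function of this difference. Also,
note that for an observer exactly in the rotation plane (θ0 = 0) the latter component equals zero for all times – the
fact which could be predicted from the very beginning because of the evident mirror symmetry of the problem
with respect to the particle’s rotation plane.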
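$$a_x = -\frac{\omega}{\omega_c}\left(\theta_0^2+\gamma^{-2}\right)\int_{-\infty}^{+\infty}\xi\exp\left\{i\,\frac{3\nu}{2}\left(\xi+\frac{\xi^3}{3}\right)\right\}d\xi = -\frac{2\sqrt{3}\,i}{\left(\theta_0^2+\gamma^{-2}\right)^{1/2}}\,\nu K_{2/3}(\nu),$$
$$a_y = \frac{\omega}{\omega_c}\,\theta_0\left(\theta_0^2+\gamma^{-2}\right)^{1/2}\int_{-\infty}^{+\infty}\exp\left\{i\,\frac{3\nu}{2}\left(\xi+\frac{\xi^3}{3}\right)\right\}d\xi = \frac{2\sqrt{3}\,\theta_0}{\theta_0^2+\gamma^{-2}}\,\nu K_{1/3}(\nu). \tag{10.70}$$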
Figure 9a shows the dependence of the Bessel factors defining the amplitudes a_x and a_y on the
normalized observation frequency ν. It shows that the radiation intensity changes with frequency
relatively slowly (note the log-log scale of the plot!) until the normalized frequency defined by Eq. (69)
is increased beyond ~1. For the most important observation angles θ0 ~ γ⁻¹, this means that our estimate
(50) is indeed correct, though formally the frequency spectrum extends to infinity.²²
[Fig. 10.9: (a) log-log plot of the Bessel factors involving K_{2/3}(ν) and K_{1/3}(ν) versus ν, from 0.01 to 10; (b) the factor involving the integral of K_{5/3}(ξ) versus the normalized frequency, from 0.01 to 10.]
Fig. 10.9. The frequency spectra of: (a) two components of the synchrotron radiation, at a fixed angle θ0, and
(b) its total (polarization- and angle-averaged) intensity.
Naturally, the spectral density integrated over the full solid angle exhibits a similar frequency
behavior. Without performing the integration,²³ let me just give the result (also valid for γ >> 1 only) for
the reader’s reference:
$$\oint_{4\pi} I(\omega)\,d\Omega = \frac{\sqrt{3}}{4\pi}\,Z_0 q^2\gamma\,\zeta\int_\zeta^{\infty}K_{5/3}(\xi)\,d\xi,\qquad\text{where}\quad \zeta \equiv \frac{2}{3}\,\frac{\omega}{\omega_c\gamma^3}. \tag{10.71}$$
Figure 9b shows the dependence of this integral on the normalized frequency ζ. (This plot is sometimes
called the “universal flux curve”.) In accordance with the estimate (50), it reaches the maximum at
$$\zeta_{\max} \approx 0.3,\qquad\text{i.e.}\quad \omega_{\max} \approx \frac{\omega_c}{2}\,\gamma^3. \tag{10.72}$$
22 The law of the spectral density decrease at large ν may be readily obtained from the second of Eqs. (2.158),
which is valid for any (even non-integer) Bessel function index n: a_x ∝ a_y ∝ ν^{–1/2}exp{–ν}. Here the
exponential factor is certainly the most important one.
23 For that, and many other details, the interested reader may be referred, for example, to the fundamental review
collection by E. Koch et al. (eds.) Handbook on Synchrotron Radiation (in 5 vols.), North-Holland, 1983-1991, or
to a more concise monograph by A. Hofmann, The Physics of Synchrotron Radiation, Cambridge U. Press, 2007.
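As a sanity check on Eq. (71), one may integrate it over all frequencies and compare the result with the energy PT lost per revolution that follows from Eq. (45). The sketch below is not part of the original derivation: it assumes the prefactor of Eq. (71) written via Z0 (as dimensional analysis requires), takes β → 1 (so that c/R = ωc), and uses the standard table integral ∫₀^∞ x^μ K_ν(x) dx = 2^{μ–1} Γ((1+μ+ν)/2) Γ((1+μ–ν)/2), together with Γ(1/3)Γ(2/3) = 2π/√3.

```latex
\begin{aligned}
\int_0^\infty d\omega \oint_{4\pi} I(\omega)\,d\Omega
 &= \frac{\sqrt{3}}{4\pi}\,Z_0 q^2\gamma\cdot\frac{3}{2}\,\gamma^3\omega_c
    \int_0^\infty \zeta\,d\zeta \int_\zeta^\infty K_{5/3}(\xi)\,d\xi \\
 &= \frac{3\sqrt{3}}{8\pi}\,Z_0 q^2\gamma^4\omega_c
    \int_0^\infty \frac{\xi^2}{2}\,K_{5/3}(\xi)\,d\xi
  = \frac{3\sqrt{3}}{8\pi}\,Z_0 q^2\gamma^4\omega_c\;
    \Gamma\!\left(\tfrac{7}{3}\right)\Gamma\!\left(\tfrac{2}{3}\right) \\
 &= \frac{3\sqrt{3}}{8\pi}\cdot\frac{8\pi}{9\sqrt{3}}\;Z_0 q^2\gamma^4\omega_c
  = \frac{Z_0 q^2}{3}\,\gamma^4\omega_c , \\[4pt]
P\,T &= \frac{Z_0 q^2}{6\pi}\,\gamma^4\omega_c^2\cdot\frac{2\pi}{\omega_c}
  = \frac{Z_0 q^2}{3}\,\gamma^4\omega_c .
\end{aligned}
```

The two expressions coincide, so the spectrum (71) carries exactly the energy radiated per turn in the ultra-relativistic limit.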
For example, in the National Synchrotron Light Source (NSLS-II) in the Brookhaven National
Laboratory near our SBU campus, with its ring’s circumference of 792 m, the electron revolution period
T is 2.64 μs. With ωc = 2π/T ≈ 2.4×10⁶ s⁻¹, for the achieved γ ≈ 6×10³ (ℰ ≈ 3 GeV), we get ωmax ~
3×10¹⁷ s⁻¹, i.e. the photon energy ħωmax ~ 200 eV corresponding to soft X-rays. In light of this estimate,
the reader may be surprised by Fig. 10, which shows the calculated spectra of the radiation that this
facility was designed to produce, with the intensity maxima at photon energies up to a few keV.
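The arithmetic behind this estimate can be spelled out in a short script. It is a sketch only: the physical constants are rounded values supplied here, the function names are illustrative, and the ring is idealized as a circle with the quoted circumference (the next paragraph explains why the real machine differs).

```python
import math

# Rounded SI constants (assumed values, not quoted in the text).
C = 299792458.0              # speed of light, m/s
HBAR = 1.054571817e-34       # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19   # fundamental charge, C


def revolution_period(circumference_m):
    """Revolution period of an ultra-relativistic particle (u ~ c)."""
    return circumference_m / C


def peak_photon_energy_ev(gamma, omega_c):
    """Photon energy at the spectrum maximum, Eq. (72): omega_max ~ omega_c * gamma^3 / 2."""
    omega_max = 0.5 * omega_c * gamma**3
    return HBAR * omega_max / E_CHARGE


if __name__ == "__main__":
    period = revolution_period(792.0)   # ring circumference from the text
    omega_c = 2.0 * math.pi / period
    print(f"T = {period * 1e6:.2f} us, omega_c = {omega_c:.2e} 1/s")
    print(f"peak photon energy = {peak_photon_energy_ev(6e3, omega_c):.0f} eV")
```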
Fig. 10.10. Design brightness of various synchrotron radiation sources of the NSLS-II facility.
For the bend magnets and wigglers, the “brightness” may be obtained by multiplication of the
one-pulse spectral density I(ω) calculated above, by the number of electrons passing the source
per second. (Note the non-SI units used by the synchrotron radiation community.) However, for
undulators, there is an additional factor due to the partial coherence of radiation – see below.
(Adapted from the document NSLS-II Source Properties and Floor Layout that was available
online at https://www.bnl.gov/ps/docs/pdf/SourceProperties.pdf in 2011-2020.)
The reason for this discrepancy is that in the NSLS-II, and in all modern synchrotron light
sources, most radiation is produced not by the circular orbit itself (which is, by the way, not exactly
circular, but consists of a series of straight and bend-magnet sections), but by such bend sections, and
the devices called wigglers and undulators: strings of several strong magnets with alternating field
direction (Fig. 11), that induce periodic bending (“wiggling”) of the electron’s trajectory, with the
synchrotron radiation emitted at each bend.
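[Fig. 10.11: a string of magnets with alternating field direction and the spatial period λu; labels: electrons, X-ray beam.]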
The difference between the wigglers and the undulators is more quantitative than qualitative: the
former devices have a larger spatial period λu (the distance between the adjacent magnets of the same
polarity, see Fig. 11), giving enough space for the electron beam to bend by an angle larger than γ⁻¹, i.e.
larger than the radiation cone’s width. As a result, the radiation reaches an in-plane observer as a
periodic sequence of individual pulses – see Fig. 12a.
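[Fig. 10.12: field versus time t for (a) a wiggler, with separate pulses spaced by Δt ≈ λu/2γ²c, and (b) an undulator, with overlapping pulses.]
Fig. 10.12. Waveforms of the radiation emitted by
(a) a wiggler and (b) an undulator – schematically.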
The shape of each pulse, and hence its frequency spectrum, are essentially similar to those
discussed above,²⁴ but with much higher local values of ωc and hence ωmax – see Fig. 10. Another
difference is a much higher frequency of the pulses. Indeed, the fundamental Eq. (16) allows us to
calculate the time distance between them, for the observer, as
$$\Delta t \approx \frac{\partial t}{\partial t_{\rm ret}}\,\Delta t_{\rm ret} \approx (1-\beta)\,\frac{\lambda_u}{u} \approx \frac{1}{2\gamma^2}\,\frac{\lambda_u}{c} \ll \frac{\lambda_u}{c}, \tag{10.73}$$
where the first two relations are valid at λu << R (the relation typically satisfied very well, see the
numbers in Fig. 10), and the last two relations assume the ultra-relativistic limit. As a result, the
radiation intensity, which is proportional to the number of poles, is much higher than that from the bend
magnets – see Fig. 10 again.
24 Indeed, the period λu is typically a few centimeters (see the numbers in Fig. 10), i.e. is much larger than the
interval Δr’ ~ R/γ³ estimated above. Hence the synchrotron radiation results may be applied locally, to each
electron beam’s bend. (In this context, a simple problem for the reader: use Eqs. (19) and (63) to explain the
difference between shapes of the in-plane electric field pulses emitted at opposite magnetic poles of the wiggler,
which is schematically shown in Fig. 12a.)
The situation is different in undulators – similar structures with a smaller spatial period λu, in
which the electron’s velocity vector oscillates with an angular amplitude smaller than γ⁻¹. As a result,
the radiation pulses overlap (Fig. 12b), and the radiation waveform is closer to the sinusoidal one.
Consequently, the radiation spectrum narrows to the central frequency²⁵
$$\omega_0 = \frac{2\pi}{\Delta t} \approx 2\gamma^2\,\frac{2\pi c}{\lambda_u}. \tag{10.74}$$
For example, for the NSLS-II undulators with λu = 2 cm, this formula predicts a radiation peak at
photon energy ħω0 ≈ 4 keV, in reasonable agreement with the quantitative calculation results shown in
Fig. 10.²⁶ Due to the spectrum narrowing, the undulator’s radiation intensity is higher than that of
wigglers using the same electron beam.
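A short numerical sketch of this estimate is given below, together with the alternative route via length contraction and the Doppler shift. The constants are rounded values supplied here, the function names are illustrative, and γ = 6×10³ is the value quoted above for NSLS-II.

```python
import math

# Rounded SI constants (assumed values, not quoted in the text).
C = 299792458.0              # speed of light, m/s
HBAR = 1.054571817e-34       # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19   # fundamental charge, C


def undulator_omega0(gamma, period_m):
    """Central frequency from Eq. (74): omega_0 ~ 2 * gamma^2 * (2*pi*c / lambda_u)."""
    return 2.0 * gamma**2 * 2.0 * math.pi * C / period_m


def undulator_omega0_doppler(gamma, period_m):
    """Same quantity via length contraction followed by the longitudinal Doppler shift."""
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    omega_moving_frame = 2.0 * math.pi * gamma * C / period_m
    return omega_moving_frame * math.sqrt((1.0 + beta) / (1.0 - beta))


def photon_energy_ev(omega):
    """Photon energy hbar*omega expressed in eV."""
    return HBAR * omega / E_CHARGE


if __name__ == "__main__":
    gamma, period = 6e3, 0.02
    print(f"Eq. (74):      {photon_energy_ev(undulator_omega0(gamma, period)):.0f} eV")
    print(f"Doppler route: {photon_energy_ev(undulator_omega0_doppler(gamma, period)):.0f} eV")
```

Both routes give a photon energy of a few keV; they differ only by the factor (1 + β)/2, which is indistinguishable from 1 at such γ.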
This spectrum-narrowing trend is brought to its logical conclusion in the so-called free-electron
lasers27 whose basic structure is the same as that of wigglers and undulators (Fig. 11), but the radiation
at each beam bend is so intense and narrow-focused that it affects the electron motion downstream of the
radiation cone. As a result, the radiation spectrum narrows around the central frequency (74), and its
power grows as a square of the number N of electrons in the structure (rather than proportionately to N
in wigglers and undulators).
Finally, note that wigglers, undulators, and free-electron lasers may be also used at the end of a
linear electron accelerator (such as SLAC) which, as was noted above, may provide extremely high
values of γ, and hence radiation frequencies, due to the smallness of radiation energy losses at the
electron acceleration stage. Very unfortunately, I do not have time/space to discuss the (very interesting)
physics of these devices in more detail.²⁸
25 This important formula may be also derived in the following way. Due to the relativistic length contraction
(9.20), the undulator structure period as perceived by beam electrons is λ’ = λu/γ, so the central frequency of the
radiation in the reference frame moving with the electrons is ω0’ = 2πc/λ’ = 2πγc/λu. For the lab-frame observer,
this frequency is Doppler-upshifted in accordance with Eq. (9.44): ω0 = ω0’[(1 + β)/(1 – β)]^{1/2} ≈ 2γω0’, giving the
same result as Eq. (74).
26 Some of the difference is due to the fact that those plots show the spectral density of the number of photons n =
ℰ/ħω per second, which peaks at a frequency below that of the density of power, i.e. of the energy ℰ per second.
27 This name is somewhat misleading, because in contrast to the usual (“quantum”) lasers, a free-electron laser is
essentially a classical device, and the dynamics of electrons in it is very similar to that in vacuum-tube microwave
generators, such as the magnetrons briefly discussed in Sec. 9.6.
28 The interested reader may be referred, for example, to either P. Luchini and H. Motz, Undulators and Free-electron
Lasers, Oxford U. Press, 1990; or E. Saldin et al., The Physics of Free Electron Lasers, Springer, 2000.
effect, traditionally called by its German name bremsstrahlung (“brake radiation”), is responsible, in
particular, for the continuous part of the frequency spectrum of the radiation produced in standard
vacuum X-ray tubes, at the electron collisions with a metallic “anticathode”. 29
The bremsstrahlung in condensed matter is generally a rather complicated phenomenon because
of the simultaneous involvement of many particles, and (frequently) some quantum electrodynamic
effects. This is why I will give only a very brief glimpse at the theoretical description of this effect, for
the simplest case when the scattering of incoming, relatively light charges (such as electrons, protons,
α-particles, etc.) is produced by atomic nuclei, which remain virtually immobile during the scattering
event (Fig. 13a). This is a reasonable approximation if the energy of incoming particles is not too low;
otherwise, most scattering is produced by atomic electrons whose dynamics is substantially quantum –
see below.
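[Fig. 10.13: (a) a particle with charge q and mass m, incident along the x-axis with the velocity βini at the impact parameter b, scattered by the center (q’, m’) at the origin 0 by the angle θ’ into the velocity βfin; (b) the vectors pini and pfin, and the momentum transfer 𝓆 between them.]
Fig. 10.13. The basic geometry of the bremsstrahlung and the Coulomb loss problems
in the (a) direct and (b) reciprocal spaces.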
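To calculate the frequency spectrum of the radiation emitted during a single scattering event, it is
convenient to use a byproduct of the last section’s analysis, namely Eq. (59) with the replacement (60):³⁰
$$I(\omega) = \frac{q^2}{4\pi\varepsilon_0}\,\frac{1}{4\pi^2 c}\left|\int_{-\infty}^{+\infty}\frac{d}{dt_{\rm ret}}\left[\frac{\mathbf{n}\times(\mathbf{n}\times\boldsymbol{\beta})}{1-\boldsymbol{\beta}\cdot\mathbf{n}}\right]_{\rm ret}\exp\left\{i\omega\left(t-\frac{\mathbf{n}\cdot\mathbf{r}'}{c}\right)\right\}_{\rm ret}dt_{\rm ret}\right|^2. \tag{10.75}$$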
A typical duration τ of a single scattering event we are discussing is of the order of a0/c ~ (10⁻¹⁰
m)/(3×10⁸ m/s) ~ 10⁻¹⁸ s in solids, and only an order of magnitude longer in gases at ambient conditions.
This is why for most frequencies of interest, from zero all the way up to at least soft X-rays,³¹ we can
use the so-called low-frequency approximation, approximating the exponent in Eq. (75) by 1 through the whole
collision event, i.e. the integration interval. This approximation immediately yields
Bremsstrahlung: single collision
$$I(\omega) = \frac{q^2}{4\pi\varepsilon_0}\,\frac{1}{4\pi^2 c}\left|\frac{\mathbf{n}\times(\mathbf{n}\times\boldsymbol{\beta}_{\rm fin})}{1-\boldsymbol{\beta}_{\rm fin}\cdot\mathbf{n}}-\frac{\mathbf{n}\times(\mathbf{n}\times\boldsymbol{\beta}_{\rm ini})}{1-\boldsymbol{\beta}_{\rm ini}\cdot\mathbf{n}}\right|^2. \tag{10.76}$$
29 Such X-ray radiation had been first observed experimentally (though not correctly interpreted) by N. Tesla in
1887, i.e. before it was rediscovered and studied in detail by W. Röntgen.
30 In publications on this topic (whose development peak was in the 1920s-1930s), the Gaussian units are more
common, and the uppercase letter Z is usually reserved for expressing charges as multiples of the fundamental
charge e, rather than for the wave impedance. This is why, in order to avoid confusion and facilitate the
comparison with other texts, in this section I (while still staying with the SI units used throughout my series) will
use the fraction 1/ε0c, instead of its equivalent Z0, for the free-space wave impedance, and write the coefficients in
a form that makes the transfer to the Gaussian units elementary: it is sufficient to replace all (qq’/4πε0)SI with
(qq’)Gaussian. In the (rare) cases when I spell out the charge values, I will use a different font: q ≡ Ze, q’ ≡ Z’e.
31 A more careful analysis shows that this approximation is actually quite reasonable up to much higher
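In the non-relativistic limit (βini, βfin << 1), this formula is reduced to the following result:
$$I(\omega) = \frac{q^2}{4\pi\varepsilon_0}\,\frac{1}{4\pi^2 c}\,\frac{\mathscr{q}^2}{m^2c^2}\,\sin^2\theta \tag{10.77}$$
(which may be derived from Eq. (8.27) as well), where 𝓆 is the momentum transferred from the scattering
center to the scattered charge (Fig. 13b):³²
$$\boldsymbol{\mathscr{q}} \equiv \mathbf{p}_{\rm fin}-\mathbf{p}_{\rm ini} = mc\left(\boldsymbol{\beta}_{\rm fin}-\boldsymbol{\beta}_{\rm ini}\right), \tag{10.78}$$
and θ (not to be confused with the particle scattering angle θ’ shown in Fig. 13!) is the angle between
the vector 𝓆 and the direction n toward the observer – at the collision moment.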
The most important feature of the result (77)-(78) is the frequency-independent (“white”)
spectrum of the radiation, very typical for any rapid pulses that may be approximated as delta functions
of time.³³ (Note, however, that Eq. (77) implies a fixed value of 𝓆, so the statistics of this parameter, to
be discussed in a minute, may “color” the radiation.)
Note also the “doughnut-shaped” angular distribution of the radiation, typical for non-relativistic
systems, with the symmetry axis directed along the momentum transfer vector 𝓆. In particular, this
means that in typical cases when θ’ << 1, i.e. 𝓆 << p, when the vector 𝓆 is nearly normal to the vector
pini (see, e.g., the example shown in Fig. 13b), the bremsstrahlung produces a significant radiation flow
in the direction back to the particle source – the fact significant for the operation of X-ray tubes.
Now integrating Eq. (77) over all wave propagation angles, just as we did for the instant
radiation power in Sec. 8.2, we get the following spectral density of the particle energy loss,
$$-\frac{d\mathscr{E}}{d\omega} = \oint_{4\pi} I(\omega)\,d\Omega = \frac{2}{3\pi c}\,\frac{q^2}{4\pi\varepsilon_0}\,\frac{\mathscr{q}^2}{m^2c^2}. \tag{10.79}$$
In most applications of the bremsstrahlung theory (as in most scattering problems³⁴), the impact
parameter b (Fig. 13a), and hence the scattering angle θ’ and the transferred momentum 𝓆, have to be
32 Please note the font-marked difference between this variable (𝓆) and the particle’s electric charge (q).
33 This is the basis, in particular, of the so-called High-Harmonic Generation (HHG) effect, discovered in 1977,
which takes place at the irradiation of gases by intensive laser beams. The high electric field of the beam strips
electrons from atoms, and accelerates them away from the remaining ions, just to slam them back into the same
ions as the field’s polarity changes in time. The electrons change their momentum sharply during their
recombination with the ions, resulting in bremsstrahlung-like emission of short radiation pulses. The spectrum of
radiation from each such pulse obeys Eq. (77), but since the ionization/acceleration/recombination cycles repeat
periodically with the frequency ω0 of the laser field, the final spectrum consists of many equidistant lines, with
frequencies nω0. The classical theory of the bremsstrahlung does not give a cutoff ωmax = nmaxω0 of the spectrum;
such a limit is imposed by quantum mechanics: max max + 3Ep, where the so-called ponderomotive energy Ep
= (eE0/ω0)²/4me is the average kinetic energy given to a free electron by the periodic electric field of the laser
beam, with amplitude E0 – see, for example, M. Lewenstein et al., Phys. Rev. A 49, 2117 (1994). In practice, the
HHG pulses may be shorter than 10⁻¹⁵ s, and nmax as high as ~100, enabling numerous applications of this effect.
34 See, e.g., CM Sec. 3.5 and QM Sec. 3.3.
considered random. For elastic (βini = βfin ≡ β) Coulomb collisions we can use the so-called Rutherford
formula for the differential cross-section of scattering³⁵
$$\frac{d\sigma}{d\Omega'} = \left(\frac{qq'}{4\pi\varepsilon_0}\right)^2\left(\frac{1}{2pc\beta}\right)^2\frac{1}{\sin^4(\theta'/2)}. \tag{10.80}$$
Here dσ = 2πbdb is the elementary area of the sample cross-section (as visible from the direction of the
incident particles) corresponding to their scattering into an elementary solid angle³⁶
and then plug, instead of 𝓆max and 𝓆min, the scales of the most important effects limiting the range of the
transferred momentum’s magnitude. In the classical-mechanics analysis, according to Eq. (82), 𝓆max = 2p
≡ 2mu. To estimate 𝓆min, let us note that the very small momentum transfer takes place when the impact
parameter b is very large, and hence the effective scattering time τ ~ b/u is very long. Recalling the
condition of the low-frequency approximation, we may associate 𝓆min with ω ~ 1/τ and hence with b ~
35 See, e.g., CM Eq. (3.73) with α = qq’/4πε0. In the form used in Eq. (80), the Rutherford formula is also valid
for the small-angle scattering of relativistic particles, the criterion being << 2/.
36 Again, the angle θ’ and the differential dΩ’, describing the scattered particles (see Fig. 13) should not be
confused with the parameters θ and dΩ describing the radiation emitted at the scattering event.
uτ ~ u/ω. Since for the small scattering angles, 𝓆 is close to the impulse Fτ ~ (qq’/4πε0b²)τ of the
Coulomb force, we get the estimate 𝓆min ~ (qq’/4πε0)ω/u², and Eq. (85) should be used with
Classical bremsstrahlung
$$\ln\frac{\mathscr{q}_{\max}}{\mathscr{q}_{\min}} \approx \ln\left(\frac{2mu^3}{\omega}\Big/\frac{qq'}{4\pi\varepsilon_0}\right). \tag{10.86}$$
This is Bohr’s formula for what is called the classical bremsstrahlung. We see that the low
momentum cutoff indeed makes the spectrum slightly colored, with more energy going to lower
frequencies. There is even a formal divergence at ω → 0; however, this divergence is integrable, so it
does not present a problem for finding the total energy radiative losses (–dℰ/dx) as an integral of Eq. (86)
over all radiated frequencies ω. A larger problem for this procedure is the upper integration limit,
ω → ∞, at which the integral diverges. This means that our approximate description, which considers the
collision as an elastic process, becomes invalid and needs to be amended by taking into account the
difference between the initial and final kinetic energies of the particle due to radiation of the energy
quantum ħω of the emitted photon, so
$$\frac{p_{\rm ini}^2}{2m}-\frac{p_{\rm fin}^2}{2m} = \hbar\omega,\qquad\text{i.e.}\quad \frac{p_{\rm ini}^2}{2m} = \mathscr{E},\quad \frac{p_{\rm fin}^2}{2m} = \mathscr{E}-\hbar\omega. \tag{10.87}$$
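As a result, taking into account that the minimum and maximum values of 𝓆 correspond to, respectively,
the parallel and antiparallel alignments of the vectors pini and pfin, we get
Quantum bremsstrahlung
$$\ln\frac{\mathscr{q}_{\max}}{\mathscr{q}_{\min}} = \ln\frac{p_{\rm ini}+p_{\rm fin}}{p_{\rm ini}-p_{\rm fin}} \equiv \ln\frac{(p_{\rm ini}+p_{\rm fin})^2/2m}{(p_{\rm ini}^2-p_{\rm fin}^2)/2m} = \ln\frac{\left[\mathscr{E}^{1/2}+(\mathscr{E}-\hbar\omega)^{1/2}\right]^2}{\hbar\omega}. \tag{10.88}$$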
Plugged into Eq. (85), this expression yields the so-called Bethe-Heitler formula for quantum
bremsstrahlung.³⁷ Note that in this approach, 𝓆max is close to that of the classical approximation, but 𝓆min
is of the order of ħω/u, so
$$\frac{\mathscr{q}_{\min}\big|_{\rm classical}}{\mathscr{q}_{\min}\big|_{\rm quantum}} \sim \frac{\alpha ZZ'}{\beta}, \tag{10.89}$$
where Z and Z’ are the particles’ charges in the units of e, and α is the dimensionless fine structure
(“Sommerfeld”) constant,
$$\alpha \equiv \left.\frac{e^2}{4\pi\varepsilon_0\hbar c}\right|_{\rm SI} = \left.\frac{e^2}{\hbar c}\right|_{\rm Gaussian} \approx \frac{1}{137} \ll 1, \tag{10.90}$$
which is one of the basic notions of quantum mechanics.³⁸ Due to the smallness of the constant, the ratio
(89) is below 1 for most cases of practical interest, and since the integral of (84) over 𝓆 is limited by the
largest of all possible cutoffs 𝓆min, it is the Bethe-Heitler formula that should be used.
37 The modifications of this formula necessary for the relativistic description are surprisingly minor – see, e.g.,
Chapter 15 in J. Jackson, Classical Electrodynamics, 3rd ed., Wiley 1999. For even more detail, the standard
reference monograph on bremsstrahlung is W. Heitler, The Quantum Theory of Radiation, 3rd ed., Oxford U. Press
1954 (reprinted in 1984 and 2010 by Dover).
38 See, e.g., QM Secs. 4.4, 6.3, 6.4, 9.3, 9.5, and 9.7.
Now nothing prevents us from calculating the total radiative losses of energy per unit length:
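$$-\frac{d\mathscr{E}}{dx} = \int_0^{\omega_{max}} \left(-\frac{d^2\mathscr{E}}{d\omega\,dx}\right) d\omega = \frac{16}{3}\, n\, \frac{q^2}{4\pi\varepsilon_0 c} \left(\frac{qq'}{4\pi\varepsilon_0 mc^2}\right)^2 \frac{1}{\beta^2} \int_0^{\omega_{max}} \ln\frac{\left[\mathscr{E}^{1/2} + (\mathscr{E} - \hbar\omega)^{1/2}\right]^2}{\hbar\omega}\, d\omega, \tag{10.91}$$
where $\hbar\omega_{max} = \mathscr{E}$ is the maximum energy of the radiation quantum. By introducing the dimensionless
integration variable $\xi \equiv \hbar\omega/\mathscr{E} = \hbar\omega/(mu^2/2)$, this integral is reduced to a table one,39 and we get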
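$$-\frac{d\mathscr{E}}{dx} = \frac{16}{3}\, n\, \frac{q^2}{4\pi\varepsilon_0 c} \left(\frac{qq'}{4\pi\varepsilon_0 mc^2}\right)^2 \frac{1}{\beta^2}\, \frac{mu^2}{\hbar} \equiv \frac{16}{3}\, n\, \frac{q'^2}{4\pi\varepsilon_0 \hbar c} \left(\frac{q^2}{4\pi\varepsilon_0}\right)^2 \frac{1}{mc^2}. \tag{10.92}$$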
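The "table" integral behind this step can be checked numerically. In the variable $\xi = \hbar\omega/\mathscr{E}$, the logarithm of Eq. (88) becomes $\ln\{[1 + (1 - \xi)^{1/2}]^2/\xi\}$, and the substitution $s = (1 - \xi)^{1/2}$ turns the integral over $0 < \xi < 1$ into $4\int_0^1 s\,\mathrm{artanh}\,s\,ds = 2$. The sketch below (with hypothetical helper names, and a simple midpoint rule chosen only because it never samples the singular endpoint) confirms this value.

```python
import math


def bremsstrahlung_log(xi):
    """Logarithm of Eq. (88) expressed via xi = hbar*omega/E, for 0 < xi < 1."""
    return math.log((1.0 + math.sqrt(1.0 - xi)) ** 2 / xi)


def midpoint_integral(f, a, b, n):
    """Composite midpoint rule; the endpoints are never evaluated, so the
    integrable logarithmic singularity at xi = 0 causes no trouble."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


value = midpoint_integral(bremsstrahlung_log, 0.0, 1.0, 200_000)
print(f"integral over xi from 0 to 1: {value:.5f}")
```

Since $d\omega = (\mathscr{E}/\hbar)\,d\xi$, the integral over frequency equals $2\mathscr{E}/\hbar = mu^2/\hbar$.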
Following my usual style, at this point I would give you an estimate of the losses for a typical
case; however, let me first discuss a parallel particle energy loss mechanism, the so-called Coulomb
losses, due to the transfer of mechanical impulse from the scattered particle to the scattering centers.
(This energy eventually goes into an increase of the thermal energy of the scattering medium, rather
than to the electromagnetic radiation.)
Using Eqs. (9.139) for the electric field of a linearly moving charge q, we can readily find the
momentum it transfers to the counterpart charge q’:40
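$$\Delta p' = \Delta p'_y = \int_{-\infty}^{+\infty} \dot{p}'_y\, dt = \int_{-\infty}^{+\infty} q'E_y\, dt = \frac{qq'}{4\pi\varepsilon_0} \int_{-\infty}^{+\infty} \frac{\gamma b}{\left(b^2 + \gamma^2u^2t^2\right)^{3/2}}\, dt = \frac{qq'}{4\pi\varepsilon_0}\, \frac{2}{bu}. \tag{10.93}$$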
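Hence, the kinetic energy acquired by the scattering particle (and hence the loss $-\Delta\mathscr{E}$ of the energy
$\mathscr{E}$ of the incident particle) is
$$-\Delta\mathscr{E} = \frac{(\Delta p')^2}{2m'} = \left(\frac{qq'}{4\pi\varepsilon_0}\right)^2 \frac{2}{m'u^2b^2}. \tag{10.94}$$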
Such elementary energy losses have to be summed up over all collisions, with random values of
the impact parameter b. At the scattering center density n, the number of collisions per small path length
dx per small range db is dN = n 2πbdb dx, so
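$$-\frac{d\mathscr{E}}{dx} = -\int \Delta\mathscr{E}\, \frac{dN}{dx} = n \left(\frac{qq'}{4\pi\varepsilon_0}\right)^2 \frac{2}{m'u^2}\, 2\pi \int_{b_{min}}^{b_{max}} \frac{db}{b} = 4\pi n \left(\frac{qq'}{4\pi\varepsilon_0}\right)^2 \frac{\ln B}{m'u^2}, \quad \text{where } B \equiv \frac{b_{max}}{b_{min}}. \tag{10.95}$$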
Here, at the last step, the logarithmic integral over b was treated similarly to that over q in the
bremsstrahlung theory. This approximation is adequate because the ratio $b_{max}/b_{min}$ is much larger than 1.
Indeed, $b_{min}$ may be estimated from $(\Delta p')_{max} \sim p = mu$. For this value, Eq. (93) with $q' \sim q$ gives
$b_{min} \sim r_c$ (see Eq. (8.41) and its discussion), which, for elementary particles, is of the order of $10^{-15}$ m. On the
other hand, for the most important case when the Coulomb energy absorbers are electrons (which,
according to Eq. (94), are the most efficient ones, due to their very low mass m’), $b_{max}$ may be estimated
from the condition $\tau = b/\gamma u \sim 1/\omega_{min}$, where $\omega_{min} \sim 10^{16}$ s$^{-1}$ is the characteristic frequency of electron
transitions in atoms. (Quantum mechanics forbids such energy transfer at lower frequencies.) From here,
we have the estimate $b_{max} \sim \gamma u/\omega_{min}$, so
$$B \equiv \frac{b_{max}}{b_{min}} \sim \frac{\gamma u}{r_c\, \omega_{min}}, \tag{10.96}$$
for $\gamma \sim 1$ and $u \sim c \approx 3\times10^8$ m/s giving $b_{max} \sim 3\times10^{-8}$ m, so $B \sim 10^9$ (give or take a couple of orders of
magnitude – this does not change the estimate $\ln B \approx 20$ too much). 41
Now we can compare the non-radiative Coulomb losses (95) with the radiative losses due to the
bremsstrahlung, given by Eq. (92):
$$\frac{(-d\mathscr{E})_{radiation}}{(-d\mathscr{E})_{Coulomb}} \sim \alpha ZZ'\, \frac{m'}{m}\, \beta^2\, \frac{1}{\ln B}. \tag{10.97}$$
Since $\alpha \sim 10^{-2} \ll 1$, for non-relativistic particles ($\beta \ll 1$) the bremsstrahlung losses of energy are much
lower (that is why I did not want to rush with their estimates), and only for ultra-relativistic particles, the
relation may be opposite.
According to Eqs. (95)-(96), for electron-electron scattering ($q = q' = -e$, $m = m' = m_e$),42 at the
value $n = 6\times10^{26}$ m$^{-3}$ typical for air at ambient conditions, the characteristic length of energy loss,
$$l_c \equiv \frac{\mathscr{E}}{(-d\mathscr{E}/dx)}, \tag{10.98}$$
for electrons with kinetic energy $\mathscr{E}$ = 6 keV is close to $2\times10^{-4}$ m $\equiv$ 0.2 mm. (This is why we need high
vacuum in electron microscope columns and other vacuum electron devices.) Since $l_c \propto \mathscr{E}^2$, more
energetic particles penetrate deeper into matter, until the bremsstrahlung steps in and limits this trend at
very high energies.
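The numerical estimate above is easy to reproduce. The sketch below evaluates Eqs. (95) and (98) for electron-electron scattering, using $mu^2 = 2\mathscr{E}$ for a non-relativistic particle; the density $n$ and the value $\ln B \approx 20$ are the ones quoted in the text, while the function names are hypothetical and the constants are typed in by hand.

```python
import math

E_CHARGE = 1.602176634e-19                      # elementary charge, C
EPS0 = 8.8541878128e-12                         # electric constant, F/m
COULOMB = E_CHARGE**2 / (4 * math.pi * EPS0)    # e^2/(4 pi eps0), J*m


def coulomb_loss_rate(kinetic_energy, n, ln_b):
    """-dE/dx of Eq. (95) for q = q' = -e and m = m', with m*u^2 = 2E
    (non-relativistic kinetic energy, in joules)."""
    m_u_squared = 2.0 * kinetic_energy
    return 4 * math.pi * n * COULOMB**2 * ln_b / m_u_squared


def loss_length(kinetic_energy, n, ln_b):
    """Characteristic energy loss length l_c = E / (-dE/dx), Eq. (98)."""
    return kinetic_energy / coulomb_loss_rate(kinetic_energy, n, ln_b)


n_air = 6e26               # scattering center density quoted in the text, 1/m^3
ln_b = 20.0                # estimate ln B ~ 20 following Eq. (96)
energy = 6e3 * E_CHARGE    # 6 keV expressed in joules

print(f"l_c = {loss_length(energy, n_air, ln_b):.2e} m")
print(f"l_c(2E)/l_c(E) = {loss_length(2 * energy, n_air, ln_b) / loss_length(energy, n_air, ln_b):.1f}")
```

The second printed number illustrates the scaling $l_c \propto \mathscr{E}^2$ (at fixed $\ln B$): doubling the energy quadruples the loss length.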
41 A quantum analysis (carried out by Hans Bethe in 1940) replaces, in Eq. (95), $\ln B$ with $\ln(2\gamma^2mu^2/\hbar\langle\omega\rangle) - \beta^2$,
where $\langle\omega\rangle$ is the average frequency of the atomic quantum transitions weighted by their oscillator strength. This
refinement does not change the estimate given below. Note that both the classical and quantum formulas describe
a fast increase (as $1/\beta^2$) of the energy loss rate $(-d\mathscr{E}/dx)$ at $\gamma \to 1$, and its slow increase (as $\ln\gamma$) at $\gamma \to \infty$, so the
losses have a minimum at $(\gamma - 1) \sim 1$.
42 Actually, the above analysis has neglected the change of momentum of the incident particle. This is legitimate
at m’ << m, but for m = m’ the change approximately doubles the energy losses. Still, this does not change the
order of magnitude of the estimate.
$$nb^3 \gg 1, \tag{10.99}$$
and the treatment of Coulomb collisions as a set of independent events is inadequate. However, this
condition enables the opposite approach: treating the medium as a continuum. In the time-domain
formulation used in the previous sections of this chapter, this would be a very complex problem,
because it would require an explicit description of the medium dynamics. Here the frequency-domain
approach, based on the Fourier transform in both time and space, helps a lot, provided that the functions
$\varepsilon(\omega)$ and $\mu(\omega)$ are considered known – either calculated or taken from experiment. Let us have a good
look at this approach because it gives some interesting (and practically important) results.
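In Chapter 6, we have used the macroscopic Maxwell equations to derive Eqs. (6.118), which
describe the time evolution of electrodynamic potentials in a linear medium with frequency-independent
$\varepsilon$ and $\mu$. Looking for all functions participating in Eqs. (6.118) in the plane-wave expansion form43
$$f(\mathbf{r}, t) = \int d^3k\, d\omega\, f_{\mathbf{k},\omega}\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \tag{10.100}$$
and requiring all coefficients at similar exponents to be balanced, we get their Fourier images: 44
$$\left(k^2 - \omega^2\varepsilon\mu\right)\phi_{\mathbf{k},\omega} = \frac{\rho_{\mathbf{k},\omega}}{\varepsilon}, \qquad \left(k^2 - \omega^2\varepsilon\mu\right)\mathbf{A}_{\mathbf{k},\omega} = \mu\,\mathbf{j}_{\mathbf{k},\omega}. \tag{10.101}$$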
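As was discussed in Chapter 7, in such a Fourier form, the macroscopic Maxwell theory remains valid
even for dispersive (but isotropic and linear!) media, so Eqs. (101) may be generalized as
$$\left[k^2 - \omega^2\varepsilon(\omega)\mu(\omega)\right]\phi_{\mathbf{k},\omega} = \frac{\rho_{\mathbf{k},\omega}}{\varepsilon(\omega)}, \qquad \left[k^2 - \omega^2\varepsilon(\omega)\mu(\omega)\right]\mathbf{A}_{\mathbf{k},\omega} = \mu(\omega)\,\mathbf{j}_{\mathbf{k},\omega}. \tag{10.102}$$
The formal solution of these equations is elementary:
$$\phi_{\mathbf{k},\omega} = \frac{\rho_{\mathbf{k},\omega}}{\varepsilon(\omega)\left[k^2 - \omega^2\varepsilon(\omega)\mu(\omega)\right]}, \qquad \mathbf{A}_{\mathbf{k},\omega} = \frac{\mu(\omega)\,\mathbf{j}_{\mathbf{k},\omega}}{k^2 - \omega^2\varepsilon(\omega)\mu(\omega)}, \tag{10.103}$$
so the “only” remaining things to do are, first, to calculate the Fourier transforms of the functions $\rho(\mathbf{r}, t)$
and $\mathbf{j}(\mathbf{r}, t)$, describing stand-alone charges and currents, using the transform reciprocal to Eq. (100), with
one factor $1/2\pi$ per each scalar dimension,
$$f_{\mathbf{k},\omega} = \frac{1}{(2\pi)^4} \int d^3r\, dt\, f(\mathbf{r}, t)\, e^{-i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \tag{10.104}$$
and then to carry out the integration (100) of Eqs. (103).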
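For our problem of a single charge q uniformly moving through a medium with velocity u,
$$\rho(\mathbf{r}, t) = q\,\delta(\mathbf{r} - \mathbf{u}t), \qquad \mathbf{j}(\mathbf{r}, t) = q\mathbf{u}\,\delta(\mathbf{r} - \mathbf{u}t), \tag{10.105}$$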
43All integrals here and below are in infinite limits unless specified otherwise.
44 As was discussed in Sec. 7.2, the Ohmic conductivity $\sigma$ of the medium (generally, also a function of frequency)
may be readily incorporated into the dielectric permittivity: $\varepsilon(\omega) \to \varepsilon_{ef}(\omega) \equiv \varepsilon(\omega) + i\sigma(\omega)/\omega$. In this section, I will assume
that such incorporation, which is especially natural for high frequencies, has been performed, so the current
density $\mathbf{j}(\mathbf{r}, t)$ describes only stand-alone currents – for example, the current (105) of the incident particle.
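the transform (104) yields
$$\rho_{\mathbf{k},\omega} = \frac{q}{(2\pi)^4} \int d^3r\, dt\, \delta(\mathbf{r} - \mathbf{u}t)\, e^{-i(\mathbf{k}\cdot\mathbf{r} - \omega t)} = \frac{q}{(2\pi)^4} \int dt\, e^{i(\omega t - \mathbf{k}\cdot\mathbf{u}t)} = \frac{q}{(2\pi)^3}\, \delta(\omega - \mathbf{k}\cdot\mathbf{u}). \tag{10.106}$$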
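Since the expressions (105) for $\rho(\mathbf{r}, t)$ and $\mathbf{j}(\mathbf{r}, t)$ differ only by a constant factor u, it is clear that the
absolutely similar calculation for the current gives
$$\mathbf{j}_{\mathbf{k},\omega} = \frac{q\mathbf{u}}{(2\pi)^3}\, \delta(\omega - \mathbf{k}\cdot\mathbf{u}). \tag{10.107}$$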
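Let us summarize what we have got by now, by plugging Eqs. (106)-(107) into Eqs. (103):
$$\phi_{\mathbf{k},\omega} = \frac{1}{(2\pi)^3}\, \frac{q\,\delta(\omega - \mathbf{k}\cdot\mathbf{u})}{\varepsilon(\omega)\left[k^2 - \omega^2\varepsilon(\omega)\mu(\omega)\right]}, \qquad \mathbf{A}_{\mathbf{k},\omega} = \frac{1}{(2\pi)^3}\, \frac{\mu(\omega)\,q\mathbf{u}\,\delta(\omega - \mathbf{k}\cdot\mathbf{u})}{k^2 - \omega^2\varepsilon(\omega)\mu(\omega)} = \varepsilon(\omega)\mu(\omega)\,\mathbf{u}\,\phi_{\mathbf{k},\omega}. \tag{10.108}$$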
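Now, at the last calculation step, namely the integration (100), we are starting to pay a heavy
price for the easiness of the first steps. This is why we should think carefully about what exactly we need from it.
First of all, for the calculation of power losses, the electric field is more convenient to use than the
potentials, so let us calculate the Fourier images of E and B. Plugging the expansion (100) into the basic
relations (6.7), and again requiring the balance of the exponents’ coefficients, we get
$$\mathbf{E}_{\mathbf{k},\omega} = -i\mathbf{k}\,\phi_{\mathbf{k},\omega} + i\omega\mathbf{A}_{\mathbf{k},\omega} = i\left[\omega\varepsilon(\omega)\mu(\omega)\mathbf{u} - \mathbf{k}\right]\phi_{\mathbf{k},\omega}, \qquad \mathbf{B}_{\mathbf{k},\omega} = i\mathbf{k}\times\mathbf{A}_{\mathbf{k},\omega} = i\varepsilon(\omega)\mu(\omega)\,\mathbf{k}\times\mathbf{u}\,\phi_{\mathbf{k},\omega}, \tag{10.109}$$
so
$$\mathbf{E}(\mathbf{r}, t) = \int d^3k\, d\omega\, \mathbf{E}_{\mathbf{k},\omega}\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} = \frac{iq}{(2\pi)^3} \int d^3k\, d\omega\, \frac{\left[\omega\varepsilon(\omega)\mu(\omega)\mathbf{u} - \mathbf{k}\right]\delta(\omega - \mathbf{k}\cdot\mathbf{u})}{\varepsilon(\omega)\left[k^2 - \omega^2\varepsilon(\omega)\mu(\omega)\right]}\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}. \tag{10.110}$$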
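This formula may be rewritten as the temporal Fourier integral (51), with the following r-dependent
complex amplitude:
$$\mathbf{E}_\omega(\mathbf{r}) = \int \mathbf{E}_{\mathbf{k},\omega}\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d^3k = \frac{iq}{(2\pi)^3} \int \frac{\left[\omega\varepsilon(\omega)\mu(\omega)\mathbf{u} - \mathbf{k}\right]\delta(\omega - \mathbf{k}\cdot\mathbf{u})}{\varepsilon(\omega)\left[k^2 - \omega^2\varepsilon(\omega)\mu(\omega)\right]}\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d^3k. \tag{10.111}$$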
Let us calculate the Cartesian components of this partial Fourier image $\mathbf{E}_\omega$, at a point separated
by distance b from the particle’s trajectory. Selecting the coordinates and time origin as shown in Fig. 3,
we have r = {0, b, 0} and u = {u, 0, 0}, so only $E_x$ and $E_y$ are different from zero. In particular,
according to Eq. (111),
$$(E_x)_\omega = \frac{iq}{(2\pi)^3\varepsilon(\omega)} \int dk_x \int dk_y \int dk_z\, \frac{\omega\varepsilon(\omega)\mu(\omega)u - k_x}{k^2 - \omega^2\varepsilon(\omega)\mu(\omega)}\, \delta(\omega - k_xu)\, \exp\{ik_yb\}. \tag{10.112}$$
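The delta function kills one integral (over $k_x$) of the three, and we get
$$(E_x)_\omega = \frac{iq}{(2\pi)^3\varepsilon(\omega)u} \left[\omega\varepsilon(\omega)\mu(\omega)u - \frac{\omega}{u}\right] \int \exp\{ik_yb\}\, dk_y \int \frac{dk_z}{\omega^2/u^2 + k_y^2 + k_z^2 - \omega^2\varepsilon(\omega)\mu(\omega)}. \tag{10.113}$$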
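The internal integral (over $k_z$) may be readily reduced to the table integral $\int d\xi/(1 + \xi^2)$ in infinite limits,
equal to $\pi$,45 and the result represented as
$$(E_x)_\omega = -\frac{i\pi q\kappa^2}{(2\pi)^3\,\omega\varepsilon(\omega)} \int \frac{\exp\{ik_yb\}}{\left(k_y^2 + \kappa^2\right)^{1/2}}\, dk_y. \tag{10.114}$$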
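The later formulas of this section rely on the parameter $\kappa$ of Eq. (115) and on the field components expressed via the modified Bessel functions $K_0$ and $K_1$ in Eqs. (116)-(117). As a hedged sketch of that step (worked out here from Eqs. (111), (113)-(114) and the relation (125), so the normalization is an assumption to be checked against the full derivation), one may use the standard integral representation of $K_0$, valid for $\mathrm{Re}\,\kappa > 0$, and its derivative $dK_0(\xi)/d\xi = -K_1(\xi)$:

```latex
\kappa^2(\omega) \equiv \omega^2\left[\frac{1}{u^2} - \varepsilon(\omega)\mu(\omega)\right],
\qquad
\int_{-\infty}^{+\infty} \frac{\exp\{ik_y b\}}{\left(k_y^2 + \kappa^2\right)^{1/2}}\, dk_y = 2K_0(\kappa b),
\qquad
\int_{-\infty}^{+\infty} \frac{k_y \exp\{ik_y b\}}{\left(k_y^2 + \kappa^2\right)^{1/2}}\, dk_y = 2i\kappa K_1(\kappa b),

(E_x)_\omega = -\frac{iq\kappa^2}{(2\pi)^2\,\omega\varepsilon(\omega)}\, K_0(\kappa b),
\qquad
(E_y)_\omega = \frac{q\kappa}{(2\pi)^2\, u\,\varepsilon(\omega)}\, K_1(\kappa b).
```

The second integral is what the $y$-component of Eq. (111) reduces to after the same integrations over $k_x$ and $k_z$; its ratio to the first one is what later fixes the direction of the field at large $b$.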
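The energy loss per unit volume is the time integral of the dissipated power density,
$$-\frac{d\mathscr{E}}{dV} = \int \mathbf{j}\cdot\mathbf{E}\, dt, \tag{10.118}$$
where j is the current of the bound charges in the medium, and should not be confused with the stand-
alone incident-particle current (105). This integral may be readily expressed via the partial Fourier
image $\mathbf{E}_\omega$ and the similarly defined image $\mathbf{j}_\omega$, just as it was done at the derivation of Eq. (54):
$$-\frac{d\mathscr{E}}{dV} = \int dt \int d\omega\, e^{-i\omega t} \int d\omega'\, e^{-i\omega' t}\, \mathbf{j}_\omega\cdot\mathbf{E}_{\omega'} = 2\pi \int d\omega\, d\omega'\, \mathbf{j}_\omega\cdot\mathbf{E}_{\omega'}\, \delta(\omega + \omega') = 2\pi \int \mathbf{j}_\omega\cdot\mathbf{E}_{-\omega}\, d\omega. \tag{10.119}$$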
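Let us incorporate the effective Ohmic conductivity $\sigma_{ef}(\omega)$ into the complex permittivity $\varepsilon(\omega)$ just as
this was discussed in Sec. 7.2, using Eq. (7.46) to write
$$\mathbf{j}_\omega = \sigma_{ef}(\omega)\,\mathbf{E}_\omega = -i\omega\varepsilon(\omega)\,\mathbf{E}_\omega. \tag{10.120}$$
As a result, Eq. (119) yields
$$-\frac{d\mathscr{E}}{dV} = -2\pi i \int \varepsilon(\omega)\, \mathbf{E}_\omega\cdot\mathbf{E}_{-\omega}\, \omega\, d\omega = 4\pi \int_0^\infty \mathrm{Im}\,\varepsilon(\omega)\, \left|\mathbf{E}_\omega\right|^2 \omega\, d\omega. \tag{10.121}$$
(The last step was possible due to the property $\varepsilon(-\omega) = \varepsilon^*(\omega)$, which was discussed in Sec. 7.2.)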
Finally, just as in the last section, we have to average the energy loss rate over random values of
the impact parameter b:
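$$-\frac{d\mathscr{E}}{dx} = \int \left(-\frac{d\mathscr{E}}{dV}\right) d^2b = \int_{b_{min}}^\infty \left(-\frac{d\mathscr{E}}{dV}\right) 2\pi b\, db = 8\pi^2 \int_{b_{min}}^\infty b\, db \int_0^\infty \left(\left|E_x\right|_\omega^2 + \left|E_y\right|_\omega^2\right) \mathrm{Im}\,\varepsilon(\omega)\, \omega\, d\omega. \tag{10.122}$$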
Due to the (weak) divergence of the functions $K_0(\xi)$ and $K_1(\xi)$ at $\xi \to 0$, we have to cut the resulting
integral over b at some $b_{min}$ where our theory loses legitimacy. (In that limit, we are not doing much
better than in the past section.) Plugging in the calculated expressions (116) and (117) for the field
components, swapping the integrals over $\omega$ and b, and using the recurrence relations (2.142), which are
valid for all Bessel functions, we finally get:
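$$-\frac{d\mathscr{E}}{dx} = -\frac{q^2}{2\pi^2} \int_0^\infty \mathrm{Im}\left[\frac{\kappa^2\, (\kappa^* b_{min})\, K_1(\kappa^* b_{min})\, K_0(\kappa b_{min})}{\varepsilon(\omega)}\right] \frac{d\omega}{\omega}. \tag{10.123}$$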
This general result is valid for a linear medium with arbitrary dispersion relations $\varepsilon(\omega)$ and $\mu(\omega)$.
(The last function participates in Eq. (123) only via Eq. (115) that defines the parameter $\kappa$.) To get more
concrete results, some particular model of the medium should be used. Let us explore the Lorentz-
oscillator model that was discussed in Sec. 7.2, in its form (7.33) suitable for the transition to the
quantum-mechanical description of atoms:
$$\varepsilon(\omega) = \varepsilon_0 + \frac{nq'^2}{m} \sum_j \frac{f_j}{\left(\omega_j^2 - \omega^2\right) - 2i\omega\delta_j}, \quad \text{with } \sum_j f_j = 1; \qquad \mu(\omega) = \mu_0. \tag{10.124}$$
If the damping of the effective atomic oscillators is low, $\delta_j \ll \omega_j$, as it typically is, and the particle’s
speed u is much lower than the typical wave’s phase velocity v (and hence than c!), then for most
frequencies Eq. (115) gives
$$\kappa^2(\omega) \equiv \omega^2\left[\frac{1}{u^2} - \frac{1}{v^2(\omega)}\right] \approx \frac{\omega^2}{u^2}, \tag{10.125}$$
i.e. $\kappa \approx \kappa^* \approx \omega/u$ is virtually real. In this case, Eq. (123) may be reduced to Eq. (95) with
$$b_{max} = \frac{1.123\,u}{\omega_j}. \tag{10.126}$$
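The numerical factor 1.123 comes from the small-argument behavior of the modified Bessel function, $K_0(\xi) \approx \ln(2e^{-\gamma_E}/\xi)$ at $\xi \to 0$, where $\gamma_E \approx 0.5772$ is the Euler constant and $2e^{-\gamma_E} \approx 1.123$. The sketch below checks this with a plain-Python evaluation of the standard integral representation $K_0(x) = \int_0^\infty \exp\{-x\cosh t\}\,dt$; the function names and the integration limits are choices made only for this illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant


def bessel_k0(x, t_max=20.0, steps=20_000):
    """K_0(x) = int_0^inf exp(-x cosh t) dt, by the trapezoidal rule.
    With these default limits the result is adequate for x >~ 1e-5."""
    if x <= 0:
        raise ValueError("x must be positive")
    h = t_max / steps
    total = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(t_max)))
    for i in range(1, steps):
        total += math.exp(-x * math.cosh(i * h))
    return h * total


def k0_small_argument(x):
    """Leading small-x asymptote: K_0(x) ~ ln(2 exp(-gamma_E) / x)."""
    return math.log(2.0 * math.exp(-EULER_GAMMA) / x)


print(f"2*exp(-gamma_E) = {2.0 * math.exp(-EULER_GAMMA):.4f}")
for x in (0.1, 0.01, 0.001):
    print(f"x = {x}: K0 = {bessel_k0(x):.5f}, asymptote = {k0_small_argument(x):.5f}")
```

With $\kappa b_{min} \approx \omega b_{min}/u \ll 1$, the logarithm $\ln(1.123u/\omega b_{min})$ plays the role of $\ln(b_{max}/b_{min})$ in Eq. (95).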
The good news here is that both approaches (the microscopic analysis of Sec. 4 and the
macroscopic analysis of this section) give essentially the same result. The same fact may be also
perceived as bad news: the treatment of the medium as a continuum does not give any new results here.
The situation somewhat changes at relativistic velocities, at which such treatment provides noticeable
corrections (called density effects), in particular reducing the energy loss estimates.
Let me, however, leave these details for special topic courses and focus on a much more
important effect described by our formulas. Consider the dependence of the electric field components on
the impact parameter b, i.e. on the closest distance between the particle’s trajectory and the field
observation point. At $b \to \infty$, we can use, in Eqs. (116)-(117), the asymptotic formula (2.158),
$$K_n(\xi) \to \left(\frac{\pi}{2\xi}\right)^{1/2} e^{-\xi}, \quad \text{at } \xi \to \infty, \tag{10.127}$$
to conclude that if $\kappa^2 > 0$, i.e. if $\kappa$ is real, the complex amplitudes $\mathbf{E}_\omega$ of both components $E_x$ and $E_y$ of
the electric field decrease with b exponentially. However, let us consider what happens at frequencies
where $\kappa^2(\omega) < 0$, 48 i.e.
$$\varepsilon(\omega)\mu(\omega) \equiv \frac{1}{v^2(\omega)} > \frac{1}{u^2} > \frac{1}{c^2} \equiv \varepsilon_0\mu_0. \tag{10.128}$$
(This condition means that the particle’s velocity is larger than the phase velocity of the waves at this
particular frequency.) In this case, the parameter $\kappa(\omega)$ is purely imaginary, so the functions $\exp\{-\kappa b\}$ in
the asymptotes (127) of Eqs. (116)-(117) become just phase factors, and the field component amplitudes
fall very slowly:
$$\left|E_x(\omega)\right| \propto \left|E_y(\omega)\right| \propto \frac{1}{b^{1/2}}. \tag{10.129}$$
This means that the Poynting vector drops as 1/b, so its flux through a surface of a round cylinder of
radius b, with its axis on the particle trajectory (i.e. the power flow from the particle), does not depend
on b at all. This is an electromagnetic wave emission – the famous Cherenkov radiation.49
The direction n of its propagation may be readily found taking into account that at large
distances from the particle’s trajectory, the emitted wave has to be locally planar and transverse ($\mathbf{n} \perp \mathbf{E}$),
so the so-called Cherenkov angle $\theta$ between the vector n and the particle’s velocity u may be simply
found from the ratio of the electric field components – see Fig. 14a:
$$\tan\theta = -\frac{E_x}{E_y}. \tag{10.130}$$
Fig. 10.14. (a) The Cherenkov radiation’s propagation angle $\theta$, and (b) its interpretation.
The ratio on the right-hand side of this relation may be calculated by plugging the asymptotic
formula (127) into Eqs. (116) and (117) and calculating their ratio:
48 Strictly speaking, the inequality $\kappa^2(\omega) < 0$ does not make sense for a medium with a complex product $\varepsilon(\omega)\mu(\omega)$,
and hence complex $\kappa^2(\omega)$. However, in a typical medium where particles can propagate over substantial distances,
the imaginary part of the product $\varepsilon(\omega)\mu(\omega)$ does not vanish only in very limited frequency intervals, much
narrower than the intervals that we are discussing now – please have one more look at Fig. 7.5.
49 This radiation was observed experimentally by Pavel Alekseevich Cherenkov (in older Western texts,
“Čerenkov”) in 1934, with the observations explained by Ilya Mikhailovich Frank and Igor Yevgenyevich Tamm
in 1937. Note, however, that the effect had been predicted theoretically as early as 1889 by the same Oliver
Heaviside whose name was mentioned in this course so many times – and whose genius I believe is still
underappreciated.
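$$\tan\theta = -\frac{E_x}{E_y} = \frac{i\kappa u}{\omega} = \left[\varepsilon(\omega)\mu(\omega)u^2 - 1\right]^{1/2} = \left[\frac{u^2}{v^2(\omega)} - 1\right]^{1/2}, \tag{10.131a}$$
so
$$\cos\theta = \frac{v}{u} < 1. \tag{10.131b}$$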
Remarkably, this direction does not depend on the emission time $t_{ret}$, so the radiation of
frequency $\omega$, at each instant, forms a hollow cone led by the particle. This simple result allows an
evident interpretation (Fig. 14b): the cone’s interior is just the set of all observation points that have
already been reached by the radiation, propagating with the speed $v(\omega) < u$, emitted from all previous
points of the particle’s trajectory by the given time t. This phenomenon is an analog of the so-called
Mach cone in fluid dynamics,50 except that in the Cherenkov radiation, there is a separate cone for each
frequency (of the range in which $v(\omega) < u$): the smaller the $\varepsilon(\omega)\mu(\omega)$ product, i.e. the higher the
wave velocity $v(\omega) = 1/[\varepsilon(\omega)\mu(\omega)]^{1/2}$, the broader the cone, so the earlier the corresponding “shock
wave” arrives at an observer. Please note that the Cherenkov radiation is a unique radiative
phenomenon: it takes place even if a particle moves without acceleration, and (in agreement with our
analysis in Sec. 2) is impossible in free space, where $v(\omega) = c = \text{const}$ is larger than u for any particle.
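To get a feeling for the numbers behind Eq. (131b), here is a minimal sketch for a non-magnetic medium with a frequency-independent refraction index, so that $v = c/n$. The value $n = 1.33$ (water at optical frequencies) is an assumption of this example rather than a number from the text, and the function names are hypothetical.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.51099895   # m_e c^2 in MeV


def cherenkov_angle(beta, n_refr):
    """Angle theta from cos(theta) = v/u = 1/(n*beta), Eq. (131b).
    Returns None if the particle is not faster than the wave (u <= v)."""
    cos_theta = 1.0 / (n_refr * beta)
    if cos_theta >= 1.0:
        return None
    return math.acos(cos_theta)


def threshold_kinetic_energy(rest_energy, n_refr):
    """Kinetic energy at the threshold u = v, i.e. beta = 1/n:
    T = (gamma - 1) * (rest energy)."""
    beta = 1.0 / n_refr
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma - 1.0) * rest_energy


n_water = 1.33   # assumed refraction index
print(f"threshold for electrons: {threshold_kinetic_energy(ELECTRON_REST_ENERGY_MEV, n_water):.3f} MeV")
for beta in (0.7, 0.8, 0.9, 1.0):
    theta = cherenkov_angle(beta, n_water)
    label = "no radiation" if theta is None else f"{math.degrees(theta):.1f} deg"
    print(f"beta = {beta}: {label}")
```

In this model, the angle grows from zero at the threshold $\beta = 1/n$ to its largest value $\arccos(1/n)$ at $\beta \to 1$, which is the dependence used for speed measurement in Cherenkov detectors.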
The Cherenkov radiation’s intensity may be also readily found by plugging the asymptotic
expression (127), with imaginary $\kappa$, into Eq. (123). The result is
$$-\frac{d\mathscr{E}}{dx} = \frac{(Ze)^2}{4\pi} \int_{v(\omega) < u} \mu(\omega)\, \omega \left[1 - \frac{v^2(\omega)}{u^2}\right] d\omega. \tag{10.132}$$
For non-relativistic particles ($u \ll c$), the Cherenkov radiation condition $u > v(\omega)$ is fulfilled only in
relatively narrow frequency intervals where the product $\varepsilon(\omega)\mu(\omega)$ is very large (usually, due to optical
resonance peaks of the electric permittivity – see Fig. 7.5 and its discussion). In this case, the emitted
light consists of a few nearly-monochromatic components. On the contrary, if the condition $u > v(\omega)$, i.e.
$u^2\varepsilon(\omega)\mu(\omega) > 1$, is fulfilled in a broad frequency range, as it is for ultra-relativistic particles in
condensed media, then the radiated power, according to Eq. (132), is dominated by higher frequencies of
the range – hence the famous bluish color of the Cherenkov radiation glow from water-filled nuclear
reactors – see Fig. 15.
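The dominance of higher frequencies can be illustrated by integrating the intensity formula over a finite band. The sketch below assumes Eq. (132) in the standard Frank-Tamm form, $-d\mathscr{E}/dx = (q^2/4\pi)\int \mu(\omega)\,\omega\,[1 - v^2(\omega)/u^2]\,d\omega$, with $\mu = \mu_0$ and a frequency-independent refraction index, so that $v = c/n$; the value $n = 1.33$ and the wavelength bands are assumptions of this example, and the function name is hypothetical.

```python
import math

MU0 = 1.25663706212e-6       # magnetic constant, H/m
E_CHARGE = 1.602176634e-19   # elementary charge, C
C = 2.99792458e8             # speed of light, m/s


def band_loss(beta, n_refr, lam_short, lam_long, q=E_CHARGE):
    """Energy radiated per unit length (J/m) into the band between the vacuum
    wavelengths lam_short < lam_long (m), for constant n and mu = mu0:
    (mu0 q^2 / 4 pi) * [1 - 1/(n beta)^2] * (w_hi^2 - w_lo^2) / 2."""
    if n_refr * beta <= 1.0:
        return 0.0   # below the Cherenkov threshold u > v
    w_hi = 2 * math.pi * C / lam_short
    w_lo = 2 * math.pi * C / lam_long
    factor = 1.0 - 1.0 / (n_refr * beta) ** 2
    return MU0 * q**2 / (4 * math.pi) * factor * (w_hi**2 - w_lo**2) / 2


n_water = 1.33   # assumed refraction index
visible = band_loss(1.0, n_water, 400e-9, 700e-9)
blue = band_loss(1.0, n_water, 400e-9, 500e-9)
red = band_loss(1.0, n_water, 600e-9, 700e-9)
print(f"visible band: {visible / E_CHARGE / 100:.0f} eV per cm of path")
print(f"blue/red band ratio: {blue / red:.2f}")
```

Because the integrand grows with $\omega$, a 100-nm-wide band at the blue end of the spectrum carries several times more energy than a band of the same width at the red end.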
The Cherenkov radiation is broadly used in high-energy experiments for particle identification
and speed measurement (since it is easy to pass the particles through layers of different densities and
hence with different dielectric constants) – for example, in the so-called Ring Imaging Cherenkov
(RICH) detectors that have been designed for the DELPHI experiment51 at the Large Electron-Positron
Collider (LEP) in CERN.
A little bit counter-intuitively, the formalism described in this section is also very useful for the
description of an apparently rather different effect – the so-called transition radiation that takes place
when a charged particle crosses a border between two media.52 The effect may be interpreted as the
result of the time dependence of the electric dipole formed by the moving charge q and its mirror image
q’ in the counterpart medium – see Fig. 16.
In the non-relativistic limit, this effect allows a straightforward description combining the
electrostatics picture of Sec. 3.4 (see Fig. 3.9 and its discussion), and Eq. (8.27), corrected for the media
polarization effects. However, if the particle’s velocity u is comparable with the phase velocity of waves
in either medium, the adequate theory of the transition radiation becomes very close to that of the
Cherenkov radiation.
In comparison with the Cherenkov radiation, the transition radiation is rather weak, and its
practical use (mostly for the measurement of the Lorentz factor $\gamma$, to which the radiation intensity is
nearly proportional) requires multi-layered stacks.53 In these systems, the radiation emitted at sequential
borders may be coherent, and the system’s physics may become close to that of the free-electron lasers
mentioned in Sec. 4.
the back effects of the radiation implicitly, via the energy conservation arguments. However, even in
these cases, the near-field effects, such as the first term in Eq. (19), which affect the moving particle
most, have been ignored.
At the same time, it is clear that in sharp contrast with electrostatics, the interaction of a moving
point charge with its own field cannot be always ignored. As the simplest example, if an electron is
made to fly through a resonant cavity, thus inducing electromagnetic oscillations in it, and then is forced
(say, by an appropriate static field) to return into the cavity before the oscillations have decayed, its
motion will certainly be affected by the oscillating fields, just as if they had been induced by another
source. There is no conceptual problem with applying the Maxwell theory to such “field-particle
rendezvous” effects; moreover, it is the basis of the engineering design of such vacuum electron devices
as klystrons, magnetrons, and free-electron lasers.
A problem arises only when no clear “rendezvous” points are enforced by boundary conditions,
so the most important self-field effects are at $R \equiv |\mathbf{r} - \mathbf{r}'| \to 0$, the most evident example being the
charged particle’s radiation into free space, described earlier in this chapter. We already know that such
radiation takes away a part of the charge’s kinetic energy, i.e. has to cause its deceleration. One should
wonder, however, whether such self-action effects might be described in a more direct, non-perturbative
way.
As the first attempt, let us try a phenomenological approach based on the already derived
formulas for the radiation power P. For the sake of simplicity, let us consider a non-relativistic point
charge q in free space, so P is described by Eq. (8.27), with the electric dipole moment’s derivative over
time equal to qu:
$$P = \frac{Z_0q^2}{6\pi c^2}\, \dot{u}^2 \equiv \frac{2}{3c^3}\, \frac{q^2}{4\pi\varepsilon_0}\, \dot{u}^2. \tag{10.133}$$
The most naïve approach would be to write the equation of the particle’s motion in the form
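$$m\dot{\mathbf{u}} = \mathbf{F}_{ext} + \mathbf{F}_{self}, \tag{10.134}$$
and try to calculate the radiation back-action force $\mathbf{F}_{self}$ by requiring its instant power, $-\mathbf{F}_{self}\cdot\mathbf{u}$, to be
equal to P. However, this approach (say, for a 1D motion) would give a very unnatural result,
$$F_{self} \propto \frac{\dot{u}^2}{u}, \tag{10.135}$$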
that might diverge at some points of the particle’s trajectory. This failure is clearly due to the retardation
effect: as the reader may recall, Eq. (133) results from the analysis of radiation fields in the far-field
zone, i.e. at large distances R from the particle, e.g., from the second term in Eq. (19), i.e. when the
non-radiative first term (which is much larger at small distances, $R \to 0$) is ignored.
Before exploring the effects of this term, let us, however, make one more attempt at Eq. (133),
considering its average effect on some periodic motion of the particle. (A possible argument for this
step is that at the periodic motion, the retardation effects should be averaged out – just as at the transfer
from Eq. (8.27) to Eq. (8.28).) To calculate the average, let us write the identity
$$\overline{\dot{u}^2} = \frac{1}{T} \int_0^T \dot{\mathbf{u}}\cdot\dot{\mathbf{u}}\, dt, \tag{10.136}$$
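and carry out the integration on the right-hand side of Eq. (133) by parts over the motion period T:
$$\overline{P} = \frac{2}{3c^3}\, \frac{q^2}{4\pi\varepsilon_0}\, \overline{\dot{u}^2} = \frac{2}{3c^3}\, \frac{q^2}{4\pi\varepsilon_0}\, \frac{1}{T} \left[\dot{\mathbf{u}}\cdot\mathbf{u}\,\Big|_0^T - \int_0^T \ddot{\mathbf{u}}\cdot\mathbf{u}\, dt\right] = -\frac{1}{T} \int_0^T \frac{2}{3c^3}\, \frac{q^2}{4\pi\varepsilon_0}\, \ddot{\mathbf{u}}\cdot\mathbf{u}\, dt. \tag{10.137}$$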
Looking for the solution of this linear differential equation in the usual exponential form, $x(t) \propto \exp\{\lambda t\}$,
we get the following characteristic equation,
$$\lambda^2 + \omega_0^2 = \tau\lambda^3. \tag{10.143}$$
It may look like for any “reasonable” value of $\omega_0 \ll 1/\tau \sim 10^{23}$ s$^{-1}$, the right-hand side of this
nonlinear algebraic equation may be treated as a perturbation. Indeed, looking for its solutions in the
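54 Just for the reader’s reference, this formula may be readily generalized to the relativistic case, in the 4-form:
$$F^\alpha_{self} = \frac{2}{3mc^3}\, \frac{q^2}{4\pi\varepsilon_0} \left[\frac{d^2p^\alpha}{d\tau^2} + \frac{p^\alpha}{(mc)^2}\, \frac{dp_\beta}{d\tau}\, \frac{dp^\beta}{d\tau}\right],$$
the so-called Abraham-Lorentz-Dirac force.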
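natural form $\lambda = \pm i\omega_0 + \lambda'$, with $|\lambda'| \ll \omega_0$, expanding both parts of Eq. (143) in the Taylor series in
the small parameter $\lambda'$, and keeping only the terms linear in $\lambda'$, we get
$$\lambda' \approx -\frac{\omega_0^2\tau}{2}. \tag{10.144}$$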
This means that the energy of free oscillations decreases in time as $\exp\{2\lambda' t\} = \exp\{-\omega_0^2\tau t\}$; this is
exactly the radiative damping analyzed earlier. However, Eq. (143) is deceiving; it has the third root
corresponding to unphysical, exponentially growing (so-called run-away) solutions. It is easiest to see
this for a free particle, with $\omega_0 = 0$. Then Eq. (143) becomes very simple,
$$\lambda^2 = \tau\lambda^3, \tag{10.145}$$
and it is easy to find all its three roots explicitly: $\lambda_1 = \lambda_2 = 0$ and $\lambda_3 = 1/\tau$. While the first two roots
correspond to the values found earlier, the last one describes an exponential (and extremely rapid!)
acceleration.
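Both kinds of roots may be seen in a small numerical experiment. In the dimensionless variables $s \equiv \lambda\tau$ and $\epsilon \equiv \omega_0\tau$ (a notation introduced only for this sketch), Eq. (143) reads $s^3 - s^2 - \epsilon^2 = 0$. The code below refines the perturbative guesses by Newton iterations; the function name and the sample value of $\epsilon$ are hypothetical choices.

```python
def newton_root(eps, s0, iterations=50):
    """Newton iterations for f(s) = s^3 - s^2 - eps^2 = 0, i.e. Eq. (143)
    written via s = lambda*tau and eps = omega_0*tau."""
    s = complex(s0)
    for _ in range(iterations):
        f = s**3 - s**2 - eps**2
        df = 3 * s**2 - 2 * s
        s -= f / df
    return s


eps = 1e-3   # sample value of omega_0*tau, much less than 1
damped = newton_root(eps, 1j * eps)   # start from lambda = +i*omega_0
runaway = newton_root(eps, 1.0)       # start from lambda = 1/tau

print(f"damped root:  Re s = {damped.real:.6e}  (compare -eps^2/2 = {-eps**2 / 2:.6e})")
print(f"              Im s = {damped.imag:.6e}  (compare eps = {eps:.6e})")
print(f"runaway root: s = {runaway.real:.9f}")
```

The first root reproduces the decrement (144), while the second one stays close to $\lambda = 1/\tau$ for any small $\omega_0\tau$, so the run-away solution is not an artifact of the free-particle limit.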
In order to remove this artifact, let us try to develop a self-consistent approach to the back-action
effects, taking into account the near-field terms of particle fields. For that, we need to somehow
overcome the divergence of Eqs. (10) and (19) at $R \to 0$. The most reasonable way to do this is to spread
the particle’s charge over a ball of radius a, with a spherically symmetric (but not necessarily constant)
density $\rho(r)$, and at the end of the calculations trace the limit $a \to 0$.55 Again sticking to the non-
relativistic case (so the magnetic component of the Lorentz force is not important), we should calculate
$$\mathbf{F}_{self} = \int_V \rho(r)\, \mathbf{E}(\mathbf{r}, t)\, d^3r, \tag{10.146}$$
where the electric field is that of the charge itself, with the field of any elementary charge $dq = \rho(r)d^3r$
described by Eq. (19).
To enable an analytical calculation of the force, we need to make the assumption a << rc, treat
the ratio R/rc ~ a/rc as a small parameter, and expand the resulting right-hand side of Eq. (146) into the
Taylor series in small R. This procedure yields
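$$\mathbf{F}_{self} = -\frac{2}{3}\, \frac{1}{4\pi\varepsilon_0} \sum_{n=0}^\infty \frac{(-1)^n}{c^{n+2}\, n!}\, \frac{d^{n+1}\mathbf{u}}{dt^{n+1}} \int_V d^3r \int_V d^3r'\, \rho(r)\, R^{n-1}\, \rho(r'). \tag{10.147}$$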
55Note: this operation cannot be interpreted as describing a quantum spread due to the finite extent of the point
particle’s wavefunction. In quantum mechanics, different parts of the wavefunction of the same charged particle
do not interact with each other!
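$$\mathbf{F}_0 = -\frac{2}{3}\, \frac{1}{4\pi\varepsilon_0}\, \frac{\dot{\mathbf{u}}}{c^2} \int_V d^3r \int_V d^3r'\, \frac{\rho(r)\rho(r')}{R} \equiv -\frac{4}{3}\, \frac{\dot{\mathbf{u}}}{c^2}\, \frac{1}{4\pi\varepsilon_0}\, \frac{1}{2} \int_V d^3r \int_V d^3r'\, \frac{\rho(r)\rho(r')}{R} = -\frac{4}{3c^2}\, \dot{\mathbf{u}}\, U, \tag{10.149}$$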
where U is the electrostatic energy (1.59) of the static charge’s self-interaction. This term may be
interpreted as the inertial “force” 56 $(-m_{ef}\mathbf{a})$ with the following effective electromagnetic mass:
$$m_{ef} = \frac{4}{3}\, \frac{U}{c^2}, \tag{10.150}$$
which is a factor of 4/3 larger than it should be according to Einstein’s formula (9.73). This is the
famous (or rather infamous :-) 4/3 problem that does not allow one to interpret the electron’s mass as
that of its electric field. Some (admittedly, rather formal) resolution of this paradox is possible only in
quantum electrodynamics with its renormalization techniques – beyond the framework of this course.
Note that all these issues are only important for motions with frequencies of the order of $1/\tau \sim 10^{23}$ s$^{-1}$,
i.e. at energies as high as $\sim\hbar/\tau \sim 10^8$ eV, while other quantum electrodynamics effects may be
observed at much lower frequencies, starting from $\sim 10^{10}$ s$^{-1}$. Hence the 4/3 problem is by no means the
only or the most significant motivation for the transfer from classical to quantum electrodynamics.
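These orders of magnitude are easy to verify for an electron. The sketch below assumes that the time constant is defined as $\tau = (2/3c^3)(q^2/4\pi\varepsilon_0)/m$, the combination that turns the radiation power (133) into the decrement (144); the constant values are typed in by hand, and the function name is hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19      # elementary charge, C
EPS0 = 8.8541878128e-12         # electric constant, F/m
M_ELECTRON = 9.1093837015e-31   # electron rest mass, kg
C = 2.99792458e8                # speed of light, m/s
HBAR = 1.054571817e-34          # reduced Planck constant, J*s


def radiation_time_constant(q, m):
    """tau = (2/3) * [q^2/(4 pi eps0)] / (m c^3); assumed definition, see the lead-in."""
    return (2.0 / 3.0) * q**2 / (4 * math.pi * EPS0) / (m * C**3)


tau = radiation_time_constant(E_CHARGE, M_ELECTRON)
print(f"tau = {tau:.3e} s, 1/tau = {1 / tau:.2e} 1/s")
print(f"hbar/tau = {HBAR / tau / E_CHARGE:.2e} eV")
```

For the electron this gives $\tau$ of a few $10^{-24}$ s, i.e. $1/\tau \sim 10^{23}$ s$^{-1}$ and $\hbar/\tau \sim 10^8$ eV, in agreement with the estimates above.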
However, the reader should not think that their time spent on this course has been lost: quantum
electrodynamics is heavily based on classical electrodynamics, incorporates virtually all its results, and
the basic transition between them is surprisingly straightforward.57 So, I look forward to welcoming the
reader to the next, quantum-mechanics part of this series.
10.1. Derive Eqs. (10) from Eqs. (1) by a direct (but careful!) integration.
10.2. Derive the radiation-related parts of Eqs. (19)-(20) from the Liénard-Wiechert potentials
(10) by direct differentiation.
10.4. Express the instantaneous power of electromagnetic radiation by a relativistic particle with
electric charge q and rest mass m, moving with velocity u, via the Lorentz force F providing its
acceleration.
10.5. A relativistic particle with rest mass m and electric charge q, initially at rest, is accelerated
by a constant force F until it reaches a certain velocity u and then is left to move by inertia. Calculate
the total energy radiated during the acceleration.
10.6. A charged relativistic particle with an initial momentum p0 flies ballistically from a free-
space region into a region of a constant, uniform electric field E, whose force is directed opposite to p0.
Calculate the energy radiated by the particle during its motion in the field, assuming that it is small in
comparison with the particle’s initial kinetic energy.
10.7. Calculate
(i) the instantaneous power, and
(ii)* the power spectrum
of the radiation emitted, into a unit solid angle, by a relativistic particle with charge q, performing 1D
harmonic oscillations with frequency $\omega_0$ and displacement amplitude a.
10.8. Calculate and analyze the time dependence of the energy of a charged relativistic particle
rotating in a constant and uniform magnetic field B and, as a result, emitting the synchrotron radiation.
Qualitatively, what is the particle’s trajectory?
Hint: You may assume that the energy loss is relatively slow ($-d\mathscr{E}/dt \ll \omega_c\mathscr{E}$), but should spell
out the condition of validity of this assumption.
10.9. Analyze the polarization of the synchrotron radiation propagating within the particle’s
rotation plane.
10.10. Analyze the polarization and the spectral contents of the synchrotron radiation
propagating in the direction normal to the particle’s rotation plane. How do the results change if not one,
but N > 1 similar particles move around the circle, at equal angular distances?
10.11.* The basic quantum theory of radiation shows that the electric dipole radiation by a
particle is allowed only if the change of its angular momentum’s magnitude L at the transition is of the
order of Planck’s constant $\hbar$.
(i) Estimate the change of L of an ultra-relativistic particle due to its emission of a typical single
photon of the synchrotron radiation.
(ii) Do you think quantum mechanics forbids such radiation? If not, why?
10.12. A relativistic particle moves along the z-axis, with velocity uz, through an undulator – a
system of permanent magnets providing (in the simplest model) a perpendicular magnetic field, whose
distribution near the axis is sinusoidal:58
$$\mathbf{B} = \mathbf{n}_y B_0 \cos k_0z.$$
Assuming that the field is so weak that it causes negligible deviations of the particle’s trajectory from
the straight line, calculate the angular distribution of the resulting radiation. What condition does the
above assumption impose on the system’s parameters?
58 As the Maxwell equation for $\nabla\times\mathbf{H}$ shows, this field distribution cannot be created in any non-zero volume of
free space. However, it may be created on a line – e.g., on the particle’s trajectory.
10.13. Discuss possible effects of the interference of the undulator radiation from different
periods of its static field distribution. In particular, calculate the angular positions of the power density
maxima.
10.14. An electron launched directly toward a plane surface of a perfect conductor is instantly
absorbed by it at the impact. Calculate the angular distribution and the frequency spectrum of the
electromagnetic waves radiated at this event, provided that the initial kinetic energy T of the particle is
much larger than the conductor’s workfunction .59 Is your result valid near the conductor’s surface?
10.15. A relativistic particle, with a rest mass m and an electric charge q, flies ballistically, with
velocity u, by an immobile point charge q’, with an impact parameter b so large that the deviations of its
trajectory from the straight line are negligible. Calculate the total energy loss due to the electromagnetic
radiation during the passage. Quantify the conditions of validity of your result.