High-Energy Astrophysics - A Primer
Undergraduate Lecture Notes in Physics
Series Editors
Neil Ashby, University of Colorado, Boulder, CO, USA
William Brantley, Department of Physics, Furman University, Greenville, SC, USA
Michael Fowler, Department of Physics, University of Virginia, Charlottesville,
VA, USA
Morten Hjorth-Jensen, Department of Physics, University of Oslo, Oslo, Norway
Michael Inglis, Department of Physical Sciences, SUNY Suffolk County
Community College, Selden, NY, USA
Barry Luokkala, Department of Physics, Carnegie Mellon University, Pittsburgh,
PA, USA
Undergraduate Lecture Notes in Physics (ULNP) publishes authoritative texts
covering topics throughout pure and applied physics. Each title in the series is
suitable as a basis for undergraduate instruction, typically containing practice
problems, worked examples, chapter summaries, and suggestions for further reading.
ULNP titles must provide at least one of the following:
• An exceptionally clear and concise treatment of a standard undergraduate
subject.
• A solid undergraduate-level introduction to a graduate, advanced, or
non-standard subject.
• A novel perspective or an unusual approach to teaching a subject.
High-Energy Astrophysics
A Primer
Jorge Ernesto Horvath
IAG-USP, Astronomy Department
Universidade de São Paulo
São Paulo SP, Brazil
Translation from the Portuguese language edition: Astrofísica de Altas Energias: Uma Première by Jorge
Ernesto Horvath, © EDUSP 2020. Published by Editora da Universidade de São Paulo, Brazil. All Rights
Reserved.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or
information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
For the three sheep
Acknowledgements
My thanks to all the colleagues and students who have taught me physics
and astrophysics over all these years, including Omar Benvenuto, José Antonio
de Freitas Pacheco, Héctor Vucetich, Germán Lugones, Heman Mosquera Cuesta,
Bete Dal Pino, Marcelo Allen, César Zen Vasconcellos, Adam Burrows, Márcio
Catelan, Ignazio Bombaci, Rachid Ouyed, Ren-Xin Xu, Marco Limongi, S. O.
Kepler, Mariano Méndez, F. C. Michel, Todd Thompson, Manuel Malheiro, Sérgio
Barbosa Duarte, Genna Bisnovatyi-Kogan, Roberto Dell’Aglio Costa, Marcos Díaz,
María Alejandra De Vito, Rodolfo Valentim, David Blaschke, Mark Alford, Paulo
Sérgio Custódio, Dinah Moreira Allen, Thais Idiart, Gustavo Medina Tanco, Laura
Paulucci, Márcio de Avellar, Rodrigo de Souza, Eduardo Janot Pacheco, Aurora
Pérez Martínez, Daryel Manreza, Efrain Ferrer, Vivian de la Incera, J. C. N de Araújo,
H. Stöcker, and W. Maciel among many others. I would also like to thank Antonio
Lucas Bernardo and Lívia Silva Rocha, who served as teaching assistants for the AGA
315 High-Energy Astrophysics course and brought much to its success. Finally, the
students of AGA 315 contributed to the development of this text in its final form.
The Dean of Undergraduate Studies at USP and EDUSP are acknowledged for their
patronage and production of the Portuguese version of this work. My daughter Katia
Horvath is acknowledged for her dedication in correcting the uncountable number of
typos and mistakes in the text, while those remaining are my responsibility entirely.
The author also wishes to acknowledge the attention and assistance of Dr. Angela
Lahee, Stephen Lyle, and the production staff at Springer for the completion of the
present edition.
And yet, I still dream the dogmatic slumber of Immanuel Kant.
Chapter 1
The Nature of the Physical World:
Elementary Particles and Interactions
The idea that Nature is composed of discrete “packets” which combine to form the
entire visible Universe originated in the classical Greek world. This idea of the ele-
mentary “granularity” of the physical world was implicit in Pythagoras’ philosophy
and his school in Crotone (now in Italy and then a part of the Magna Graecia) more
than five centuries before the Christian era. The Pythagoreans attached great importance to the discovery that the tones of a vibrating string obey simple whole-number ratios, and to similar findings, thus foreseeing that the world was discrete; they also formulated the powerful notion that reality is ultimately mathematical in nature. The modern version of these elementary blocks (particles) and their interactions is presented in this Chapter.
Much more forcefully (although motivated by a logical solution to the problem of illusory motion posed by Parmenides, and not by any experimental evidence), the atomists Leucippus and Democritus formulated a theory regarding the nature of matter, in which discrete units (ἄτομος, atom) moved in the absence of matter (vacuum), combining to produce the entire visible Universe. Atoms would
differentiate themselves by their geometry (such as the difference between the figures
“A” and “N”), by their disposition or order (such as the differences between “NA”
or “AN”), and by their position (as “N” is a rotated “Z”). Different combina-
tions and proportions would be responsible for the diversity of bodies. This strongly
materialistic doctrine (for example, for the atomists even the soul was made up of
atoms) has never been fully accepted in general, and Aristotle and later philoso-
phers raised objections against the atomic idea, which was almost totally forgotten.
However, for many centuries this whole discussion did not go beyond the realm of ideas, since neither the technological development nor the methodological attitude of the Greeks led to any direct interrogation of Nature, and indeed the notion of experimental proof did not appear at all in the ancient world until at least the Late Middle Ages.
In spite of being far from accepted at the time of Leucippus and Democritus, versions of the atomistic worldview survived through Lucretius, whose exposition of the Epicurean version of the atomistic doctrine was quite influential. His great work De
Rerum Natura [1] even contains a hint of the idea of inertia applied to the motion of
atoms, among other insights. However, the Aristotelian view of the world dominated
the scene for almost two millennia, and little or no room was left for atomism. A first
explicit break with Aristotle was due to the French clergyman Pierre Gassendi in the
mid-17th century, resuming in many senses matters that had been addressed by the
atomistic program [2]. Newton and others were able to formulate several ideas about
the physical world that were reminiscent of the Greek Epicurean atomism.
Finally, around the turn of the 19th century, atomic theory made a significant comeback, sup-
ported by the empirical work of John Dalton (1766–1844) and his followers. Dalton
realized that the known chemical reactions were compatible with discrete packages
and redefined the Greek concept of the atom to describe them, whereupon atoms began to
have a tangible physical reality. Besides being firmly based on laws of conserva-
tion (for example, of mass), Dalton formulated the idea that a chemical reaction
is basically a rearrangement of atoms. The complex substances are, in this vision,
composed of atoms, in a very close parallel to the ideas of the Greek atomists.
After a long debate and a lot of experimentation and argument, the “modern” atomic
theory finally constituted a physical–chemical paradigm, anchored in the “new atom-
ism” of Newton, Boltzmann, and others, for which Dalton’s insistence and empirical
evidence-gathering were truly fundamental.
Nevertheless, it took almost another century for atoms to begin to show their true nature. In fact, in the original version, and even by the etymology itself, atoms remained indivisible. The first important experimental milestone, building on the work of Faraday and others on electromagnetic theory and on the ideas of Boltzmann and Gibbs, who sought to ground Thermodynamics in microphysics, came with the discovery of the electron by J.J. Thomson in 1897. Shortly
afterwards, Rutherford conducted a series of experiments that revealed the existence
of the atomic nucleus; and in the early 20th century, the discovery and interpretation
of radioactivity in terms of the structure of the nucleus led to the identification of the
proton and neutron as the basic ingredients of Rutherford’s nucleus.
With these discoveries began an era of characterization of the particles that con-
stitute atoms (that is, there was a change to a deeper level of elementarity) and
construction of detailed atomic models. In fact, one of the first models (the Thomson model) postulated a positively charged (continuous) fluid in which the electrons were embedded, that is, the amount of charge assigned to each component was not defined. It was up to Robert Millikan to contribute to the problem shortly afterwards
with his experiments, which demonstrated the discrete character (quantization) of
the electric charge of the electron.
From both the theoretical and experimental points of view, the concept of ele-
mentarity can be considered in a relative sense: a particle can be seen as elementary
(without internal structure) at low energies, but can reveal itself to be composed of smaller constituents when probed at higher energies (Fig. 1.1).
Fig. 1.1 The structure of matter. As higher energies are attained, matter reveals its most elementary
components at the smallest scales. So far there is no evidence of any substructure for quarks or
electrons, and nothing has yet been detected even at distances ∼10⁻¹⁵ cm, although some theoretical
proposals have been formulated, such as those indicated in the figure with a question mark
In Quantum Mechanics, the position and momentum of a particle cannot be determined simultaneously with arbitrary precision; their uncertainties satisfy

Δx × Δp ≥ ℏ , (1.1)

which are known as the uncertainty relations (although a more exact translation of the original German word used by Heisenberg would be "indeterminability"). Similarly, the uncertainty in the energy E measured during a time interval Δt should satisfy

ΔE × Δt ≥ ℏ , (1.2)
although this last inequality has roots in Classical Physics, since it does not stem
from the algebra of operators like (1.1).
Using such uncertainty relations, we are now in a position to elaborate some of
the basic ideas that we will use later in the course. The first concerns the possibility
of violating energy conservation in a process, although only for a very short time.
In fact, in the quantum vacuum, (1.2) allows particle–antiparticle pairs to “pop up”
spontaneously and quickly annihilate again (Fig. 1.2), as long as the time elapsed is
short enough. This phenomenon is known as vacuum fluctuation, and is unique to
quantum theory. The particles involved are said to be virtual particles.
A particle of this type with energy E and mass m can be exchanged between two "real" particles up to distances L ≈ cΔt (the speed of light c being the maximum speed at which the interaction can propagate). Using (1.2), this gives

L ≈ cΔt = ℏ/mc , (1.3)

where we have used the famous relation E = mc². The quantity λC = ℏ/mc with
length dimensions on the right-hand side is called the Compton wavelength, and
allows a simple interpretation of the physical situation: the intermediate particle of
mass m is located in a region of linear dimension roughly the order of λC where the
probability of finding it is not zero. Thus, the Compton wavelength defines a range
for the interaction mediated by a particle of this mass. When we consider particles of
higher mass, λC decreases and the interaction has a shorter range. A classical analogy
for this behavior is presented in Fig. 1.3.
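As a rough numerical check of this argument, the following sketch (in Python, using the standard value ℏc ≈ 197.3 MeV fm; the round mediator masses are illustrative inputs) evaluates the range λC = ℏ/mc for the pion, Yukawa's original mediator of nuclear forces, and for the much heavier Z0 boson:

HBAR_C = 197.327e-13  # MeV * cm  (hbar * c)

def interaction_range_cm(rest_energy_mev):
    # Compton wavelength lambda_C = hbar / (m c) = (hbar c) / (m c^2)
    return HBAR_C / rest_energy_mev

# Rounded rest energies, used here only for illustration
for name, m in [("pion (~140 MeV)", 140.0), ("Z0 (~91 GeV)", 91.0e3)]:
    print(f"{name}: range ~ {interaction_range_cm(m):.1e} cm")
# pion -> ~1.4e-13 cm, the nuclear scale (Yukawa's estimate)
# Z0   -> ~2.2e-16 cm, far below the size of a nucleon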
Although the detailed calculation of each elementary process is complicated,
requiring knowledge of the quantum theory of fields, some order-of-magnitude esti-
mates can be obtained using simple dimensional analysis. The modern view of funda-
mental interactions is that these are essentially virtual intermediate particle exchanges
between two or more particles that carry some form of charge (for example, the elec-
tron with an electric charge).
The long range of the electromagnetic interaction suggests that the mediating
particle has no mass, and that it is the quantum of “light”, i.e., the photon. The charge
associated with the interaction in this case is just the electric charge, and the strength
of the interaction is measured in terms of some dimensionless number, which in the
case of Electromagnetism is the fine structure constant α = e²/ℏc ≈ 1/137. This coupling
constant is small and indicates that the electromagnetic interaction is much weaker
than the strong interactions that bind the nucleus, for which the coupling constant
has the value ≈1 in appropriate units. As we have seen, besides the exchange of
the mediating particle, fluctuations can occur thanks to the uncertainty relations,
in which a particle–antiparticle pair or more complex entities can appear and get
reabsorbed in very short times. Representing the electrons (or any other charged
particle) by a straight line and photons by wavy lines, we can draw “hieroglyphs”
that represent what happens diagrammatically in each possible case (Fig. 1.4). These
symbols correspond to definite mathematical expressions in each case, but we will
Fig. 1.4 Basic electromagnetic interaction between two electrons as a sum of quantum processes
which are progressively less important
not discuss them here because we are only interested in the conceptual aspect of
these calculations.
The result is an infinite sum of such diagrams that takes into account all the
complexity of the micro-world, and from which the known classical limit emerges
under certain conditions. In Fig. 1.4 the diagram on the left should be considered as
including all the interactions plotted on the right, and is called the “dressed” diagram.
On the right, we see that the first and simplest diagram does not contain vacuum
contributions; these appear only in the next order in powers of the coupling constant
α. Thus, the classical theory of the electromagnetic field due to Faraday and Maxwell
corresponds to the first (classical) diagram, referred to in the jargon as the tree level
(no fluctuations included). The quantum corrections appear at order α and higher
powers of α, whence the importance of having α < 1: although the series cannot be
summed in general (in fact, it is not a convergent series in the mathematical sense, but
is nevertheless increasingly accurate as more terms are added!), the terms are ever
smaller as the power of α increases, so the result can be found as accurately as one
requires by calculating the contributions up to a given order. Taking the static limit of
the series and staying only with the tree level, we recover the classical Coulomb and
Yukawa potentials, viz., VC(r) ∝ −α/r and VYuk(r) ∝ −α exp(−r/L)/r, respectively, in
the case where the mediator has zero mass or mass m [see (1.3)].
The construction of a coherent picture of the physical Universe needed the recogni-
tion of the existence of four elementary interactions: the electromagnetic interaction
already mentioned, gravitation (still without a proper quantum theory), the strong
interaction (responsible for nuclear binding), and the weak interaction (responsible
for beta decay and other processes involving neutrinos). The same particle can have
more than one “charge” and thus suffer several of these interactions. This is the case,
for example, for the quarks inside a proton, which have electric charge, but also weak
charge and strong charge (called “color”). Thus, they are capable of interacting by
exchanging photons, gluons, and massive bosons. A summary of the fundamental
interactions and their main characteristics is shown in Table 1.1.
Obviously, it is not always the best strategy to try to understand a certain physical
problem directly in terms of the four fundamental interactions, since the problem can
sometimes get very complicated. Thus, we are again led to consider the absolute or
relative concept of elementarity, depending on the physical conditions involved and
the need for detail in the description, as already pointed out. As a concrete example of
this situation, the strong interactions (those that hold nuclei together) can be described
as mediated by pions and mesons, as in the 1950s and 60s, as long as the energy
considered is low (these are considered “effective” descriptions in an ample sense).
If we increase the energy, the protons and neutrons and also the mediating particles
will reveal their composite nature and that simple description may be insufficient.
Note once again that this is not a statement about the absolute elementarity of the
electron, quarks, or other particles, but rather one about the relative elementarity for
practical purposes.
Based on the previous ideas we can now discuss the known “zoo” of particles, which
make up the so-called Standard Model of particle Physics, and classify the elementary
interactions. The world of subatomic particles has expanded vertiginously since the
early 20th century. In the first decades of that century, only the electron and the
proton were known. The discovery of the neutron and soon after antiparticles brought
considerable perplexity and great challenges for physicists studying the structure of
matter. The first particle accelerators idealized and built by E. Lawrence in the United
States gave a further stimulus to the Physics of elementary particles, boosting the
discovery of new particles by increasing the energies involved in collisions.
In addition to modelling the dynamics of interactions between these particles, a
classification scheme was also required. After several attempts of historical interest,
but whose complexity would take us too far from the scope of the present book, there is
a consensus today around the scheme that became known as the Standard Model. This
classifies elementary particles into three groups or generations, depending on their
participation in the elementary processes that have been detected, i.e., the reactions in
which the particles take part. The composition of a generation is always the same: it
contains two quarks (which constitute the baryons and mesons), a charged lepton (the
electron, the muon, and the tau, successively), and a neutrino associated with the latter
(a different neutrino for each type of lepton). The discovery and identification of these
particles, and the recognition of the symmetries implemented over the generations,
took several decades and was only completed with the discovery of the quark t in 1995
and the Higgs boson (responsible for the observed masses) in 2012. At the present
time there is no evidence to indicate any important departures from the Standard
Fig. 1.5 Contents of the Standard Model. The figure shows the three generations with the two
quarks (up and down in the first, charm and strange in the second, and top and bottom in the third,
all fantasy names devised to make them easy to remember). Also shown are the charged leptons e,
μ, and τ and the three corresponding neutrinos νe , νμ , and ντ . Electric charges and masses of the
particles are indicated above and below each particle. See [5] for more detail
Model data. Figure 1.5 illustrates our current knowledge of the particle structure of
the Standard Model.
In the early twentieth century, the recognition of the need for a new force to hold the
atomic nucleus together led to the introduction of the Yukawa potential, as already
mentioned, and to the prediction of the existence of the pion, discovered soon after as
a component of cosmic rays and in dedicated experiments carried out by the physicist
César Lattes and collaborators at the University of São Paulo in 1947 (see his account
in [6]). This exemplifies the idea that interactions are the result of the exchange of
mediating particles. Later on in nuclear Physics it became clear that the pion was
only one such mediating particle. The interactions between nucleons (protons and
neutrons) also involve the exchange of kaons, ρ mesons, and other mediators, giving
rise in the static limit to the so-called Yukawa potential and corrections presented
above.
For some decades this picture was satisfactory, but accelerator experiments even-
tually showed that protons and neutrons were far from being pointlike: incident
electrons striking these nucleons scattered as if they encountered "hard" points on scales ≤10⁻¹⁴ cm. Thus, Gell-Mann and Zweig were led to suggest that there are fundamental constituents of the nucleons, which they called quarks. A highly non-linear
field theory called quantum chromodynamics (QCD) was soon developed, in which
quarks exchange gluons, the mediating particle of strong interactions. The associated
charge comes in three types and was fancifully called color (although it has nothing
to do with real colors, of course). However, this theory has a characteristic that really
sets it apart: despite intensive searches it has never been possible to detect an isolated
quark outside a hadron (hadrons are particles participating in strong interactions, i.e.,
Fig. 1.6 Color confinement. A meson formed by a quark Q and an antiquark Q̄ receives enough
energy to break up, but the gluon “string” uses this energy to create a quark q and an antiquark q̄,
and thus two “white” mesons are formed, in which the colors neutralize one another, thus keeping
the total color invisible to an external observer
baryons, such as the nucleons, and mesons). This gave rise to a totally new idea, that
of the confinement of color, according to which the colors of the quarks always com-
bine (just like the primary colors) to produce a “white” hadron, that is, without color.
Each time a quark is ripped out of a hadron, this breaks the flow tube that connects
it with another, and thus two mesons are produced (Fig. 1.6).
It is currently believed that this property of quarks (and gluons) is contained in the
theoretical description, since there are numerical simulations that demonstrate con-
finement. But there is also another peculiarity of the theory: at very short distances,
inside the hadrons, the quarks and gluons seem to be free, that is, they do not “feel”
the interactions between them. In fact, we can define these short distances or long
distances by using the relativistic definition of the relation between momentum and
energy, i.e., E = pc, of the incident particle. From the uncertainty relation (1.1), we
have immediately that the distances reached by the projectile particle are inversely
proportional to its energy, Δx ∼ ℏc/E. The "long" distances can be considered as those
greater than the radius of the proton, while the small ones are much smaller than this
radius. The behavior of the quarks in the first case is called infrared slavery (low
incident energies) and in the second asymptotic freedom (high incident energies).
This behavior can be simulated by the phenomenological potential
V(r) = −αS/r + kr .

At short distances the first, attractive term dominates, but if we consider large
distances the potential grows and it will be impossible to extract a quark. Another
approach, which we will describe below, draws a physically reasonable picture and is simple to calculate, whence it has been widely used. It is known as the
MIT bag model (see Fig. 1.7).
To simulate the effects of confinement, the model admits that the true vacuum is a
perfect dielectric for the color charge, and particles that carry color do not penetrate
it. Thus, the model proposes to consider a “perturbative vacuum” cavity in the true
vacuum where quarks and gluons can live. Energy is required to create this cavity, in an amount proportional to its volume and parametrized by the bag constant B. Adding the kinetic energy of the N massless quarks confined inside, the total energy is

E_tot = E_vac + E_k = (4/3)πR³B + 2.04N/R . (1.4)
The equilibrium configuration must be a minimum of the total energy, found by varying it with respect to the radius R:

∂E_tot/∂R = −2.04N/R² + 4πR²B = 0 , (1.5)
a condition that determines the cavity radius to be

R = [2.04N/(4πB)]^{1/4} . (1.6)
Fig. 1.8 Basic phase diagram of QCD. The QGP sits in the upper and right regions, accessible in
heavy-ion collisions (arrow trajectories), the early Universe (along the vertical axis), and neutron
stars (cold quark matter), as indicated. Courtesy of Mark Alford and Alan Stonebraker, Washington
University
Many textbooks begin with a discussion of the classical gravitational force between
two macroscopic masses m₁ and m₂:

F_G = −G_N m₁m₂/r² , (1.7)

where G_N is Newton's gravitational constant and r the distance between the masses.
With a view to formulating the description of gravitation as an elementary interaction,
where the particles exchange a “graviton” (Table 1.1) and from which the law of
Newtonian gravitation should result, dimensional analysis of Newton’s equation
shows that G_N is not dimensionless, whereas a dimensionless coupling is a fundamental requirement for the construction of an elementary theory. That is why a "gravitational fine structure constant" is
commonly defined by αG = G_N m_p²/ℏc on an energy scale equal to the mass of the
proton (the quantity that appears in Table 1.1). But owing to the tiny numerical value αG ∼ 10⁻³⁸, gravitation can almost always be ignored compared to the other forces
of Nature, at least as long as we talk about elementary processes. The question that
arises is: why is it then that gravitation dominates the structure of the observable
Universe, stars, and galaxies? The simplest answer is to be found in the unique
nature of the “charge” of the gravitational field, which is just mass: if macroscopic
sets of particles are considered, gravitation “accumulates” until the structure itself
is dominated by it, while the other forces cancel each other out as we consider more
and more particles. Let us consider quantitatively N particles of equal mass. The radius of a sphere formed by this set of particles scales as N^{1/3}, while the gravitational binding energy per particle is proportional to N^{2/3}. To compensate for the smallness of the factor 10⁻³⁸ in the constant αG, the number of particles required must be N = 10^{38×(3/2)} ≈ 10⁵⁷. This is approximately the number of particles (protons) in a
star like our own, with mass denoted by M⊙, and results in the "natural" scale where
gravitation becomes more important than the other forces at a macroscopic scale (in
fact we know that the Sun, for example, does not have a large contribution to its
binding energy from strong, weak, and electromagnetic interactions) [7].
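The estimate above is easy to reproduce. A minimal sketch in Python, using standard values for the constants (quoted here from memory, so treat them as approximate):

alpha_G = 5.9e-39          # G_N m_p^2 / (hbar c), dimensionless
N_crit = alpha_G**(-1.5)   # gravity wins when alpha_G * N^(2/3) ~ 1

M_sun = 1.99e33            # g
m_p = 1.67e-24             # g
print(f"N_crit ~ {N_crit:.1e}")       # ~2e57
print(f"N_sun  ~ {M_sun / m_p:.1e}")  # ~1.2e57, the same ballpark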
This discussion leads to the conclusion that we can neglect gravitation in micro-
scopic systems, unless the energy scale grows as much as to make αG ≈ 1. Under
these conditions, microscopic gravitation would be as important as the other funda-
mental interactions. The mass where this equivalence occurs is
m_Pl = (ℏc/G_N)^{1/2} , (1.8)
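Evaluating (1.8) with CGS values of the constants gives the famous Planck mass; the short sketch below (Python; the constants are standard values) also converts it to an energy scale:

import math

hbar = 1.055e-27   # erg s
c = 2.998e10       # cm/s
G_N = 6.674e-8     # cm^3 g^-1 s^-2

m_Pl = math.sqrt(hbar * c / G_N)
print(f"m_Pl ~ {m_Pl:.2e} g")                      # ~2.2e-5 g
print(f"E_Pl ~ {m_Pl * c**2 / 1.602e-3:.1e} GeV")  # ~1.2e19 GeV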
In the 19th century, thanks to contributions from Maxwell, Faraday, and others, Elec-
tromagnetism was established as a theoretical paradigm for the study of phenomena
involving electric charges in the laboratory. The discovery of the electron by J.J.
Thomson in 1897 (the quantum of electric charge par excellence) provided a way
to “penetrate” the atom by throwing electrons at it, and later to discover the atomic
nucleus using helium nuclei (also electrically charged) as projectiles. The observa-
tion of the behavior and composition of atomic nuclei then opened an important
window in the study of elementary particles.
By the 1920s, the proton had been identified as a component of the Rutherford
nucleus. A series of experiments showed that, under certain circumstances, a nucleus
could change its state of charge, with the expulsion of an electron from the nucleus.
Thus, there were two possibilities: either the atomic nucleus contained electrons,
or they were emitted by a particle decaying into a proton and an electron. This last
hypothesis received definitive confirmation when Chadwick discovered the neutron
in 1932. It was found that neutrons could spontaneously convert into protons, either
when free or within the nucleus, whence Nature could change the type of nucleon
that constituted the nucleus under certain conditions.
It also became clear that the observed conversion was not of electromagnetic
origin (although the electric charge was conserved). Physicists thus sought the origin
and nature of the force responsible. In the first place, it had to be a short-range force
because the reaction takes place mainly on scales of the order of the atomic nucleus.
The characterization of the strength of this force also emerged from the data, and
turned out to be several orders of magnitude weaker than the electromagnetic force
(see Table 1.1). Thus, the discovery of weak forces associated neutron decay with a
new fundamental interaction:
n → p + e− + ν̄e , (1.9)
where the neutron and proton were still part of the nucleus, and the electron escaped
from the nuclear region. The last protagonist here, in fact an anti-neutrino, was not
observed at first, but was postulated by W. Pauli to solve two serious problems with
this decay: the conservation of energy and the conservation of angular momentum
in the reaction. In fact, in spontaneous decay, such as was observed for neutrons within nuclei, the total angular momentum did not seem to be conserved, since the neutron spin (1/2) cannot be obtained by combining the spins of the observed reaction products, a proton of spin 1/2 and an electron of spin 1/2, which always add up to an integer value. Moreover, the sum of the
energies of the particles taking part in the reaction was not constant. Nobody wanted to abandon the conservation of energy and angular momentum in Physics, and
this is what inspired Pauli’s creative solution to this problem.
In fact, he postulated a neutral particle that had to be very light or of zero mass,
with the necessary spin (1/2), to restore the conservation of both quantities. Basically,
one could describe this hypothesis as the emission of a spin quantum. The important
thing was to restore the conservation laws.
The elementary theoretical description of decay (corresponding to the simplest
theory) was formulated by Fermi from 1933, and can be visualized in terms of Feyn-
man diagrams. The basic diagram is the one shown in Fig. 1.9, which corresponds
to the reaction in (1.9). The formal mathematical description incorporated a new
term that led to the violation of parity, that is, the invariance of the processes under
a change of coordinates (x, y, z) → (−x, −y, −z). The discovery that some weak
interactions are not the same when “viewed in the mirror” (such is the meaning of
the above sign change transformation) was confirmed experimentally in the 1950s
by observing specific particle decays, and it led to the 1957 Nobel Prize for T.D. Lee
and C.N. Yang, although it was C.S. Wu who conducted the crucial experiment that
positively demonstrated this effect [5].
The reverse decay process p + e− → n + νe , which involves emission of a neu-
trino, happens under physical conditions in which electrons can be captured by
protons with energy gain (for example, near the end of the life of a star, see Chap. 4).
Fermi’s theory is oversimplified, but sufficient for low energies, and it is still used
today in these cases. When gauge theories emerged, with the theoretical work of
the 1960s and after, it became clear to physicists that Fermi’s theory was actually a
simplified version of one of them. Remembering that, for low energies, the spatial
and energy “resolution” of the experiment is not sufficient (or equivalently that the
mediator is much more massive than the energy of the measurement), we conclude
that the decay shown in Fig. 1.9 can be thought of effectively as a diagram where, in
the central “point”, a mediator (W± or Z0 in Table 1.1) is emitted and then decays
into the final pair. With this idea, we see that the emission and reabsorption vertices
coincide at this central “point”, whence Fermi’s theory works at very low energies.
However, it should be remembered that it is an approximation. The mass of the Z0
boson determined at CERN in the 1980s is approximately 90 GeV, and this value jus-
tifies a posteriori the original data regarding the range of the weak force [see Fig. 1.3
and (1.3)]. Using the uncertainty relation ΔE × Δt ≥ ℏ and inserting the value of the mass (energy) of the Z0, we obtain a limit for the average lifetime of about 7 × 10⁻²⁷ s, which corresponds to a maximum range for the interaction of Δt × c ≈ 2 × 10⁻¹⁶ cm, i.e., well below the radius of a nucleon.
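These two numbers follow directly from the quoted mass; a minimal check in Python (90 GeV is the rounded Z0 rest energy):

hbar = 6.582e-25   # GeV s
c = 3.0e10         # cm/s
m_Z = 90.0         # GeV

dt = hbar / m_Z    # allowed lifetime of the virtual Z0
print(f"lifetime ~ {dt:.1e} s, range ~ {c * dt:.1e} cm")
# ~7e-27 s and ~2e-16 cm, well below the nucleon radius ~1e-13 cm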
The progress of research in the 1970–80s led to the consideration of a variety of
weak reactions of astrophysical (and cosmological) interest. However, in all of them,
the tiny value of the cross-section, a direct consequence of the small value of the Fermi
constant in Table 1.1, implies that high temperatures and/or densities are required for
the weak interactions to be important for large particle sets. Above a certain scale,
Fermi’s theory is no longer valid and needs to be replaced by the corresponding
expressions of the Salam–Weinberg model, the gauge theory developed for these
purposes, in which symmetry breaking plays a fundamental role in obtaining the
trio of massive bosons (W± , Z0 ) that lead to weak interactions, while the photon γ
remains massless, as it should to mediate the electromagnetic interactions.
We have already said that, since their discovery, weak interactions have posed
serious problems to physicists. For example, Pauli’s introduction of the neutrino was
a bold hypothesis, but in the end proved to be correct. Due to its “ghostly” nature,
it was very difficult to study and characterize the neutrino experimentally. Spin and
momentum (energy) are its only two characteristics, and there are only two dynamic
possibilities: either the spin s is opposed to the direction of the momentum k, or it
is in the same direction (Fig. 1.10). These two cases correspond to the particle and
antiparticle, respectively, since they cannot convert one into the other in the absence
of mass. Thus, in the first case, the particle was called a neutrino and in the second,
an anti-neutrino, both names being due to Enrico Fermi, who used the diminutives
of “neutron” in Italian when he coined the terms.
Fig. 1.10 Neutrinos and antineutrinos. If their mass is zero, neutrinos and antineutrinos cannot
be confused, since the former always have their spin in the opposite direction to their momentum
(right), while the latter always have their spin in the same direction as their momentum (left).
Oscillations between the two types are possible when there is a (small) mass, and this has been
identified as the cause of the solar neutrino problem, discussed in Chap. 9
It is important to point out that the relative direction of the spin and momentum
is an affirmation of “absolute” character, since the projection (called the helicity)
is invariant under conjugation of charge followed by parity inversion (a CP trans-
formation). A zero-mass neutrino is described by a two-component function called
a spinor, but this is not so for finite mass particles. We still do not know whether
massive neutrinos require two or four components for their description (although it
is agreed that their mass is not zero, because of the consistent detection of fewer solar
neutrinos than expected over the years). We will see in Chap. 9 how it has been pos-
sible to develop neutrino Astrophysics from a basic knowledge of the cross-sections
and source fluxes, while the question of mass, now revealed by oscillations, still holds the attention of the community today.
References
1. Lucretius, Lucretius the Way Things Are: De Rerum Natura, translated by Rolfe Humphries
(Indiana University Press, Indianapolis, 1968)
2. P. Gassendi, Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/gassendi/
3. D. Bohm, Wholeness and the Implicate Order, 1st edn. (Routledge, New York, 2002)
4. M. Bunge, Philosophy of Physics (D. Reidel, Dordrecht, 1973)
5. T.D. Lee, Symmetries, Asymmetries, and the World of Particles (Jessie & John Danz Lectures)
(University Washington Press, Washington, 1987)
6. C.M.G. Lattes, My work in meson Physics with nuclear emulsions, in 1st International Sym-
posium on the History of Particle Physics, eds. J. Bellandi Filho and A. Pemmaraju. Topics on
Cosmic Rays, 1, pp. 1–5 (1981)
7. A.S. Burrows, J.P. Ostriker, Astronomical reach of fundamental Physics. Proc. Natl. Acad. Sci. 111, 2409 (2014)
Chapter 2
Elementary Processes at High Energies
The long and fascinating history of the study of light and electromagnetic
phenomena (unified through Maxwell’s equations) went through several stages
before we reached the contemporary view. Without going into the details of ear-
lier thought, an important founding contribution at the beginning of the 18th century
was the publication of Newton’s ideas in his book Opticks (1704), where Newton
challenged the accepted view of the nature of light which went back to Aristotle’s
time, thus laying the foundations for extensive further debate. Newton defended a
mechanistic framework, arguing that light was composed of material corpuscles, bas-
ing his scientific deductions on a series of experiments, including his famous example
of the chromatic decomposition of light when it passes through a prism. Elementary
scattering, absorption and emission processes involving photons are presented with
an eye for their application in High-Energy Astrophysics.
This work had great impact, and led the way to important developments. In fact,
almost a century after Newton’s publication, Young and Fresnel carried out some
crucial experiments (for example, the double slit setup), and somehow combined
Newton’s ideas with Huygens’ wave description. These works are considered by
many to be the birth of modern Optics. It should be pointed out that this wave–
corpuscle duality occurred in the theory of light, but it took another century before
the works of Kirchhoff, Rayleigh, Jeans, and others on the emission and absorption of
light led to the (apparent) dead end that P. Ehrenfest called the ultraviolet catastrophe,
and which motivated a genuine revolution in Physics, where this problem of the nature
of light resurfaced with strength [1].
The problem under consideration was the so-called black body radiation, that is,
the study of the light emitted by a body heated to a temperature T . Physicists were
interested in the distribution of energy and in the dependence of the radiated flux
on temperature, since the experiments showed that the composition of the body was
irrelevant, i.e., bodies with different compositions radiated in the same way if heated
to the same temperature T. With the idea of calculating these quantities, physicists
considered the energy density of a radiant body, ε(ω), as a function of the frequency
ω of radiated light (corresponding to the Huygens–Fresnel waves). Analysis of the
classical problem of the frequencies of the waves inside a cavity that contains the
radiant body indicates that the energy should reach what is known as equipartition,
that is, a situation in which each frequency or mode has an average energy E = kB T (kB T/2 per quadratic degree of freedom), where kB is the Boltzmann constant. Thus, multiplying this energy by the density of modes
between ω and ω + dω, we would have
ε(ω) dω ∝ ω² kB T dω , (2.1)
implying that the energy would grow without limit for high frequencies, which is
physically impossible because the amount of energy radiated cannot be infinite. This
inconsistency (or “catastrophe”) pointed to some error in the basic hypotheses that
needed to be clarified and corrected.
The interesting twist here is that the decisive idea to find a physical solution
had already been expressed by Max Planck. For very different reasons, Planck had
considered that the absorption and emission of radiation would happen discretely, in
“packages” (or quanta) that satisfy the following relationship between energy and
frequency [2]:
E = hν . (2.2)
When applied to the black body problem, this expression leads to a distribution that
does not diverge for large ω. In terms of the frequency ν = ω/2π , this distribution
is
B(ν, T) = (8πν²/c³) hν/[exp(hν/kB T) − 1] , (2.3)
which is very different from the problematic "classical" form (8πν²/c³) kB T.
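The contrast between the two forms is easy to see numerically. A short Python sketch (CGS constants; T = 5800 K, roughly the solar photosphere, is an arbitrary illustrative choice):

import math

h, c, k_B = 6.626e-27, 2.998e10, 1.381e-16   # CGS units

def planck(nu, T):
    # spectral energy density (2.3)
    return (8 * math.pi * nu**2 / c**3) * h * nu / math.expm1(h * nu / (k_B * T))

def rayleigh_jeans(nu, T):
    # classical form, divergent at high frequency
    return (8 * math.pi * nu**2 / c**3) * k_B * T

T = 5800.0
for nu in (1e13, 1e14, 1e15):
    print(f"nu = {nu:.0e} Hz: Planck {planck(nu, T):.2e}, classical {rayleigh_jeans(nu, T):.2e}")
# The two agree when h nu << k_B T and separate at high nu,
# where the exponential cuts off the Planck curve.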
The solution that avoids this “catastrophe” was thus implicit in Planck’s hypothe-
sis, even before it was calculated. Planck had worked on the notion of discrete pack-
ages without really believing in its physical reality, and had never been convinced
of their factual existence right up until his death. It was Albert Einstein himself
who elevated Planck’s hypothesis of discrete quanta to the category of real physical
objects, and hence very different from a mere mathematical trick [2]. Einstein was
thus the “father” of quantum theory (although he never liked the consequences of
the probabilistic interpretation that Bohr, Born, and others later developed) giving
rise to the quantum of light, the photon, which satisfies (2.2). The application of this
idea to another important problem, the photoelectric effect, not only confirmed the
relevance of this approach, but also guaranteed Einstein the Nobel Prize in Physics of
1921. We will discuss the photoelectric effect and other processes involving photons in what follows, in order to understand the instrumental developments that led to the growth of high-energy Astrophysics [3].
2.2 Processes Involving Photons at High Energies (Absorption and Scattering)
With the discovery of the photon or quantum of light, our perspective of the interaction
of light with matter has undergone an important change. Although many phenomena
were still well described with the wave formalism, the most elementary processes
were better thought of as interactions between two particles, e.g., an electron, proton,
etc., and a photon. This is not to say that light has ceased to be a wave phenomenon,
nor that the idea of the quantum of light led to contradictions with previous results, but
that in the microscopic world assigning a wave or particle character to an elementary
quantum results in an inadequate and even confusing picture. We do not have any
direct experience of the quantum world, and we have a (human) tendency to imagine
that quantum objects should behave like something we do know from the macroscopic
world, namely, waves and corpuscles.
With this quantum perspective, the study of absorption and scattering processes
that involve photons interacting with matter has produced many important results,
of direct application to the understanding of astrophysical processes and sources as
we will see later. We now review some of these developments from the first half of
the 20th century, then discuss the processes in which photons are emitted.
The young Albert Einstein carried out his research alone for several years while
working in the Bern Patent Office, and besides coming up with the celebrated Special
Theory of Relativity, he produced a simple explanation of the photoelectric effect
using the bold hypothesis of the quantum of light. The basic observation that led
Einstein to the latter was that electrons are ejected when a metal plate is illuminated
with monochromatic light (a result that was already familiar to physicists at the time)
and that there was a maximum speed of ejection that depended on the metal (Fig. 2.1).
What Einstein did next was to take Planck’s photon hypothesis, which he had
adopted as the true physical description of light, to its ultimate conclusion. In fact,
Einstein’s analysis began by admitting that there is a minimum amount of energy
that must be delivered to pull an electron from the metal. This he called the work
function W , which differs for each metal and results mainly from the action of the
electrostatic forces that keep the electrons attached to it. The light that falls on the
plate is assumed to be made up of photons of quantized energy hν. Thus, the energy
available to accelerate the ejected electron is
E = hν − W . (2.4)
The energy of the emitted electron is then measured using a voltmeter that registers
a maximum voltage Vmax , whence a maximum energy eVmax is observed, as already
Fig. 2.1 a Photoabsorption cross-section for various materials as a function of energy. b Diagram
showing the emerging fluorescent radiation when the incident photon energy is high enough
pointed out. To check this physical picture, it is enough to vary the frequency of the
incident light and measure the Vmax values for each metal that has a fixed W , that is,
eVmax = hν − W . (2.5)
A simple graph of the relation (2.5) made with the directly measured values of these
physical quantities can be used to determine h/e and thus establish the very “core”
of the quantum hypothesis as a verifiable corollary. Not only is the linearity of the
relation shown to be accurate, but the numerical value of the Planck constant can be
determined. In fact, Robert Millikan measured the latter with an accuracy of around
0.5% after independently determining the charge of the electron, and later received
the Nobel Prize in Physics in 1923 for that work [4].
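The logic of the measurement can be mimicked in a few lines. The sketch below (Python with NumPy) fabricates mock stopping-voltage data from (2.5) with a hypothetical work function, then recovers h/e as the slope of a straight-line fit, which is essentially the strategy of Millikan's experiment:

import numpy as np

h_true, e = 6.626e-34, 1.602e-19   # SI values; h_true is used only to build mock data
W = 3.6e-19                        # J, hypothetical work function (~2.2 eV)

nu = np.linspace(0.8e15, 1.6e15, 8)   # Hz
V_max = (h_true * nu - W) / e         # equation (2.5)

slope, intercept = np.polyfit(nu, V_max, 1)
print(f"h = slope * e = {slope * e:.3e} J s")   # recovers ~6.6e-34
print(f"W = {-intercept * e:.2e} J")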
We can now explain the relevance of the photoelectric effect to high energy Astro-
physics. As we will see below, if the photon energy is very high, other processes will
be important, but for energies close to the typical value of the work function, pho-
toelectric absorption can be used efficiently for shielding effects (for example, to
protect astronauts from radiation). The material used must be heavy, since the opti-
cal depth, a measure of the probability of the photon interacting with the material
and being absorbed, is given by
τ = ∫ σ NA dl ∝ Z⁴/(hν)³ , (2.6)
where the cross-section σ is that of Fig. 2.1 (which shows the “jumps” due to the
electronic layers), N A is the number density of material A, and dl the differential
distance along the line of sight. It is clear that for shielding, for example, against
gamma rays (E ≥ 100 keV), it is best to use material with a high Z value, such as
lead.
In real processes, when the energy E is very high, photoabsorption is followed
by fluorescence, since the electrons thrown out by the photon are from the innermost
layers (for example, from the K layers), and soon other electrons de-excite, occupying
the state of the one that was ejected and emitting the energy difference, also in the
form of photons.
In Astrophysics, photoabsorption is often referred to as a bound–free process,
with reference to the initial and final states of the ejected electron, regardless of the
subsequent fluorescent emission that may or may not be present.
The study of the properties of light described in Chap. 1 saw important development
in the early 20th century with the study of X-rays discovered by Röntgen in 1895.
Their name already indicated a total ignorance of their true nature, but it soon became
clear that they were actually highly energetic photons. The work of Barkla, Von Laue,
and Bragg had shown that the X-rays are scattered in matter, but contrary to the
prediction of classical Electromagnetism, their frequency also changes (besides their
being deviated from the direction of the original beam). In 1923, Compton published
a study in which he attributed a momentum to the quanta of light (photons) as if
they were material particles, in total harmony with Einstein’s initial ideas. Thus, the
collision of a photon with an electron, impossible in classical theory, gained reality
in the new Quantum Physics initiated by Max Planck [4]. In Compton's work, the
hypothesis of the momentum of the photon was consistent with (but not inspired by)
the ideas of Einstein and Planck, and he proceeded to demonstrate that the frequency
of the initial radiation would have to change when it collided with the electrons of
a gas, using only the conservation of the momentum and the energy in the collision,
as schematized in Fig. 2.2.
Since the Compton process must satisfy conservation of energy and of the two
components of the momentum of Fig. 2.2, we obtain the following three conditions:
m_e c² + hν = E + hν′ , (2.7)

hν/c = (hν′/c) cos θ + m_e γv cos φ , (2.8)

(hν′/c) sin θ − m_e γv sin φ = 0 . (2.9)
This is a system of algebraic equations that can be solved to find ν′, θ, and φ. After some algebraic manipulations and setting λ = c/ν, we have

λ′ − λ = c(1/ν′ − 1/ν) = 2λC sin²(θ/2) , (2.10)
where the quantity λC = h/m_e c is the Compton wavelength of the electron. Numerically, λC ≈ 2.4 × 10⁻¹⁰ cm (about 0.024 Å) is a very small number, and so the change in the frequency of light is also small. However, it is clear that λ′ − λ > 0 and the photons lose
Fig. 2.2 Basic diagram of the Compton effect. A photon of initial energy hν interacts with an
electron at rest (in its own reference system) and is deflected through an angle θ from the initial
direction, besides changing its energy to hν′. The electron acquires a certain velocity, with direction
characterized by another angle φ. The problem here is to calculate the change in wavelength of the
photon as a function of the angle, in order to compare with experimentally observed results
energy which was transferred to the electrons in the scattering process, thus known
as inelastic scattering.
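A short sketch of (2.10) in Python, using the electron Compton wavelength quoted above:

import math

LAMBDA_C = 2.43e-10  # cm, electron Compton wavelength h/(m_e c)

def compton_shift(theta_deg):
    # wavelength change (2.10)
    return 2.0 * LAMBDA_C * math.sin(math.radians(theta_deg) / 2.0)**2

for th in (30, 90, 180):
    print(f"theta = {th:3d} deg: delta_lambda = {compton_shift(th):.2e} cm")
# maximum shift, for backscattering, is 2 lambda_C ~ 4.9e-10 cm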
This process can be considered in the low (kB T ≪ m_e c²) or high (kB T ≫ m_e c²) energy limits. In the first, the cross-section cannot be sensitive to the energy of the photon, since it only "sees" the electron as a target with area of order re², where re is the classical electron radius, related to the Compton wavelength by the fine structure constant re = αλC. The cross-section in the low energy limit (called the Thomson limit) should be σ ∝ re². The precise expression is
σT = (8π/3)(e²/m_e c²)² , (2.11)
as claimed, apart from the numerical pre-factor 8π/3. In the opposite, high energy
limit kB T ≫ m_e c², the process depends on the incident energy and must be calculated
using Quantum Electrodynamics. This goes far beyond the scope of our discussion,
but we can quote the final result, the so-called Klein–Nishina cross-section (ultra-
relativistic limit), which decreases with energy and is asymptotically smaller than
σT (Fig. 2.3).
An important property of (2.11) is its frequency independence. As a consequence,
within a certain range of energies, the Compton (Thomson) scattering will be the
same for any incident photon.
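Numerically, (2.11) gives the famous value σT ≈ 6.65 × 10⁻²⁵ cm²; a minimal check in Python (CGS constants):

import math

e, m_e, c = 4.803e-10, 9.109e-28, 2.998e10   # esu, g, cm/s

r_e = e**2 / (m_e * c**2)                 # classical electron radius, ~2.82e-13 cm
sigma_T = (8.0 * math.pi / 3.0) * r_e**2
print(f"sigma_T ~ {sigma_T:.2e} cm^2")    # ~6.65e-25 cm^2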
In Astrophysics, we often have to deal with the inverse Compton effect, where an
ultra-relativistic electron collides with a low-energy photon. This case has analogous
mathematical expressions, and the desired result can be obtained by a simple Lorentz
transformation between reference frames. As an example of this situation, Fig. 2.4
Fig. 2.3 Behavior of the cross-section in the Klein–Nishina limit. The decrease is clear for
kB T ≫ m_e c² (right-hand region on the axis)
Fig. 2.4 Diffuse gamma emission from the center of the Milky Way for energies E > 100 MeV
[5]. The measured spectrum requires the injection of protons of energy at least 10¹⁵ eV, possibly
by the supermassive black hole Sgr A∗ in periods of past activity. A fraction of the emission is
not associated with the inverse Compton effect, but the latter is highly dominant. Credit: HESS
Collaboration
shows the diffuse emission from the center of our galaxy attributed to the inverse
Compton effect by ultra-relativistic protons accelerated and injected by compact
sources in the region of the central source Sgr A∗ .
Pair creation is another key process at high energies, rooted in Einstein's celebrated relation

E = mc² , (2.12)
a relation which, together with the idea of antiparticles in the quantum realm, led to
the following concept. Consider a (real) photon with high enough energy E. Equation
(2.12) would allow this photon to give rise spontaneously to a particle–antiparticle
pair by converting its energy into the pair’s mass. Such a process is illustrated in
Fig. 2.5.
If ω is the frequency of the photon and γ = 1/√(1 − (v/c)²) is the Lorentz factor, we can immediately write down the conservation of energy and momentum at the vertex:

ℏω = 2γ m_e c² , (2.13)

2γ m_e v = ℏω/c . (2.14)
As the initial momentum of the photon is ℏω/c, and (obviously) v < c, it is impossible
to satisfy both conditions at once. Thus, the conversion of a photon into a pair cannot
happen. In order to satisfy the conservation laws what is needed is (1) a second
photon annihilating itself with the first one in the initial state (which is possible but
requires a radiation field with an enormous density) or (2) another agent that plays
the role of absorbing the additional momentum (usually a nucleus, Fig. 2.6).
In the presence of a Z -charged nucleus, we can consider the limit of low energies
by comparing the photon energy (in units of the energy of an electron with zero
momentum) with a dimensionless quantity related to the Coulomb energy. The “low”
energies are those that satisfy the condition ℏω/m_e c² ≪ 1/αZ^{1/3}, or physically those
where the photon has enough energy to create the pair, but not enough to trigger more
complex effects related to the Coulomb field.
Under these conditions, an elaborate calculation using Quantum Electrodynamics
shows that the cross-section for pair production is
σpair = α re² Z² [(28/9) ln(2ℏω/m_e c²) − 218/27] . (2.15)
The things to note in this cross-section (2.15) are the quadratic dependence on the
charge of the nucleus Z and the monotonic growth with the energy [3]. We have
already seen that the Compton effect does not depend on the energy in the low
energy limit and that the cross-section decreases at very high energies. Thus, it is
inevitable that there should be a threshold above which pair production dominates
the total cross-section of the photon interactions with matter.
In the limit of high energies ℏω/m_e c² ≫ 1/αZ^{1/3}, the cross-section has a similar
expression, with a weak dependence on the charge Z . This regime is hardly relevant
and we shall not show the corresponding expressions.
Given the three processes discussed above, we can visualize the total result by intro-
ducing the linear attenuation coefficient as follows. We consider a beam initially
with N0 photons crossing a material of density ρ. There is a certain probability of
interaction with the matter, which means that on average N of them will survive after
traveling a distance x, where
N = N0 e^{−μx} , (2.16)

with μ the linear attenuation coefficient, which collects the contributions of the processes discussed above.
As a result of the different energy dependencies, the curves for the attenuation
coefficient per unit mass behave as shown in Fig. 2.8. Although calculated for Pb,
the shape is similar for any other element or compound substance. Thus, we can
discuss the construction of photon detectors, whose interactions with matter we now
know, and choose materials and configurations according to the range of energies
we wish to observe. Essentially all these detectors need to be operated in space,
since attenuation by the atmosphere, described by the same formalism, is otherwise
inevitable. Note that all instruments already built and operated have used one of the
forms of absorption discussed above.
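As a minimal sketch (not from the text) of the attenuation law (2.16); the value of μ assumed below is only an order-of-magnitude figure for ~1 MeV photons in lead:

import math

def surviving_fraction(mu_per_cm, x_cm):
    # Eq. (2.16): N/N0 after crossing a thickness x.
    return math.exp(-mu_per_cm * x_cm)

# With mu ~ 0.7 cm^-1, one centimeter of material transmits about half the beam:
print(surviving_fraction(0.7, 1.0))   # ~0.50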
2.3 Relevant Processes for High-Energy Photons (Emission)
While we have been discussing the fate of photons that find matter in their path, it is
equally important to worry about the emission of photons by matter. In fact, this is
precisely what we observe from high energy astrophysical sources. Thus, knowing
the mechanisms of radiation emission is equivalent to obtaining a diagnosis of the
physical conditions of the environment in which it originates.
Basically, the emission processes are classified as coherent or incoherent. The
former have a very particular physical characterization: the particles emit collectively, and the final amplitude reflects this collective character, so that the intensity is proportional to the square of the number of emitters, N². Incoherent processes, on the other hand,
sum up random amplitudes of particles that emit individually, without correlation
with their neighbors. We can study different astronomical sources by understanding
the types of emission they produce.
Black body radiation is the most basic example of incoherent emission. The term
“black body” is the name used for a perfect emitter (and absorber), regardless of the
actual “color” presented [4]. In fact, we have already considered this concept when
discussing the origin of the quantization of light (photon). The spectral distribution
per unit frequency is given by (2.3). An inspection of this expression (or in fact,
a simple calculation) shows that there is a maximum of the function B(ν, T ) for a
certain frequency proportional to k_B T, and that the wavelength where the maximum occurs for a given temperature satisfies
\[ \lambda_{\max}\, T = 0.29\ {\rm cm\,K} . \tag{2.17} \]
This is known as Wien's displacement law. This expression shows that the wave-
length of the dominant radiation (i.e., the one most present in the incoming light) is
inversely proportional to the physical temperature of the body. As the wavelength
and frequency are inversely proportional, astrophysicists say that the body is “bluer” when its temperature increases, since the emitted radiation is typically more energetic (or “harder”, in the usual jargon).
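A one-line numerical illustration (not from the text) of Wien's law (2.17):

def lambda_max_cm(T_kelvin):
    # Wien's displacement law, lambda_max * T = 0.29 cm K (cgs).
    return 0.29 / T_kelvin

# A plasma at T = 1e7 K peaks near 2.9e-8 cm ~ 3 Angstrom, i.e., in X-rays:
print(lambda_max_cm(1e7))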
One of the most important characteristics of black body radiation is that the
emission is independent of the composition, and is proportional to the fourth power
of the temperature. In other words, the energy per unit area and time (flux F) is given
by
\[ F = \sigma T^4 , \tag{2.18} \]
where the proportionality constant σ = 5.67 × 10⁻⁵ erg cm⁻² s⁻¹ K⁻⁴ is called the Stefan–Boltzmann constant. As the emission is isotropic, the power radiated by a spherical body of radius R, often called the luminosity, is L = 4πR²σT⁴. Note that we will use the name “luminosity” instead of “power” (the more common term in Physics courses): the two are exactly the same quantity, but the former is standard in Astrophysics.
In most real situations, we will only have an idea of the luminosity of a source
if there is a reliable estimate of its distance, since the flux is directly measurable
by collecting the radiation that arrives from the object and identifying where the
maximum emission is, as already discussed. As it is never possible to measure across the whole spectrum, we will almost certainly need to estimate how much we are leaving out, especially in the high-energy bands, where a large part of the radiation may flow outside the spectral range covered by our instruments.
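A minimal sketch (not from the text) of the flux-to-luminosity conversion, assuming isotropic emission, L = 4πd²F, with an example distance typical of a galactic source:

import math

KPC_CM = 3.086e21   # one kiloparsec in cm

def luminosity_cgs(flux_cgs, d_kpc):
    d = d_kpc * KPC_CM
    return 4 * math.pi * d**2 * flux_cgs

# A source with measured F = 1e-9 erg cm^-2 s^-1 at an (assumed) 8 kpc:
print(f"{luminosity_cgs(1e-9, 8.0):.1e} erg/s")   # ~8e36 erg/s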
Consider now bremsstrahlung (free–free) emission: an electron passing a nucleus of charge Z at a distance x feels the Coulomb force
\[ F = ma \approx -\frac{Z e^2}{x^2} , \tag{2.20} \]
that is, the magnitude of the acceleration is a ≈ Z e2 /mx 2 . Although the actual
collision lasts quite a long time, the deceleration of the electron is effective only
when it is very close to the nucleus, i.e., for a time Δt ∼ 2b/v. The emitted power
will be significant only during this time, and we can disregard the rest of the total
collision time. This emitted power can be calculated using the Larmor formula, viz.,
\[ P = -\frac{dE}{dt} = \frac{2}{3}\frac{e^2}{c^3}\, a^2 . \tag{2.21} \]
Taking into account the effective duration already indicated, we see that the emitted
radiation consists of a “pulse” with energy
\[ P\,\Delta t = \frac{2}{3}\frac{e^2}{c^3}\left(\frac{Z e^2}{m b^2}\right)^2 \times \frac{2b}{v} = \frac{4}{3}\frac{Z^2 e^6}{c^3 m^2 b^3}\frac{1}{v} . \tag{2.22} \]
The more precisely a pulse is localized in time, the broader its frequency distribution (a result due to Fourier): the emission extends up to a limiting frequency ν_max determined by the reciprocal of the pulse duration. The situation is illustrated in
Fig. 2.9.
This value νmax corresponds to the maximum energy of the colliding electron. If
we divide (2.22) by the frequency interval Δν, we obtain the energy radiated per frequency interval in each collision:
\[ \frac{P\,\Delta t}{\Delta\nu} \approx \frac{P\,\Delta t}{\nu_{\max}} = \frac{16}{3}\frac{Z^2 e^6}{c^3 m^2}\frac{1}{b^2 v^2} . \tag{2.23} \]
These results refer to a single electron, but in realistic situations we must consider
a set of electrons and ions with a distribution of initial energies. Let us consider the
case of a “cloud” composed of electrons and ions (the latter all with charge Z for
simplicity). Between two impact parameters b and b + db there will be a number of
ions equal to 2π n Z vbdb, where n Z is the number density of the ions (Fig. 2.10)
As the collision processes are totally independent from each other, the differen-
tial of the total number of collisions can be obtained by multiplying the previous
expression by the number density of the electrons n e , i.e., 2π n e n Z vbdb. Thus, we
can integrate over all “rings” of radius b and obtain the total emissivity
\[ I = 2\pi n_e n_Z v \int_{b_{\min}}^{b_{\max}} \frac{16}{3}\frac{Z^2 e^6}{c^3 m^2}\frac{1}{b^2 v^2}\, b\, db = \frac{16\pi}{3}\frac{Z^2 e^6 n_e n_Z}{m^2 c^3 v} \int_{b_{\min}}^{b_{\max}} \frac{db}{b} = \frac{16\pi}{3}\frac{Z^2 e^6 n_e n_Z}{m^2 c^3 v} \ln\frac{b_{\max}}{b_{\min}} . \tag{2.24} \]
Note also that this result applies to a single particle velocity v, so we should integrate
over v to include all the velocities present in the distribution. But before doing
so, we will define the limits of integration of the expression (2.24) using physical
considerations [3].
The value of bmax stems from the fact that the relevant interactions have impact
parameters corresponding to ν < νmax (Fig. 2.9 right). We thus have bmax < v/4ν. On
the other hand, the minimum value of the impact parameter depends on the nature of
the collision, since it may correspond to the classical domain or even be determined
by Quantum Physics. In the first case, the validity of Newton’s law is guaranteed
by the condition (Ze²/mb²)(2b/v) < v, which implies b_min,C ≥ 2Ze²/mv², where the subscript C reminds us that we are considering a classical scenario. But if the collision is highly energetic, the classical condition may not apply. In this case the closest approach will be given by the uncertainty relation Δp Δx ≥ ℏ, where the uncertainty in the position must be identified with the order of the impact parameter b. Thus, there is a quantum limit b_min,Q given by b_min,Q ≥ ℏ/mv. The presence of the Planck constant is a reminder of this situation, since this last b_min would be zero if the quantum of action were zero.
In most real cases in Astrophysics, electrons in the cloud have a classical velocity
distribution, where we know that the characteristic velocity depends on the square root of the temperature according to v = √(3k_B T/m). In this case the quotient in the logarithm of (2.24) can be simplified, and results to a good approximation in b_max/b_min ≈ (137/Zc)√(3k_B T/m). This numerical factor is usually built into the definition of the Gaunt factor, g_ff = (√3/π) ln(b_max/b_min), a quantity
always close to unity in free–free processes (with subscript ff). But to obtain the
functional dependence of the emission, it is not enough to consider the typical value
of the speed of the electrons, since there are collisions of electrons in the extremes
of the distribution that can contribute in an important way. We then need to consider
the whole distribution, which in the classical case is simply
\[ f(v)\,dv = 4\pi \left(\frac{m}{2\pi k_B T}\right)^{3/2} \exp\left(-\frac{m v^2}{2 k_B T}\right) v^2\, dv . \tag{2.25} \]
We see that electrons with energies much greater than the average kB T suffer an
exponential suppression, that is, they are progressively less present in the distribution.
This expression should replace n e , implicitly assumed to be monoenergetic in (2.24),
and it should be integrated to evaluate the contribution of all electrons. The result,
after inserting the numerical values of the physical constants, etc., is
\[ I = 6.8\times10^{-38}\, T^{-1/2} \exp\left(-\frac{h\nu}{k_B T}\right) n_e n_Z Z^2\, g_{\rm ff}(\nu, T)\ \ {\rm erg\,s^{-1}\,cm^{-3}\,Hz^{-1}} . \tag{2.26} \]
We see that in the final result the factor corresponding to the power of a single electron
(T −1/2 ) is multiplied by Planck’s cutoff exp(−hν/kB T ), and also by the densities of
electrons and ions (assumed spatially homogeneous).
An electron and ion plasma that emits bremsstrahlung radiation has a character-
istic cooling time in which it loses its energy E precisely by emitting the radiation.
This time is
\[ \tau = \frac{E}{I} = \frac{6\times10^{3}}{n_e\, \bar g_{\rm ff}}\, T^{1/2}\ {\rm yr} . \tag{2.27} \]
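The two results (2.26) and (2.27) are easy to evaluate numerically. The plasma parameters below (T = 10⁸ K, n_e = n_Z = 10⁻³ cm⁻³, g_ff = 1.2, roughly cluster-like) are assumed for illustration only:

import math

K_B = 1.381e-16   # erg/K
H = 6.626e-27     # erg s

def ff_emissivity(nu_hz, T, ne, nz, Z=1.0, gff=1.2):
    # Free-free emissivity, Eq. (2.26), in erg s^-1 cm^-3 Hz^-1.
    return 6.8e-38 * T**-0.5 * math.exp(-H * nu_hz / (K_B * T)) * ne * nz * Z**2 * gff

def ff_cooling_time_yr(T, ne, gff=1.2):
    # Bremsstrahlung cooling time, Eq. (2.27).
    return 6e3 * math.sqrt(T) / (ne * gff)

print(ff_emissivity(1e18, 1e8, 1e-3, 1e-3))        # emissivity in the X-ray band
print(f"{ff_cooling_time_yr(1e8, 1e-3):.1e} yr")   # ~5e10 yr: such plasmas stay hot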
We now turn to the radiation emitted by charges moving in a magnetic field. A particle of charge q and velocity v in electric and magnetic fields E and B feels the Lorentz force
\[ F = qE + q(v\times B) . \tag{2.28} \]
It is evident from the presence of the vector product in the second term that the magnetic force is always perpendicular to both B and the velocity v, and thus does no work; it deviates the trajectory, and in this sense accelerates the particle by changing its direction.
To better understand this behavior we can simplify the situation by ignoring the
effect of the electric field, i.e., setting E = 0. This corresponds exactly to astrophysi-
cal situations, where it is almost impossible to generate a non-zero electric field (free
charges would quench any such field almost instantaneously). We also assume that
the magnetic field is uniform in the direction of the z axis (Fig. 2.12). For strong
enough fields B, the particle paths are spirals, and the radiation produced at the expense of the particles' energy in the B field emerges tangentially to the paths. This physical charac-
teristic can be used to generate synchrotron radiation in the laboratory, by producing
fields that confine electrons to move around a ring. Several experiments can then be
mounted in tangential tunnels that make use of the emerging radiation (Fig. 2.12b).
Fig. 2.12 Trajectories of an electron in a constant magnetic field in the z direction (a). Synchrotron
accelerator (b). Note that the trajectories are circles as long as the momentum of the electron in
the z direction is zero, which is guaranteed by the experimental setup. The tunnels contain targets
of interest that are radiated by the synchrotron radiation discussed here. The world-class SIRIUS
machine was inaugurated in Campinas (Brazil) in November 2018
We are interested in knowing the radiated power and the emerging radiation spec-
trum. We start by studying the motion of the charged particle in the B field, which
satisfies the (relativistic) equation
\[ \frac{d(m_0 \gamma v)}{dt} = qE + q(v\times B) . \tag{2.29} \]
This is analogous to Newton's equation, but with the presence of the Lorentz factor γ = 1/√(1 − (v/c)²). Note that (2.29) is expressed in SI, not cgs units. As already pointed out, the particle has helical motion (Fig. 2.12a) with fixed pitch angle θ₀. The cyclotron frequency, the angular frequency of the gyration, is then
\[ \omega_g = \frac{|q| B}{\gamma m_0} . \tag{2.30} \]
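A small numerical sketch (not from the text) of (2.30) in SI units; the magnetic field value is an assumed, interstellar-like figure:

E_CHARGE = 1.602e-19   # C
M_E = 9.109e-31        # kg

def omega_g(B_tesla, gamma=1.0, m0=M_E, q=E_CHARGE):
    # Gyrofrequency, Eq. (2.30).
    return q * B_tesla / (gamma * m0)

# Electron in B ~ 1 microgauss = 1e-10 T:
print(omega_g(1e-10))              # ~18 rad/s
print(omega_g(1e-10, gamma=1e4))   # an ultra-relativistic electron gyrates more slowly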
The radiated power follows from the relativistic generalization of the Larmor formula:
\[ -\frac{dE}{dt} = \frac{\gamma^4 e^2}{6\pi\epsilon_0 c^3}\, |a_\perp|^2 = \frac{\gamma^2 e^4 B^2 v^2}{6\pi\epsilon_0 m_e^2 c^3} \sin^2\theta_0 . \tag{2.31} \]
If we define the magnetic energy density U_mag = B²/2μ₀ and consider the non-relativistic limit γ → 1, (2.31) becomes
Fig. 2.13 Angular pattern of synchrotron radiation in the low energy limit (a) and the ultra-
relativistic limit (b)
\[ -\frac{dE}{dt} = 2\sigma_T c\, U_{\rm mag} \frac{v^2}{c^2} \sin^2\theta_0 = 2\frac{\sigma_T}{c} U_{\rm mag} v_\perp^2 . \tag{2.32} \]
This is known as the cyclotron power (low energies). In the ultra-relativistic limit,
on the other hand, the expression becomes
\[ -\frac{dE}{dt} = 2\sigma_T \gamma^2 c\, U_{\rm mag} \frac{v^2}{c^2} \sin^2\theta_0 , \tag{2.33} \]
as a function of the pitch angle of the electrons. The angular pattern of the radiation is
very much affected by the ultra-relativistic motion as compared with the low-energy
limit (compare Figs. 2.13a,b).
Instead of considering an individual pitch angle, we can take an angular average over the momenta with the distribution p(θ₀)dθ₀ = (1/2) sin θ₀ dθ₀, and write the power as
\[ -\frac{dE}{dt} = \frac{4}{3}\sigma_T \gamma^2 c\, U_{\rm mag} \frac{v^2}{c^2} . \tag{2.34} \]
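From (2.34) one can also estimate a synchrotron cooling time, τ ≈ γm_ec²/(−dE/dt). The sketch below (not from the text) assumes SI units and a Crab-like field of B ~ 10⁻⁸ T (100 μG):

import math

SIGMA_T = 6.652e-29    # Thomson cross-section, m^2
MU_0 = 4e-7 * math.pi
C = 2.998e8            # m/s
ME_C2_J = 8.187e-14    # electron rest energy, J

def sync_power(gamma, B_tesla):
    # Pitch-angle-averaged power, Eq. (2.34), ultra-relativistic limit v ~ c.
    u_mag = B_tesla**2 / (2 * MU_0)
    return (4/3) * SIGMA_T * C * gamma**2 * u_mag

def cooling_time_s(gamma, B_tesla):
    return gamma * ME_C2_J / sync_power(gamma, B_tesla)

print(f"{cooling_time_s(1e6, 1e-8):.1e} s")   # ~8e10 s (~2500 yr), of the order of the Crab's age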
The spectrum of each electron thus has a peak at ω = ω_g and increasingly wide harmonics, until the structure fades into a continuous envelope. This situation is
shown in Fig. 2.14.
It is now clear how to proceed to obtain the total emission from a population: if we
have the electron density n(E, r ), the spectral density (total intensity per frequency
interval) results from integrating spatially and in energy over all the contributions:
\[ \frac{dI(\omega/\omega_g)}{d\omega} = \int_0^{E_{\max}}\!\! \int_0^{R} \left(-\frac{dE}{dt}\right) n(E, r)\, dE\, dr . \tag{2.35} \]
An important and quite common case is that of an astrophysical source that injects
electrons, accelerated by some mechanism, with a power-law energy distribution, that is, n(E, r) ∝ E^{−Γ}. Equation (2.35) can be immediately integrated to show that I(ω) ∝ ω^{−α}, with α = (Γ − 1)/2. Observation of such a radiation distribution
Fig. 2.14 Main peak (the maximum on the left) and harmonics of the synchrotron emission due to
an electron. The full line shows the total emission
Fig. 2.15 Crab Nebula (left) and the spectral energy distribution (right), showing the regions where
the synchrotron radiation is evidenced by its power law form [9]. Left credit: G. Dubner (IAFE,
CONICET-University of Buenos Aires) et al.; NRAO/AUI/NSF; A. Loll et al.; T. Temim et al.; F.
Seward et al.; Chandra/CXC; Spitzer/JPL-Caltech; XMM-Newton/ESA; and Hubble/STScI. Right:
© AAS. Reproduced with permission
shows immediately that the population of injected electrons is not thermalized, since
only a power law produces an I (ω) of this form. Hence, the source is transpar-
ent to the passage of electrons, which never interact enough to achieve a thermal
distribution.
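As a trivial numerical note (not from the text), the index relation can be encoded directly; the injection value Γ = 2.2 used below is a typical shock-acceleration figure:

def spectral_index(Gamma):
    # Synchrotron from n(E) ~ E^-Gamma gives I(omega) ~ omega^-alpha.
    return (Gamma - 1.0) / 2.0

print(spectral_index(2.2))   # alpha = 0.6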
The best known case is probably the Crab Nebula (Fig. 2.15 left), where a young and energetic pulsar injects electrons into the environment, which in turn produce synchrotron emission as they move through the enormous magnetic field of the pulsar.
A closely related process is the so-called curvature radiation, in which the charge
trajectories bend and then radiate for the same physical reasons (in fact, in the extreme
relativistic limit the composite term synchro-curvature is found, in which both con-
tributions occur together). We will not deal with curvature radiation here, and refer
the reader to Longair’s book for further discussion [3].
In the first half of the 20th century, the Russian physicist P. Čerenkov studied a
phenomenon that gave rise to the radiation that now bears his name. Čerenkov realized
that, when a particle moves through a material medium (water, plastic, etc.), it can
travel at a greater speed than light in that medium (but still less than c in vacuum).
It is enough to remember that the refractive index is defined as the quotient of the speed of light in vacuum and its speed in the medium in which it propagates. Thus,
and in a manner completely analogous to the formation of a sonic shock wave (very
common on the sea surface, for example), the resulting wave fronts at each point
must combine to produce a shock wave front, and the energy transferred through it to
the medium in the form of the excitation of the molecules produces electromagnetic
radiation. The analogy is shown in Fig. 2.16.
Fig. 2.16 Sonic analogue of Čerenkov radiation and its basic geometry. Left: Boat traveling at a
speed greater than the speed of sound in water. The shock front formed by the combination of the
spherical wave fronts along the path is clearly visible. Right: Charged particle (at the lower vertex)
traveling at a speed greater than the speed of light in the medium. The combined wave fronts excite
the molecules that radiate when de-excited and produce the Čerenkov light
In a medium with refractive index n, the dispersion relation of light reads
\[ \omega = 2\pi\nu = \frac{2\pi (c/n)}{\lambda} = k\,\frac{c}{n} . \tag{2.36} \]
As the refractive index of water is n ≈ 1.3, we see that this change in the dispersion relation with respect to the vacuum one, ω = kc, allows us to infer the possibility of this kind of radiation.
The characterization of the so-called Čerenkov cone is simple and purely geomet-
rical (Fig. 2.17). If the velocity of the charged particle is v, the distance traveled in an interval Δt is simply l_part = v Δt. In the same interval, the spherical front from the point where the particle was originally located has radius l_light = (c/n) Δt. Thus, the angle θ_C of the cone opening is
\[ \cos\theta_C = \frac{(c/n)\,\Delta t}{v\,\Delta t} = \frac{1}{\beta n} . \tag{2.37} \]
We see that there is a physical requirement for this cone to form: as cos θ_C cannot exceed unity, the condition β ≥ 1/n must necessarily be satisfied. This is the kinematic condition (imposed on the speed of the particle) to observe Čerenkov radiation.
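A minimal sketch (not from the text) of the kinematics in (2.37), with water (n = 1.33) as the assumed medium:

import math

def cherenkov_angle_deg(beta, n=1.33):
    # Eq. (2.37): cone opening angle; returns None below threshold.
    arg = 1.0 / (beta * n)
    if arg > 1.0:
        return None
    return math.degrees(math.acos(arg))

# The threshold in water is beta = 1/1.33 ~ 0.75; an ultra-relativistic particle
# radiates at the maximum angle of about 41 degrees:
print(cherenkov_angle_deg(0.999))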
The intensity of the radiation depends on the charge of the radiating particle and on the geometry determined above, through the Čerenkov angle that encodes the properties of the medium. Using these data, we can calculate the number of photons emitted per unit wavelength and per unit length along the path:
\[ \frac{d^2 N}{d\lambda\, dx} = \frac{2\pi\alpha Z^2}{\lambda^2} \sin^2\theta_C \propto \frac{1}{\lambda^2} . \tag{2.38} \]
Thus, the largest number of photons will be in the region of smaller λ, and we
therefore expect the color of the Čerenkov radiation to be in the visible blue band.
This expectation is confirmed in the image of Fig. 2.18.
Čerenkov radiation is an important tool nowadays for the construction of detectors.
For example, water tanks are used to measure the radiation produced by the passage
of muons in cosmic ray showers (as we will see in Chap. 12). One can then reconstruct
the energy and direction of arrival of the primaries.
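Integrating (2.38) over a wavelength band gives the photon yield per unit path, dN/dx = 2παZ² sin²θ_C (1/λ₁ − 1/λ₂), with Z the charge of the radiating particle. The band and angle below are assumed values for a fast singly-charged particle in water:

import math

ALPHA = 1 / 137.036

def photons_per_cm(sin2_theta, lam1_cm, lam2_cm, Z=1):
    # Frank-Tamm yield integrated between lam1 and lam2 (lam1 < lam2).
    return 2 * math.pi * ALPHA * Z**2 * sin2_theta * (1/lam1_cm - 1/lam2_cm)

# beta ~ 1 in water: sin^2(41 deg) ~ 0.43, band 300-600 nm:
print(photons_per_cm(0.43, 300e-7, 600e-7))   # ~330 photons per cm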
To conclude this chapter, we now refer to the presence of radiation at definite fre-
quencies, instead of spectral distributions, as was the case with each of the previous
processes. The simplest and most frequent case is that of electron–positron annihi-
lation lines, which is nothing other than the inverse of the pair production previously
treated. For this to occur, some astrophysical source must eject the positrons, since
the interstellar medium contains electrons in abundance.
What kind of mechanism can inject these positrons? A fairly common situation is
that of an accelerator (Fig. 2.19) that can accelerate protons to high enough energy.
Fig. 2.19 An astrophysical accelerator injecting protons in the medium, producing positrons in the
final state
The collision of these protons with surrounding material (nuclei) produces pions
in the final state. The neutral pions π0 give rise to gamma pairs (process shown
in the upper part of Fig. 2.19), but the positively charged pions π⁺ decay almost immediately into a positron and neutrinos (through an intermediate muon). The latter positrons are the candidates to
annihilate with electrons in the medium. In the case described, these positrons are
actually “grandchildren” of the injected protons, so their number and the number of
gammas must be correlated.
Figure 2.20 shows the electron–positron annihilation line observed from the direc-
tion of the center of the galaxy. The center of the line is practically at 511 keV, a value
that corresponds to annihilation with a momentum of approximately zero. There is
an interesting proposal for the origin of this line, besides injection by accelerators
(still to be identified): the positrons could result from the decay of a particle that
composes dark matter, if the latter happens to be unstable. There is no consensus or
proof of this hypothesis so far, but it is certainly one of the most interesting problems that remain to be solved in high-energy Astrophysics.
Another conspicuous and well identified source that produces positrons that sub-
sequently produce the annihilation line is the decay of the aluminum isotope 26 Al.
This isotope is produced in large amounts in explosive nucleosynthesis in super-
novas. After ≈ 106 yr, it decays sequentially into 26 Mg, producing a gamma-ray and
a positron that annihilates. The presence of the annihilation line is well known in old
SN remnants (see Chap. 6).
Electron–positron annihilation lines (and other lines in the spectra) do not always
appear with the “standard” shape and in the expected position, such as the line in
Fig. 2.20. There are several effects that can change the position and width of the line,
the most important for our purposes being the Doppler effect and the gravitational
redshift. The Doppler effect produces a broadening of the line and can produce a
shift in its “natural” position if the emitter is moving in the direction of the line of
sight. In the non-relativistic limit, the frequency difference Δν satisfies Δν/ν = v/c,
and can shift the line toward the red or the blue, depending on whether the emitting
material moves away or toward the observer, respectively. In the relativistic case
the expression is more complicated and there is also a transverse Doppler effect, in
Fig. 2.20 Left: Annihilation line at the center of the galaxy. Right: The same phenomenon in the
AMS-02 experiment [10]. The positron flux (red) far exceeds the production expected from protons interacting with the interstellar medium (green). Are there proton accelerators in this region or is it evidence of the
decay of dark matter in the galactic bulge? A viable model is shown on the right with a full black
line. Credit: AMS-02/CERN
which the velocity component perpendicular to the line of sight also produces a shift
in the frequency, although only a very small one.
Finally, gravitational redshift occurs when light escapes from a region where there
is a very strong gravitational field. In fact, photons lose energy through the action of
the field that “pulls” them in the central direction (Fig. 2.21). If emission occurs at
the source with frequency ν, the observer “at infinity” will detect a lower frequency
\[ \nu_\infty = \nu \left(1 - \frac{2 G M}{R c^2}\right)^{1/2} . \tag{2.39} \]
We see that measurements of frequency that differ from the “natural” value (for exam-
ple, 511 keV in the case of electron–positron annihilation) can reveal characteristics
of the object that produces the field (more specifically, the mass-to-radius ratio). It is
important to clarify that this displacement is not related to the cosmological redshift,
of a very different origin and interpretation.
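As an illustrative evaluation (not from the text) of (2.39), for the 511 keV line emitted at the surface of a neutron star with assumed M = 1.4 M⊙ and R = 12 km:

import math

G = 6.674e-11     # SI
C = 2.998e8
M_SUN = 1.989e30  # kg

def observed_energy_kev(e_kev, M_kg, R_m):
    # Gravitational redshift, Eq. (2.39), applied to a photon energy.
    return e_kev * math.sqrt(1.0 - 2 * G * M_kg / (R_m * C**2))

print(observed_energy_kev(511.0, 1.4 * M_SUN, 12e3))   # ~414 keV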
References
1. S.F. Mason, A History of the Sciences (Collier Books, New York, 1973)
2. A. Pais, Subtle Is the Lord: The Science and the Life of Albert Einstein (Oxford University
Press, Oxford, 2005)
3. M. Longair, High-Energy Astrophysics (Cambridge University Press, Cambridge, 2011)
4. R. Eisberg, R. Resnick, Quantum Physics (Wiley, Hoboken, NJ, 2004)
5. H.E.S.S. Collaboration, Acceleration of peta-electronvolt protons in the galactic centre. Nature
531, 476 (2016)
6. W.R. Hendee, E.R. Ritenour, Medical Imaging Physics (Mosby-Year Book, St. Louis, 1992)
7. MIT OpenCourseWare, https://ocw.mit.edu/courses/nuclear-engineering/22-101-applied-
nuclear-physics-fall-2006/
8. U.G. Briel et al., A mosaic of the Coma cluster of galaxies with XMM-Newton. Astron.
Astrophys. 365, L60 (2001)
9. F. Aharonian et al., The Crab Nebula and Pulsar between 500 GeV and 80 TeV: Observations
with the HEGRA stereoscopic air Cerenkov telescopes. Astrophys. J. 614, 897 (2004)
10. AMS02, https://ams02.space/de/node/474
Chapter 3
Detection and Instrumentation
in High-Energy Astrophysics
Sources in High-Energy Astrophysics can be studied in three different domains, all of which reveal their nature and physics through observations. When we say “domains” we are referring to the observed quantities, and more particularly, to how these depend on the variable that characterizes them (position, energy, or time). The basic problems and techniques of X-ray and gamma-ray instrumentation and their solutions are briefly presented here.
The temporal domain consists in the study of the variability of some quantity
measured as a function of time. Typically, the photon count is measured in some
frequency range (for example, the whole band covered by a detector, or some more
specific channel of it), and the graph as a function of time reveals the temporal history
of the emission. By analogy with optical astronomy, this type of graph is called a
light curve (although the “light” here typically refers to X-rays or gamma rays). An
example of this type of observation is shown in Fig. 3.1 (right).
The light curve has the potential, among other possibilities, to reveal the physical
size of the actual emission zone. The reasoning is as follows: if τmin is the smallest
time scale observed in the light curve, the region of emission is limited to a size
R ≤ c × τmin , otherwise we would be in the presence of an emission that violates
causality. We can say that, within a scale R, the smaller elements of the source can be
in causal contact to produce the emission, but not those outside this scale. Thus, for
example, a variability of, say, 0.1 ms implies a maximum size for the emission region of about 30 km. We see that this points directly to a compact object as a source of
radiation, although the whole object may not necessarily participate in this emission.
There may be, for example, just a “hot spot” emitting, or some other limited region
of the system.
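The causality bound is a one-line computation (sketch, not from the text):

def max_size_km(tau_min_s):
    # R <= c * tau_min, with c = 3e5 km/s.
    return 3.0e5 * tau_min_s

print(max_size_km(1e-4))   # 0.1 ms of variability -> at most ~30 km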
Another important characteristic of sources is revealed by their spatial location
(spatial domain). This is particularly relevant when studying binary systems, since
it is important to know which component is emitting, whether it is the donor (secondary), the receiver (primary), or the gas that passes from one to the other (accretion disk/shock), and similarly in many other real sources.
Fig. 3.1 Light curve of a variable object (in this case, a gamma-ray burst) on the right, and inference of the physical size of the source (left, see text)
Fig. 3.2 Left: The source Cyg X-1, obtained with the HERO experiment on board a balloon with an angular resolution of a few minutes of arc (arcmin). Credit: NASA/Marshall Space Flight Center. Right: The same source observed with the Chandra satellite, reproduced on approximately the same scale, where the resolution reaches about 1 arcsec. Credit: NASA/Chandra X-ray Observatory
The better the spatial
resolution of the source, i.e., the ability to locate photons spatially, the better we can
understand the geometry of the source (although for very compact or very distant
sources this separation may be impossible). With some license, we could equate the
spatial resolution to the “sharpness” of an optical image. In the following, we will dis-
cuss the difficulties of focusing when the photon energy is very high, but for now it is
enough to point out that the first X-ray detectors only saw “spots” several degrees across,
while the most modern observatories work with a resolution comparable to optical
telescopes, but on X-rays (of order one second of arc, or arcsec). An example of the
substantial improvement in the spatial resolution over time of X-ray observations is
shown in Fig. 3.2.
The last domain is the spectral domain, represented graphically with the photon
energy on the abscissa. This is used to study the way the source distributes the emitted
energy, data which can be linked to physical characteristics through models. Here it is the spectral resolution (R = E/ΔE) that matters. Hence, the most convenient thing is to be able to determine the energies of the detected photons precisely. Any particularity
in the spectrum (for example, the presence of spectral lines) can be used to make
a diagnosis. In addition, one may be able to separate various emission components
associated with different processes, perhaps originating in different regions of the
source. Figure 3.3 shows two examples of resolved spectra from known sources.
Instruments are always built to detect a range of energies, with good time res-
olution (usually determined by on-board electronics), and an appropriate angular
resolution, although the latter presents some difficulties that we will discuss below.
Another characteristic of great importance is the collecting area of the instrument,
since sources are often weak in the X and γ bands, and the limited numbers of
photons arriving from them must be used as efficiently as possible. This makes it
important to have large collecting areas, but taking into account compromises with
other intrinsic characteristics of the instrument (see below).
One of the most important advances in optical astronomy, developed and popularized
very rapidly since 1970, is the solid state detector known as a charge-coupled device
or CCD [2]. The device consists of a large number of photon-sensitive zones called
pixels, used to form a spatially accurate image of a region. It can also obtain the
distribution of incoming photon energies and perform spectroscopic analysis that
is essential to understanding the physics of the emission. Today most people are
familiar with pixels because they actually own digital television sets and cameras.
Although these commercial CCDs are much less reliable than the scientific ones and
have many more defects, they work in the same way.
CCDs are made of a semiconductor (usually silicon), while the pixels are deter-
mined by the position of the electrodes above them, as indicated by I01, I02, and
I03 in Fig. 3.4. A positive voltage is applied to the electrodes (as shown for I02 in
the figure), and the resulting electric potential attracts electrons to the area below
the electrode (little blue balls in the figure), while the positively charged holes are
repelled (little red balls in the figure). Thus a potential well is generated where the
electrons produced by the incoming photons accumulate. As more photons arrive, the
well accumulates electrons until it fills up completely. It is important not to exceed
this limit. The signal must be integrated (i.e., allowed to accumulate), but it must
not exceed the capacity of the well, because the image to be generated would be
distorted (astronomers speak of a saturated image). The most common type of CCD
in astronomy has 1024 × 1024 pixels, although special configurations can be made.
Taking into account the fact that an ordinary pixel is around 10–20 μm, the physical
size is about 2 cm2 . Depending on the application, the pixels can be made much
larger and associated in large mosaics.
Each CCD pixel is affected by three electrodes (see Fig. 3.4). We have already
talked about the need to create the potential well, but the other two are needed to
transfer the accumulated charge out of the device. For this, each electrode is kept at
high and low voltage alternatively, to transfer the charge to the neighboring pixel in
row or column mode depending on how the electrodes are oriented. For this reason
(the transfer of charge from one pixel to the next and so on until the end), it is said that
the charges are coupled, and hence the name CCD. The final reading of each pixel
is taken by means of an amplifier that converts the accumulated charge into voltage, typically a few μV for each electron. In this way, even a signal of the order of a volt corresponds to the reading of about 100,000 electrons in each pixel. A CCD camera thus consists not
only of the CCD, but also of the associated electronics which reads all the pixels,
removes noise (electrons that have nothing to do with the source), and digitizes the
signal (the CCD is in fact an analog device), and software to analyze the data and
create images from them.
An important parameter in the detections is the so-called quantum efficiency, that
is, how many photons are actually detected for every 100 that arrive. The human
eye is a very inefficient detector, capturing only around 20% of the incident photons.
Photographic films are even worse, with efficiency around 10%. But CCDs can easily
achieve efficiencies of 80% or more (depending on the measured wavelength). This
characteristic, together with its mechanical robustness and simplicity of operation,
justifies its enormous and rapid acceptance in astronomy, even more so when consid-
ering sources that only emit a small number of photons that have to be exploited to
the maximum.
Optical CCDs are generally sensitive to photons of the whole visible spectrum
and some part of the infrared. But it is possible to manipulate the construction to
extend the sensitivity to shorter wavelengths into UV and X-rays. It is precisely this
feature that interests us.
To obtain complete data from an observed region it is important to know the range
of source intensities (i.e., the faintest and brightest) that can be recorded. The detec-
tion threshold is usually placed some 3σ above the device noise (in the statistical
sense of signal significance). As already pointed out, the maximum intensity is that
which “fills” the potential well without saturating it, typically around 100 000 elec-
trons. The observation range is defined, taking into account the fact that in the very
process of reading the pixels an irreducible noise is generated (generally a small
count of 3–4 electrons). The quotient of the maximum to the minimum counts is
called the dynamic range.
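With the representative numbers just quoted, the dynamic range follows directly (sketch, not from the text):

import math

full_well = 100_000   # electrons (saturation)
read_noise = 4        # electrons (irreducible read noise)

dyn_range = full_well / read_noise
print(dyn_range, f"({20 * math.log10(dyn_range):.0f} dB)")   # 25000 (~88 dB)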
The whole discussion so far has remained quite general, but our specific interest
in the higher energies deserves special consideration. Optical CCDs have a very
desirable feature for any type of electronics: they are linear devices, where one
incident photon produces one electron. Thus, the reading is directly proportional to
the number of incident photons. However, when we deal with X-rays, this is no longer
true. The incident photons are much more energetic and each produces a multiplicity
of electrons, in greater numbers the higher the energy (typically between 100 and
1000 electrons). CCDs should thus operate in the region of non-linear response,
which is not a disaster, but requires a more sophisticated treatment to produce the
final images, since the intensity is no longer directly proportional to the number of
incident photons. Another difference is that, although the efficiency increases with
pixel size, it is not possible to take advantage of this trend for X-rays, since the
charge inevitably begins to be deposited in more than one pixel, thus losing spectral
resolution (even though methods can be applied to recover the maximum amount of
information).
We see that the spectral resolution R = E/ΔE and also the temporal resolution Δt suffer from limitations when using CCDs. Additionally, the type of telescope is
important for detection, as we shall now discuss.
When dealing with optical telescopes there are several ways to focus the light, two of which are shown in Fig. 3.5. In fact, the Newtonian and Cassegrain foci of reflecting telescopes are
widely used, and due to the relative scale of the wavelength of the observed light,
there are no technological problems that prevent an accurate focus.
However, in the treatment of X-ray telescopes this issue takes on much greater
importance, since the wavelength is much shorter and focusing is more difficult. In
other words, an image of any X-ray object will be “out of focus” unless we can build
telescopes that solve this focusing problem. For these purposes, it was necessary
to explore the basic physical properties of hard photons and build new designs that
allow efficient imaging.
The first physical property required is Snell's law of refraction. An incident ray arriving from the vacuum, with refractive index n₁, onto a reflective material that has refractive index n₂ is subject to a geometric deviation in the transmission that satisfies
\[ \sin\theta_T = \frac{n_1}{n_2} \sin\theta_i , \tag{3.1} \]
where θT and θi are the transmission and incidence angles, respectively. Since n 1 > n 2
because photons arrive from the vacuum onto the reflective material, there will be
real values for transmission only in the case of angles θi < arcsin(n 2 /n 1 ). Choosing
the reflective material, e.g., gold with n 2 = 0.99, there will only be transmission
for angles θi < 81.9◦ . If incidence occurs at a higher angle, Snell’s law will not be
satisfied and there will be no transmission (see Fig. 3.6).
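A numerical sketch (not from the text) of this total external reflection condition, with the n₂ = 0.99 value used above for gold:

import math

def critical_angle_deg(n2, n1=1.0):
    # Maximum incidence angle allowing transmission: theta_i < arcsin(n2/n1).
    return math.degrees(math.asin(n2 / n1))

theta = critical_angle_deg(0.99)
print(theta, "->", 90.0 - theta, "deg of grazing incidence")   # 81.9 deg, i.e., ~8 deg grazing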
This feature is exploited by building a grazing incidence reflector arrangement,
hereafter denoted by GI (Fig. 3.7). In each set of reflectors, light from the source is
deflected until it can focus on the detector [5]. In fact, there are several complications
in this design that we will not address here. The important thing is that this shows
how the X-rays can be focused.
Fig. 3.6 Reflectance as a function of energy (horizontal axis) for three different values of the
incidence angle
Fig. 3.7 Basic scheme for a design using a GI setup for X-ray telescopes
There is a second popular method for focusing X-rays, using a more conventional
incidence arrangement. This consists in stacking 50–500 alternating layers of plat-
inum and carbon or tungsten and silicon, or any other combination of high-Z layers
interspersed with low-Z layers. The aim is to take advantage of the constructive
interference of the fronts that satisfy the Bragg condition
\[ 2d \sin\theta = n\lambda , \tag{3.2} \]
where d is the separation of the ions in the solid lattice, θ the angle of incidence,
λ the wavelength of the radiation, and n an integer. The central idea is to get the
radiation beams to acquire a phase difference of 2π and thus add up the intensities
when reflected in the multi-layer “sandwich” (Fig. 3.8). As we have already said,
the incidence need not be shallow, and with careful construction the reflectance can
exceed 80%. This solution is referred to as a normal incidence (NI) setup.
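As a sketch (not from the text) of the Bragg condition (3.2), with an assumed bilayer spacing of d = 3 nm and a 0.5 nm (~2.5 keV) X-ray:

import math

def bragg_angle_deg(lam_nm, d_nm, order=1):
    # First-order (n = 1) Bragg reflection angle; None if unreachable.
    s = order * lam_nm / (2 * d_nm)
    return math.degrees(math.asin(s)) if s <= 1 else None

print(bragg_angle_deg(0.5, 3.0))   # ~4.8 deg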
Finally, when the energy of the incident photons increases still further, there will
be no way to focus with these settings. Thus, photons with energies above 100 keV
are detected using the scintillation technique. Scintillation (Fig. 3.9) consists in
detecting the emission of secondary light as a result of the interaction of the high-
energy photons from the sources when they are absorbed in a crystal or liquid.
Photomultiplier cells convert the emitted light into a current that can be read and
translated into the primary photon energy.
This technique can be used to observe photons of up to 20–30 MeV (see next
section). But far above this value, in the TeV range or higher, one must work with the
Čerenkov technique. We have already seen in Chap. 2 that the passage of a charged
particle through a material medium excites the molecules in that medium, whereupon
they decay and emit light. But a photon also produces a similar effect, although the
spatial pattern of radiation is different and can be distinguished. Thus, the Čerenkov
effect provides a tool for studying very high energy gamma sources such as AGNs
(Chap. 8).
The beginning of X-ray Astrophysics (the “softest” photons in the high energy region)
had to await the development of space technology. Shortly after the launch of Sputnik
in 1957, a group of scientists led by Riccardo Giacconi launched an X-ray detector
on board an Aerobee 150 rocket and discovered the first X-ray source outside the
Solar System, called Sco X-1. The angular resolution was so poor (worse than 20◦ )
that it took time to identify the “spot” in X-rays with the constellation of Scorpius.
Today we know that it is a neutron star with a low mass companion (LMXB), the first
example showing how the sky is populated with high energy sources. A chronology
of the most relevant missions can be found at the website [6].
A few years later in 1970, the first satellite dedicated to X-ray sky exploration
was launched. This was the Uhuru mission (meaning “freedom” in Swahili), with
an effective area of only 0.084 m2 and coverage in the 2–20 keV band, capable of a
spatial resolution of about 0.5◦ . Uhuru identified more than 300 sources, among them
Cyg X-1, the first black hole candidate in our galaxy. Over time, several missions have
been launched, some of them still in operation, and the exploration of X-ray sources
has continued on a sustained basis. Table 3.1 shows some of the most important space
missions in X-ray astrophysics, along with their most important features (collecting
area, spatial resolution, and energy band).
As previously mentioned, one must use scintillators to study still higher energies.
The Compton effect described above was the basis of the COMPTEL instrument,
containing liquid scintillators above a NaI crystal. The successive interactions of
the incident photons until they are finally absorbed can be used to determine the
direction of their arrival without having to actually focus them in the conventional
sense. The coded mask technique was developed to improve the spatial resolution,
which is made much more difficult by the fact that gamma rays cannot be focused.
This is an advanced variant of the camera obscura used in the Renaissance. It was
first proposed in the 1960s. The photons pass through a series of holes of known
pattern in a mask, thus forming a “shadow” in the plane of the scintillation detector
(Fig. 3.10). Algorithms process this “shadow” image, using the fact that its exact
geometry is known, and the image can be very substantially improved. This method
is used to image sources in gamma energies [7].
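To make the idea concrete, here is a one-dimensional toy model (entirely illustrative; not the reconstruction pipeline of any actual mission). A random mask casts shifted shadows of two point sources on the detector, and correlating the counts with a "balanced" copy of the mask recovers the source positions:

import numpy as np

rng = np.random.default_rng(1)
N = 101
mask = rng.integers(0, 2, N)                 # 1 = open element, 0 = opaque
sky = np.zeros(N); sky[20] = 100.0; sky[60] = 50.0

# Each sky direction projects a shifted copy of the mask pattern:
det = sum(flux * np.roll(mask, i) for i, flux in enumerate(sky))

# Decode by correlating with the balanced mask (+1 open, -1 closed):
w = 2 * mask - 1
rec = np.array([np.dot(det, np.roll(w, i)) for i in range(N)])
print(int(rec.argmax()))   # should recover index 20, the brighter source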
The most recent example of this type of instrument, still in operation, is the IBIS
imager in the INTEGRAL mission, which achieved an angular resolution of about
12 arcmin for energies of around 10 MeV. This makes a huge difference when it
comes to identifying sources individually, although an even higher resolution would
be desirable (Fig. 3.11).
Finally, for even higher energies, the technique must be changed once again to take
advantage of the Čerenkov effect, detecting the radiation produced by the passage of
particles through the atmosphere or water.
Fig. 3.12 The MAGIC telescopes in the Canary Islands, built in 2004 to study cosmic rays at the highest energies [9]. Credit: Giovanni Ceribella, Max Planck Institute for Physics, Munich
The MAGIC telescopes shown in Fig. 3.12
exemplify the use of this type of instrumentation, operating between 25 GeV and
30 TeV, with a large collecting area (and the addition of a second, similar instrument
for stereoscopic data acquisition). Essentially all energies of this level can be detected
using the Čerenkov effect, and in Chap. 12 we will see how it can be combined with
detectors of another type, in fact, water tanks, to study cosmic rays.
We have addressed the main features of photon radiation arriving from high-
energy sources, but in the 21st century there are new probes of non-electromagnetic
character that can reveal the high energy universe and its sources. In Chap. 9, we shall
discuss neutrino astronomy and in Chap. 10 the recent field of gravitational waves.
When combined with the “old” techniques of photon astronomy, these new signals
provide a very powerful integrated view of sources and phenomena. It can be said
that 21st century High-Energy Astrophysics is to a large extent a multimessenger
discipline, going well beyond the traditional realm of photon astronomy.
References
1. J.A. Carter, S.F. Sembay, Identifying XMM-Newton observations affected by solar wind charge
exchange—Part I. Astronomy and Astrophysics 489(2), 837–848 (2008)
2. D.H. Lumb et al., Charge coupled devices (CCDs) in X-ray astronomy. Experim. Astron. 2, 179
(1991)
3. J. Cottam, F. Paerels, M. Mendez, Gravitationally redshifted absorption lines in the X-ray burst
spectra of a neutron star. Nature 420, 51 (2002)
4. See, for instance, https://www.open.edu/openlearn/science-maths-technology/telescopes-and-
spectrographs/-content-section-1.4
5. P. Murdin (ed.), Encyclopedia of Astronomy and Astrophysics (Nature Publishing Group, Lon-
don, 2001)
6. https://heasarc.gsfc.nasa.gov/docs/heasarc/headates/heahistory.html
7. M.J. Cieślak, K.A.A. Gamage, R. Glover, Coded-aperture imaging systems: Past, present and
future development. A review, Radiation Measurements 92, 59 (2016)
8. Paizis et al., Proceedings of the 5th INTEGRAL Workshop, The INTEGRAL Universe (2004),
https://sci.esa.int/web/integral/-/37398-ibis-isgri-observations-of-the-galactic-centre-region
9. J. Cortina, Highlights of the MAGIC Telescopes, in 32nd International Cosmic Ray Conference
(2011)
Chapter 4
Stellar Evolution up to the Final Stages
The nature of the observed stars has been a subject of discussion and speculation since
the early days of civilization. Atomists Leucippus and Democritus thought that the
Milky Way was made of stars, which they considered too small to be distinguished
from one another. By the time of the Indian mathematician Aryabhata (5th century
A.D.), there existed in the East the notion that stars were, in fact, other suns. It would
have been immediately obvious that they would have to be at enormous distances for
this hypothesis to make sense. Other important speculations were formulated in the
West. For example, in Giordano Bruno’s writings, not only were the stars identified as
distant suns, but inhabited planetary systems accompanied them, putting the author
on a direct collision course with the Roman Catholic Church. What is certain is
that it was only in the early 19th century, with the works of W. Herschel and J. von
Fraunhofer, that the star = Sun identification was shown to be correct: the absorption
lines of several nearby stars were observed, revealing their kinship with the lines
observed in the solar spectrum. This chapter addresses the construction of stellar models and the important features of Stellar Evolution up to the final stages leading to explosions and compact object formation.
Although the nature of stars remained for many centuries on a speculative plane,
scientists from Classical Antiquity devoted themselves to their study. The first catalog
of stars created in the West was authored by the mathematician and astronomer
Hipparchus, and contained some 850 stars observable with the naked eye, as reproduced
in Ptolemy’s Almagest. Hipparchus and other later astronomers also noticed the
differences in brightness of the stars, and especially in their colors (Fig. 4.1). These
ancient observations and those recorded after the invention of the telescope in the
early 17th century led directly to the basic questions of stellar Astrophysics that
will be the subject of our discussion: Are stars “eternal”? What is their internal
constitution? How can these questions be linked with available observations? The
enormous development of the theory of Stellar Evolution throughout the 20th century
and the state of the art in this field will occupy the rest of this chapter and part of the
following chapters.
As already pointed out, the identification of lines in stellar spectra showed that stars
were objects of the same type as the Sun. The presence of known chemical elements
(H, C, O, etc.) and the study of spectral lines gives information about the outer stellar
region, while its inner composition remains undetermined because the radiation does
not carry information from the inner regions. But there are other ways to study stellar
structure, at least indirectly. For example, it is relatively easy to determine how much
energy is flowing from a star per unit frequency interval. In order to do this, it
is enough to use filters that let through photons in some chosen range, and then
count how many of them are in each wavelength interval. The total energy is easily
calculated with the aid of the relation E = hν. In general, and for “normal” stars
(the case of white dwarfs and others will be treated later), we find that there is a
band where the number of photons reaches a maximum, and that the spectrum has
approximately the shape of a black body spectrum. The black body, discussed by
G. Kirchhoff and others in the second half of the 19th century, is an idealization that
applies to a perfect absorber/emitter which presents the distribution of intensities as
a function of frequency shown in Fig. 4.2.
The colors of the stars correspond to the maximum of the distribution, since the
photons with this wavelength are the most numerous, provided that the star complies
with the black body idealization. The value of λmax moves to lower wavelength
values as the temperature increases. The lower the value of λmax (i.e., the higher the
frequency νmax ), the higher the temperature. Thus, the photons are said to be “harder”
(i.e., more energetic) for distributions where the effective temperature is higher. If we
restrict ourselves to the range marked V (visible) in Fig. 4.2, the stars must present
colors from red to blue, corresponding to temperatures between approximately 3800
and 10 000 K.
The next issue is the total energy emitted by the star, since we now have an
idea of how the photons are distributed. The black body emission problem remained
unsolved until the first years of the 20th century, as discussed in Chap. 2. In fact,
the functional shape of the curves in Fig. 4.2 corresponds to the expression (2.3),
a consequence of the discrete nature of light. Note, however, that the total flux that
emerges from a black body studied by G. Kirchhoff and others, i.e., the energy emitted
per unit time and per unit area, has a very simple form: the result is proportional to
the temperature to the fourth power, multiplied by a universal constant σ , and is
completely independent of the composition of the body:
\[ F = \sigma T^4 . \tag{4.1} \]
In particular, to calculate the total emerging flux it is enough to determine the stellar
temperature by finding the maximum emission. However, the flux is not the most
relevant quantity, since very different stars located at very different distances can lead
to the same flux. Thus, an independent estimate of the distance is essential to convert
the flux (relative) into luminosity (absolute), multiplying by the area of emission:
\[ L = 4\pi R^2 \times \sigma T^4 . \tag{4.2} \]
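A quick check (not from the text) of (4.2) with solar values, in cgs units:

import math

SIGMA = 5.67e-5      # erg cm^-2 s^-1 K^-4
R_SUN = 6.96e10      # cm
T_EFF = 5772.0       # K, solar effective temperature

L = 4 * math.pi * R_SUN**2 * SIGMA * T_EFF**4
print(f"{L:.2e} erg/s")   # ~3.8e33 erg/s, the solar luminosity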
Now we have assembled the basic framework that allowed E. Hertzsprung and H.N.
Russell to propose the diagram that bears their names and serves to classify the stars.
On the horizontal axis, we put the effective temperature of the emission, which is the
one in (4.1) and (4.2), indicating in which region the above-mentioned maximum
falls; and on the vertical axis, the luminosity (called power in Physics courses), a
measure of the total emitted energy which contains structural information about the star, viz., the radius R in (4.2), but which can be determined independently by measuring the flux and estimating the distance (the latter by various methods that feature different errors). Thus, the HR diagram in Fig. 4.3 can be constructed for all stars that emit as black bodies [2].
Fig. 4.3 Hertzsprung–Russell diagram built with data from the Gaia mission. Note that in addition to temperature and luminosity, we could have used other variables common in Astronomy, such as absolute magnitude and spectral class, which are also equivalent to the former. For historical reasons, astronomers put the temperature scale growing in the direction of the origin, that is, the lowest temperatures are progressively farther from the origin. The location of regions containing stars and those lacking them must be explained by the theory of Stellar Evolution which we will study below. Credit: ESA/Gaia/DPAC; ESA/Hipparcos
Inspection of the HR diagram shows a quite remarkable fact: there are populated
and empty regions. Moreover, the vast majority of stars are located in a strip that
goes from the top left to the bottom right. Of course, there has to be some important
reason for the existence of this strip.
We can go even further, without formulating any detailed description, with the following observation. If we draw a horizontal line indicating constant luminosity in the upper half of the diagram, we see that there are stars with very different temperatures that have the same luminosity. From the generic expression (4.1), we see that this is only possible if the radii R are also very different, and in inverse relation to the temperatures, because this is the only way to keep the product R²T⁴ constant. This
justifies the denomination of giants and supergiants, depending on the value of the
luminosity, found in the astronomical literature, and begs a theoretical explanation
to understand why star radii are so different, even for the same star in different stages
of its evolution. A vertical line Teff = const. also reveals that there are stars with
luminosity that differ by several orders of magnitude, but show the same temperature.
This is only possible if the radius R is much larger in the more luminous stars. Thus,
we see that there is a lot of information in the HR diagram and a lot of work to be
done to achieve a physical description that explains how it is populated.
Before proceeding, we would like to point out another fact of importance: in
Ancient Greece, Hipparchus saw and catalogued essentially the same stars observed
in the sky today. The fact that stars have not undergone significant changes in over
2000 years shows that they are highly stable. More precisely, it shows that there is a
very stable state of equilibrium that makes them last much longer than a few millennia.
When we observe the night sky we are in the presence of a kind of “snapshot” of
the star population, just as we could get of any human population in a very crowded
place, for example in a public park. In the same image we would have examples
of several human “evolutionary stages” (babies, young people, adults, and elderly
people) and our task would be analogous to the study of human biology, which
aims to understand the aging process. Fortunately, the laws of Physics apply to our
problem, as we will discuss below.
The previous observation regarding the state of equilibrium and the fact that it can
sustain stars for many millions of years leads to the question of the kind of equilibrium
we are talking about. If we consider the case of a mass held up by a spring on
the surface of the Earth (Fig. 4.4 right), the mechanical balance of the system is
guaranteed by the condition Fgrav = Fspring . But if we imagine now that the mass
(in the form of a small cube) is an element of a fluid, the equivalent expression
Fgrav = Fpress points to two important considerations: first, the gravitational field is
not “external” as in the case of the little cube on the Earth’s surface, but rather it is the
very distribution of fluid that produces the gravitation, whence a star is often called a
self-gravitating fluid; and second, the whole fluid distribution is also responsible for
the force that sustains the “little cube” fluid element, by producing a pressure that
balances the gravitation.
When dealing with a self-gravitating fluid, the mechanical balance of forces is
called a hydrostatic balance. Free of any other forces, it is well known that the
fluid will adopt a spherical form (to minimize its free energy). Thus, the hydrostatic
equilibrium equation can be obtained by considering concentric shells of thickness
dr , where there is a pressure difference P(r ) − P(r + dr ) between the base and the
top of any given shell. On the other hand, the shell is subject to the gravitational force
that pulls it towards the center of the star (Fig. 4.5).
Now we can use calculus to express the forces in a simple way: the force produced
by the pressure difference is
\[ P(r) - P(r + dr) \approx -\frac{\partial P}{\partial r}\, dr , \tag{4.3} \]
and the gravitational pull is
\[ F_{\rm grav} = -g(r)\rho(r)\, dr = -\frac{G M(r)}{r^2}\, \rho\, dr , \tag{4.4} \]
where it is clear that the local acceleration due to gravitation increases as one moves
outwards owing to the accumulation of shells within, and should be calculated using
the same density ρ of the fluid. The condition Fgrav = Fpress leads immediately to
\[ \frac{dP}{dr} = -\frac{G M(r)\rho(r)}{r^2} . \tag{4.5} \]
Similarly, the continuity of mass indicates that, between radius r and radius r + dr ,
there is a mass difference dM = M(r + dr) − M(r) = 4πr²ρ dr, i.e.,
dM/dr = 4πr²ρ(r) . (4.6)
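A quick numerical sketch shows the scale these two equations set. Integrating (4.5) in its Lagrangian form dP/dM = −GM/4πr⁴ from the surface inward, and using r ≤ R everywhere, gives the standard lower bound P_c > GM²/8πR⁴ on the central pressure; the snippet below (Python, standard CGS solar values) evaluates it:

```python
import math

# Lower bound on the central pressure from Eqs. (4.5)-(4.6):
# dP/dM = -G M / (4 pi r^4) together with r <= R gives
# P_c > G M^2 / (8 pi R^4).
G = 6.674e-8        # cm^3 g^-1 s^-2
M_sun = 1.989e33    # g
R_sun = 6.957e10    # cm

P_c_min = G * M_sun**2 / (8 * math.pi * R_sun**4)
print(f"P_c > {P_c_min:.2e} dyn/cm^2")
# ~4.5e14 dyn/cm^2; detailed solar models give a central pressure ~2e17
```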
The set of Eqs. (4.5) and (4.6) would be sufficient for the description if stars did not
generate energy, but we know that this is not what happens in general. Stars radiating
like black bodies for many millions of years need a source of energy, and this energy
Fig. 4.6 Binding energy per nucleon as a function of the mass number A. The path followed
by nuclear fusion is indicated on the left. Energy is gained when light nuclei merge, which is
the mechanism used by stars. The fission of very heavy elements also delivers energy, since the
fragments are more bound than the progenitor nucleus. This is the mechanism triggered in nuclear
fission bombs (see [3] for more detail)
is related to the pressure that prevents collapse. Where exactly does this pressure
come from?
A classical example with point particles shows us that it is possible to obtain
energy by binding two or more particles that are initially in free states. In fact, if we
suppose that two particles of masses m 1 and m 2 attract each other with Newton’s
gravitational force, they can form a bound state with total mass M given by
M = m₁ + m₂ − G m₁m₂/(c²r) < m₁ + m₂ . (4.7)
Therefore, the process of binding the particles results in a lower total mass than the
initial one, since all the binding energies are negative. The excess mass multiplied by
c2 has to be expelled from the system into the environment in some form of energy,
tending to increase the total energy and pressure of the environment.
The previous example serves to illustrate a generic fact that has its physical real-
ization in the case of stars by means of nuclear forces. The accumulated empirical
knowledge of the most common nuclei and the Periodic Table in general allow us to
draw the important graph for stellar Astrophysics presented schematically in Fig. 4.6.
Since hydrogen is by far the most abundant element in the cosmos, and the easiest
to fuse (the repulsion due to the electric charges is the smallest possible for any
nuclei), let us start by studying what happens when two protons (hydrogen nuclei)
collide. We begin by changing to the center-of-mass frame and transforming the two-
body problem to a problem of a single body of reduced mass μ = m₁m₂/(m₁ + m₂)
in a central potential, a trick already used in standard courses on classical Mechanics.
Note that in the case of two protons μ = mₚ/2, but the expression applies to any two
particles. In our case, the potential is due to the electric charge everywhere except at
64 4 Stellar Evolution up to the Final Stages
the closest range, where the (attractive) nuclear forces produce the “well” that allows
the two protons to bind in a fusion process (Fig. 4.7).
Classically, only particles with energy higher than the top of the Coulomb barrier
could fall into the attractive nuclear well region beyond. This far exceeds the average
kinetic energy ∼ kB T of the gas inside the Sun, and even the most energetic particles
of the Maxwell–Boltzmann thermal distribution inside it. For this reason, Lord Kelvin
sought the energy necessary to keep the Sun shining in the contraction of the solar
structure. But with the formulation of Quantum Physics a few decades later, there
was a way to justify the particles “crossing” the barrier with much lower energies,
using the so-called tunnel effect. There was thus a non-zero probability of getting
through, and the quantum formalism is used to calculate that. Here we will simply
state that this probability is
P = exp(−2π Z₁Z₂e²/ℏv) , (4.8)
where Z₁ and Z₂ are the particle charges, equal to unity for a proton (the hydrogen
nucleus), and v is the relative velocity of the two particles measured in the center-
of-mass system.
Defining the cross-section σ as the quotient of the number of reactions per particle,
divided by the number of incident particles per unit area, all per unit time, and
considering completely random collisions between the particles, the rate of fusion
events r will be
r = N₁N₂ vσ . (4.9)
Note that this definition is not very rigorous, but suggests considering σ as the
“effective area” in which fusion reactions can take place. In (4.9), when the “1” and
“2” particles are the same, we obtain a factor N² typical of these processes, already
discussed in elementary kinetic theory. The next step is to evaluate the number of
particles that participate in the reaction between v and v + dv. Equation (4.9) actually
describes one collision, but there are many others, and to take all of them into account
we must integrate in order to sum over all velocities. As the quantity σ also depends
on the velocity, the generalization we require is
r = N₁N₂ ∫₀^∞ v σ(v) φ(v) d³v , (4.10)
where the classical Maxwell–Boltzmann distribution φ(v) gives the number of par-
ticles for each speed (or energy), and can be written as
φ(v) d³v = 4π (μ/2πkB T)^(3/2) e^(−μv²/2kB T) v² dv , (4.11)
where 4π v 2 dv reflects the fact that we assume isotropy of the distribution in velocity
space, i.e., independence of direction.
Now we need to take into account the probability that a particle of effective mass
μ can cross the Coulomb barrier, as described by (4.8) which must multiply the
integrand before the integration is done—this factor is not yet present in (4.10). But
first we must change variables from speed to energy, as the independent variable.
We begin with the cross-section, which must have the dimensions of area. In the
microphysical domain we can see that the only possibility is to have something like
π × λ², where λ is the de Broglie wavelength of the proton, so that λ² is proportional
to 1/E for non-relativistic particles. By separating the exponential and the factor 1/E, we can
introduce the so-called astrophysical factor S(E) to “hide” more things, but in the
hope that this will be almost constant with energy. Hence,
σ(E) = S(E) × (1/E) × e^(−b/E^(1/2)) , (4.12)
where
b = 2^(3/2) π² μ^(1/2) Z₁Z₂e²/h .
Inserting the probability factor (4.8), the total rate is
r = [2^(3/2) N₁N₂ / (kB T)^(3/2)(μπ)^(1/2)] ∫₀^∞ S(E) exp[−(E/kB T + b/E^(1/2))] dE . (4.13)
The rate of reactions results from an integrand that has two exponential functions with
opposite behavior: if the energy increases, the exponent b/E^(1/2) decreases and the
tunneling factor e^(−b/E^(1/2)) grows, which is good because the tunnel effect makes it
easier to “fall into the attractive well”. But at the same time the number of particles
with higher energies decreases according to the first exponential e^(−E/kB T). If the
energy is low, the opposite happens: there are many particles, but crossing the barrier
is very difficult for them. Thus, there is a compromise between these two factors,
indicating that both will be important only in the neighborhood of a certain energy
which optimizes the value of the integral. Below or above this value the rate is
virtually zero. This is the so-called Gamow peak (Fig. 4.8). The “optimal” energy E₀
can be evaluated and turns out to be much greater than kB T, i.e., much higher than the energy of most of the
Fig. 4.8 Region around the Gamow peak. The factors in (4.13) behave in opposite ways and the
result of the integral which determines the reaction rate is substantially different from zero only
around the optimal energy E₀. Note that E₀ generally lies well beyond the peak of the Maxwell–
Boltzmann distribution
particles. The details of the integration are mathematically rather complicated, but the
important thing to remember is that, after evaluating how much energy is released
by the fusion of the two protons, the result can always be expressed in the form
ε(ρ, T) = const. × ρ^α T^β, where ε is the energy released per gram of material per
unit time. This is the form that appears in textbooks and which stems from a simple
mathematical trick [4].
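As an illustration of where the compromise lands, the sketch below evaluates the Gamow peak for two protons at a solar central temperature of 1.5 × 10⁷ K (an assumed, standard value): maximizing the integrand of (4.13) gives E₀ = (b kB T/2)^(2/3).

```python
import math

# Gamow peak for p + p: the integrand of (4.13) is maximal at
# E0 = (b * kB * T / 2)**(2/3), with b as defined below Eq. (4.12).
h  = 6.626e-27          # Planck constant, erg s
kB = 1.381e-16          # erg/K
e2 = (4.803e-10)**2     # e^2 in CGS (esu^2)
mp = 1.673e-24          # proton mass, g
mu = mp / 2.0           # reduced mass of two protons
Z1 = Z2 = 1

b = 2**1.5 * math.pi**2 * math.sqrt(mu) * Z1 * Z2 * e2 / h
T = 1.5e7               # assumed solar central temperature, K
E0 = (b * kB * T / 2.0) ** (2.0 / 3.0)

erg2keV = 1.0 / 1.602e-9
print(f"kB*T = {kB*T*erg2keV:.2f} keV")   # ~1.3 keV
print(f"E0   = {E0*erg2keV:.2f} keV")     # ~5.9 keV, well beyond the thermal energy
```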
Before we continue with the formulation of the mathematical problem of stellar
structure, it is necessary to point out some important characteristics of hydrogen
fusion that are strongly determinant for Stellar Evolution. Normally, the exact form
of the transformation of hydrogen into helium is not discussed in textbooks. Rather,
a scheme of the type shown in Fig. 4.9 is presented, which is over-simplified and
hides the true nature of the fusion path.
To begin, let us focus on the first two protons that merge. A plain, simple fusion
is impossible. This is due to the fact that two protons do not have a bound state
(i.e., there is no bound “diproton”). Only one proton and one neutron have a bound
state (the deuteron, or deuterium nucleus, written ²H: one proton and one
neutron). Thus, the reaction involving the four protons could not even begin if one
of the colliding protons did not become a neutron at the time of the collision. This
process, governed by the weak interactions, can be thought of as the formation and
immediate decay ²He → ²H + e⁺ + νe, where the positron e⁺ and the neutrino νe
escape the reaction zone. This must be very rare, and in any case most of the time the
diproton ²He, which is not bound, breaks down into two free protons again. Thus,
only in a system with a gigantic number of collisions can a certain minimal number
Fig. 4.10 Fundamental branch of the CNO cycle. A series of proton captures and decays on
carbon, nitrogen, and oxygen culminates in the production of ⁴He from four “consumed” protons.
The carbon is restored in the last stage, and the helium nucleus is formed from the four protons
that enter the reactions on the left
of collisions produce the initial fusion. If we consider the probability that exactly at
the time of the collision a proton decays into a neutron, this would happen once in
7 × 10⁹ yr. In other words, every second, only one out of 2 × 10¹⁷ collisions will produce
a deuterium nucleus, while all others will result in two protons as before. This is a
major bottleneck not captured at all by simple pictures like the one in Fig. 4.9.
If we focus on the lucky deuterium that survived, the later sequence is a little
easier: the deuteron captures an additional proton and produces ³He, a reaction that
is followed 86% of the time by the capture of another proton that generates the final
⁴He. This sequence is known as the p-p I cycle. There are two other possibilities,
today well understood, but the outcome is the same: conversion of four protons into
a helium nucleus. As a corollary of this discussion, we come upon a very important
fact: stars live for billions of years fusing hydrogen, a fact that would be impossible
if two protons had a bound state. This last hypothetical state would cause the stars
to explode in a few seconds, since only strong interactions would play a role in
fusion, without any need for decay. However, as this is not the case, it is the weak
interactions that determine the lifetimes of the stars, and it is not by chance that the
above-mentioned rate of once in 7 × 10⁹ yr is of the same order as the time the stars
reside in the Main Sequence: almost no collisions will produce fusion, and the whole
process is regulated by the slow release of energy, instead of the explosive situation
that could have happened. We could say that we are “children” of inverse beta decay,
because without it there would be no Stellar Evolution, no heavy elements, and no
biology to use them and produce living organisms [5].
To complete this discussion, it is important to point out that there is another way
to fuse hydrogen into helium which requires the presence of heavier elements such as
carbon and oxygen and is called the CNO cycle. The CNO cycle is catalytic, because
the heavy elements are not consumed, but enter reactions and return untouched to
the environment. One of the branches (there are others of less importance) of the
CNO cycle is shown in Fig. 4.10 [6].
The importance of the CNO cycle lies in its dependence on temperature: while
the p-p cycle is weakly dependent on T, the CNO cycle grows very quickly with
it. Calculations indicate an exponent β ≈ 4 for the p-p cycle, but β ≈ 16 for the
CNO cycle, in the parametrization pointed out above for the rate of energy release
Fig. 4.11 Comparison of the energy generation between fusion in the p-p cycle (green) and the
CNO cycle (red). In the Sun, the former is dominant, but for stars that have twice the solar mass
or more, the CNO cycle dominates, since the central temperatures exceed about 1.8 × 10⁷ K. The
production of carbon by helium fusion via the triple-α process occurs only at much higher temperatures
ε(ρ, T). This means that, as we consider stars of larger masses, with increasing cen-
tral temperatures, the CNO cycle will inevitably surpass the p-p cycle in importance.
Since the temperature dependence of the CNO cycle is much more pronounced, stars
that have it as their main mechanism of hydrogen fusion will expend energy much
faster and spend less time in the Main Sequence. The situation is shown in Fig. 4.11
and the mass for which the conditions are reached is found to be slightly higher than 2 M⊙.
Thus, we have a physical reason to consider two branches in the main sequence: the
upper MS (M > 2 M⊙) and the lower MS (M < 2 M⊙), to which the Sun belongs.
The differences between the evolution of stars on the two branches will be discussed
later.
We can now resume our description and write, in an analogous way to equations
(4.5) and (4.6), an equation that relates the luminosity produced in a concentric shell
of thickness dr to the rate of energy release ε. Within this shell the luminosity
increases with the energy released in the shell mass according to
dL = ε dM , (4.14)
and as before, the mass and radius differentials are related by dM = 4πr²ρ dr, yield-
ing
dL/dr = 4πr²ρε , (4.15)
which is coupled to (4.5) and (4.6) by the presence of the variable ρ and requires the
calculation of the nuclear energy release per unit mass ε on the right-hand side. But
there is another consideration regarding the release of energy in the stellar interior:
it is clear that this energy will move away from the initial point, thus establishing a
temperature profile: the only variable that does not yet appear explicitly in the set of
equations is precisely the temperature. Thus, it is reasonable to ask ourselves whether
this spatial distribution of temperature can be constructed from a fourth equation that
features the temperature gradient dT /dr .
An initial hypothesis for energy mobility would be that the flux of radiation leaving
the star from the inside, this being the main mechanism of energy transport, would
depend on this temperature gradient, as happens in any diffusive process. We assume
that the radiation collides very frequently with matter, so that each photon follows a
long and tortuous path out to the surface. Note that the stars are not in thermodynamic
equilibrium, since if they were they would not radiate net energy into space. But on
small scales, there is a local balance between radiation and matter which establishes
the temperature distribution T (r ) we are seeking, governed by gravitation and the
other ingredients inside.
Consider once again a shell with thickness dr . The difference in the radiation flux
that comes in through the base and the one that comes out through the top is
σ(T + dT)⁴ − σT⁴ ≈ 4σT³ dT , (4.16)
where we have linearized the Stefan–Boltzmann law, assuming that the temperature
difference is small. This difference across the layer can be written as dT = λ̄ (dT/dr),
hence proportional to the temperature gradient, where the mean free path of the
photons, denoted by λ̄, is a quantity that contains an energy average over all the
photon–matter interaction processes, since it corresponds to a diffusion process.
Inserting this, the flux F is obtained in the form
F = −4λ̄σT³ (dT/dr) . (4.17)
A more detailed statistical treatment shows that the mean free path can be replaced
by another variable κ̄ called the opacity, where λ̄ = (κ̄ρ)−1 . Furthermore, the flux
F is related to the luminosity in the layer by F = L/4πr 2 . Note that this is not to
be confused with the total luminosity of the HR diagram; it is the energy per unit of
time that passes through the inside layer. This latter quantity is the same as the flux
expressed in (4.17). Substituting in and extracting the temperature gradient dT /dr ,
we obtain [6]
dT/dr = −(3/16)(κ̄ρ/σT³)(L/4πr²) . (4.18)
In much the same way that we calculate the rate of nuclear reactions ε(ρ, T ) and
then plug it into the structure equations, we can do the same for the opacity κ̄,
which involves several physical processes that are more or less relevant for each
range of densities and temperatures. Four basic processes contribute to the opacity:
bound–bound processes (photoabsorption), bound–free processes (photoionization),
free–free processes (reverse bremsstrahlung), and Compton scattering, all of which
are shown graphically in Fig. 4.12.
The full calculation of opacities is complicated and requires the tools of Quantum
Mechanics. However, the averages for the first three can be expressed as κ̄ ∝ ρT^(−3.5),
the so-called Kramers form (although the coefficients turn out to be very different).
Compton scattering is a process that, at least in the limit of low energies (generally
the case in stellar interiors), reduces to Thomson scattering and contributes a nearly
constant opacity, independent of the density and temperature.
Fig. 4.12 The four basic processes that contribute to stellar opacity. Note that all of them are
ultimately “obstacles” to the outward progress of the photons, which are forced to exchange energy
and momentum with an electron, whether it is bound, as in cases (a) and (b), or free. In the first
three, the initial photon disappears, since they are absorption processes. In Compton scattering,
already discussed in Chap. 2, the photon is scattered and changes direction with a change in its
initial energy
Fig. 4.13 Putting a pan on the stove is similar to heating the base of a star layer. The temperature
gradient dT /dr will grow until it exceeds the so-called adiabatic gradient dT /dr |ad , and the con-
vective instability starts to produce regular movements (central panel). If we continue to increase
the gradient, there will eventually be a transition to the turbulent regime (right panel)
In a convective layer, the simplest prescription for the transport is to take the gradient
equal to the adiabatic one,
dT/dr = dT/dr |ad , (4.19)
where on the right-hand side we have used the very definition of the adiabatic gra-
dient dT/dr |ad . This is perhaps too crude, and another improvement widely used in
Astrophysics is the so-called mixing-length theory, where the hypothesis that con-
vection carries heat by means of a “typical” convective bubble leads to a non-linear
expression of the flux. Finally, there are other models for calculating the convective
flux using a full distribution of bubble sizes, and these are even more complicated.
There will be no need to go into all the details of this modeling here, but it is important
to remember that Schwarzschild’s criterion dT /dr < dT /dr |ad must be monitored
layer by layer in the construction of the star model, and when it is not satisfied, (4.18)
must be replaced by (4.19) or something better in order to describe the convective
region.
The system of coupled equations (4.5), (4.6), (4.15), and (4.18) [or (4.19) if the
convective regime has been reached], with the stellar boundary conditions
M(r = 0) = 0 , (4.20)
L(r = 0) = 0 , (4.21)
P(r = R) = 0 , (4.22)
T (r = R) = 0 , (4.23)
are the set that allows us to generate the stellar models that we will present below, then
compare their characteristics with real star observations. In other words, a star model
is the solution yielding the four functions P(r), M(r), T(r), and L(r), satisfying
a constraint P(ρ) called the equation of state, for given constitutive functions
κ̄ and ε. The latter are functions of the temperature and density. The last condition
may seem a little strange since we should impose T (r = R) = Teff , but this greatly
complicates the numerical calculations and the form (4.23) is preferred. The effective
temperature can then be calculated a posteriori when the model is solved.
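To make the procedure concrete, here is a minimal integration sketch of (4.5) and (4.6) alone, closed with an assumed polytropic equation of state P = Kρ^(5/3) in place of the full treatment of ε, κ̄, and the transport equation; the constants K and ρ_c below are illustrative choices tuned to give a roughly solar model, not values from the text:

```python
import math

# Outward Euler integration of dP/dr = -G M rho / r^2   (4.5)
#                          and dM/dr = 4 pi r^2 rho     (4.6),
# closed with the assumed polytropic relation P = K * rho**(5/3).
G = 6.674e-8
K = 2.5e14                     # CGS; illustrative, sets the mass/radius scale
gamma = 5.0 / 3.0
rho_c = 8.4                    # g/cm^3, illustrative central density

P = K * rho_c**gamma
M = 0.0
r, dr = 1e5, 1e7               # start just off-center; 100 km steps
P_stop = 1e-10 * P             # crude "surface" criterion P(R) ~ 0, Eq. (4.22)

while P > P_stop:
    rho = (P / K) ** (1.0 / gamma)
    P += -G * M * rho / r**2 * dr          # Eq. (4.5)
    M += 4 * math.pi * r**2 * rho * dr     # Eq. (4.6)
    r += dr

print(f"R = {r/6.957e10:.2f} R_sun,  M = {M/1.989e33:.2f} M_sun")
```

For these choices the model comes out close to one solar mass and radius; a real stellar model must of course carry (4.15) and (4.18) along as well, iterating until all the boundary conditions are met.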
As with any other mathematical problem, it is important to know the general
characteristics of the solutions. The “well behaved” solutions are unique and math-
ematically stable. In fact, in the theory of Stellar Evolution, there is a result called
the Russell–Vogt theorem, although it is in fact only a conjecture, since it has never
been demonstrated: for a given mass and composition, the solution of the structure
equations is unique. With some reservations, we can therefore go ahead with the
guarantee of being in the presence of a well formulated problem with physically
acceptable solutions.
We are now in a position to discuss the problem of stellar stability raised at the
beginning of this chapter. We begin with (4.5) and (4.6), which ensure the hydrostatic
(mechanical) equilibrium. A simple manipulation of (4.5) consists in multiplying
both sides by the volume V = 4πr 3 /3 and integrating over the mass, i.e.,
∫₀^M (4πr³/3) (dP/dM) dM = −(1/3) ∫₀^M (GM/r) dM . (4.24)
Integrating the left-hand side by parts, the boundary term vanishes: on the surface S
the pressure is zero, and at the center the volume V is zero. Thus, denoting the total
gravitational energy by Ω, we have
Ω = −3 ∫₀^{V_S} P dV . (4.25)
If we transform the integral from the volume to the mass using the fact that dV = dM/ρ, the
general result is
Ω = −3 ∫₀^M (P/ρ) dM . (4.26)
However, in the specific case of an ideal gas, P = nkB T = (ρ/μmH)kB T, and we can
identify the internal energy u per unit mass as the average kinetic energy per particle,
(3/2)kB T, divided by the particle mass (μmH, where μ is the average molecular weight).
Thus, u = 3P/2ρ, and the Virial theorem for an ideal gas is
Ω = −2U , (4.27)
with U = ∫ u dM the total internal energy of the star.
The Virial theorem contains the physical basis required to explain in simple terms
how a star “works”. The first important thing is that, in hydrostatic balance where
there is no large-scale movement of the gas, the total energy of the star is Etot =
Ω + U. The Virial relationship tells us immediately that Etot = Ω/2 < 0. The
star is bound (negative total energy) and distributes its energy according to (4.27).
A first relevant timescale is then the dynamical one, the time the star needs to react
to a mechanical perturbation, essentially the free-fall time τdyn ≈ (R³/GM)^(1/2) ≈
(Gρ̄)^(−1/2), which is about one hour for a star like the Sun.
That is, in a few hours or so, a star can restore the hydrostatic balance, since the fluid
is in fast communication. It is hard to imagine the hydrostatic balance being violated
in stars, regardless of the type of perturbation.
The second timescale is the so-called thermal or diffusive timescale, telling us how
long it takes for the star to establish a stationary temperature distribution if perturbed.
Assuming that diffusion of photons is the dominant mechanism, with coefficient of
diffusion D, we have
τdiff ≈ R²/D = 10⁵ (κρ̄/κ⊙ρ̄⊙)(R/R⊙)² yr , (4.29)
where we have used the fact that D = c/κ ρ̄ for the diffusion of photons through the
stellar material. This is much longer than τdyn , although quite short by astronomical
standards.
The third timescale is the Kelvin–Helmholtz (thermal contraction) timescale, during
which the star could sustain its luminosity on the gravitational energy reservoir alone:
τK–H ≈ |Ω|/2L = GM²/2RL = 1.5 × 10⁷ (M/M⊙)² (R⊙/R)(L⊙/L) yr . (4.30)
The fourth and last timescale is the nuclear timescale, the time during which the
fusion reactions maintain the brightness of the star. If in each reaction a fraction
ξ ∼ 0.007 of the rest mass is released (this number is valid for H → He), and a
fraction f ≥ 0.5 of the star’s mass is available for fusion, the nuclear timescale is
τnuc ≈ f ξ Mc²/L = 10¹⁰ (M/M⊙)(L⊙/L) yr . (4.31)
Thus, the required fundamental hierarchy, the one that makes stars stable and very
long-lived, is
τdyn ≪ τdiff ≪ τK–H ≪ τnuc . (4.32)
We see that the time that governs the lifetime of a star is the nuclear time (more
precisely, the weak interaction decay time described above as a “bottleneck” for
fusion reactions), and stars can be considered in hydrostatic and thermal balance in
any situation. This is why the lifetime of the stars is so long. Only when some of these
inequalities are violated during a star’s evolution will there be important changes.
Therefore, we may state that the life of a star consists of long periods of stationary
equilibrium separated by moments when the hierarchy in (4.32) is violated and the
star is forced to seek a new stationary state or phase in its evolution [4].
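The hierarchy (4.32) can be checked directly for the Sun. In the sketch below, τdyn is taken as the free-fall estimate (Gρ̄)^(−1/2) and the diffusion time uses an assumed interior opacity κ ≈ 1 cm²/g; the numbers are rough but reproduce the ordering:

```python
import math

G, c = 6.674e-8, 2.998e10
M, R, L = 1.989e33, 6.957e10, 3.828e33      # solar values, CGS

rho_mean = M / (4.0 / 3.0 * math.pi * R**3)
tau_dyn  = 1.0 / math.sqrt(G * rho_mean)      # free-fall estimate, ~1 hour
tau_diff = R**2 * 1.0 * rho_mean / c          # R^2/D, D = c/(kappa*rho), kappa = 1
tau_KH   = G * M**2 / (2 * R * L)             # Eq. (4.30)
tau_nuc  = 0.5 * 0.007 * M * c**2 / L         # Eq. (4.31) with f = 0.5, xi = 0.007

yr = 3.156e7
for name, t in (("dyn ", tau_dyn), ("diff", tau_diff),
                ("K-H ", tau_KH), ("nuc ", tau_nuc)):
    print(f"tau_{name} = {t/yr:9.2e} yr")
# The printout spans many orders of magnitude, i.e., Eq. (4.32).
```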
What we consider low or high mass stems from the different possible types of physical
behavior that we will now define. We have already pointed out that around 2 M⊙ the
CNO cycle begins to dominate the p-p cycle, whereupon the production of nuclear energy
is greatly accelerated due to its stronger dependence on temperature. This defines the
point of separation between the stars of the lower Main Sequence, to which the Sun
belongs, and those of the upper Main Sequence. But there is another important limit
around 8 M⊙: after the ignition of helium to produce carbon, stars of lower mass will
never be able to fuse the resulting carbon, so their evolution stops there. Those with
higher masses (especially above ∼ 10 M⊙) go through a succession of nuclear cycles
right up to the burning of silicon and are called high mass stars. The less massive
ones, with M ≤ 8 M⊙, are thus said to be low mass, with evolution and fate similar
to our Sun [2, 4, 6].
Fig. 4.14 Visual comparison between three stars, showing the convective and radiative zones.
The smallest, up to ∼ 0.3 M⊙, shown on the right, is fully convective from the outside in. This
convective envelope recedes for higher masses (center). The high mass star produces almost all its
helium through the CNO cycle, the temperature profile is much more abrupt, and as a consequence
it is the central core that becomes convective. Around 20 M⊙ the convective zone occupies almost
the entire star from the inside out
4.5 Stellar Evolution: Low Mass Stars

The life of a star begins effectively when a cloud of gas contracts and the center of
the cloud reaches density and temperature conditions sufficient to ignite the hydrogen
fusion reactions discussed previously. This stage is never actually reached for masses
smaller than 0.08 M⊙ (about 85 times the mass of Jupiter), so these brown dwarfs will
never become stars and will only emit a small part of their original energy content
while cooling indefinitely. Stars above this lower limit are fully convective up to
about 0.3 M⊙; beyond that mass they begin to satisfy Schwarzschild’s criterion at the
center, and the fraction of the mass occupied by the radiative core grows for stars of
progressively higher masses. For the Sun, this radiative core extends to about 70% of
the radius, and only the envelope undergoes convection, as shown in Fig. 4.14.
Low-mass stars spend several billion years located almost at the same point in
the HR diagram. Their lifetime can be estimated empirically, knowing that there is a
mass-to-light relation for the Main Sequence, viz., L ∝ M^3.5. As the basic estimate
is simply τMS ≈ Enuc/L, and expressing everything in terms of solar quantities, we have
immediately
τMS/τ⊙ = (M/M⊙)^(−5/2) . (4.33)
In other words, very low mass stars will continue on the Main Sequence for many
billions of years, while more massive ones should live only a few tens of millions of
years in this situation. (Note that this relationship is also valid for high mass stars, those with
shorter lives on the MS.) The situation can be compared to what happens with a car
and its fuel: an economical car depletes its fuel very slowly to run more kilometers
(that is, it “lives” longer). But a sports car uses its fuel very quickly, because it is
not made to do otherwise, so its “lifetime” is much shorter (until it is refuelled,
something which the stars cannot do). This is what happens with stars as a function
of their mass: they enjoy a long or short life in accordance with (4.33).
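Numerically, Eq. (4.33) with τ⊙ ~ 10¹⁰ yr (and the exponent −5/2 as reconstructed above from L ∝ M^3.5) gives:

```python
# Main Sequence lifetimes from Eq. (4.33): tau_MS/tau_sun = (M/M_sun)**(-5/2).
tau_sun = 1e10                              # yr, assumed solar MS lifetime
for m in (0.5, 1.0, 2.0, 8.0, 20.0):        # masses in units of M_sun
    print(f"M = {m:4.1f} M_sun : tau_MS ~ {tau_sun * m**-2.5:.1e} yr")
```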
Fig. 4.15 Sketch of the core–envelope configuration for a low-mass star. The pressure PCE should
be the same on both sides of the core–envelope interface, but calculations indicate that the core
reaches a maximum pressure beyond which it cannot remain in equilibrium with the envelope when
only a small fraction of the total original hydrogen has been consumed by the fusion
Life on the Main Sequence does not provoke any notable structural changes in
the star, but rather a slow differentiation of the central core which is enriching itself
with helium while being depleted of hydrogen. So how long can a star go on fusing
hydrogen and maintain itself in the MS? The answer seems evident, but it hides
surprises. In high-mass stars, the convective core homogenizes the composition and
fusion ends when all the hydrogen is used up, as would be expected. But in low-mass
stars, the central core is changing composition because helium is produced, precisely
from the fusion that consumes the hydrogen. Thus, the core is gradually becoming
different from the envelope (Fig. 4.15). The presence of increasing amounts of inert
helium in the core produces an unsustainable situation for this configuration in the
long run: the pressure inside the core must support the envelope situated above it,
i.e., the pressure PCE must be the same on both sides of the core–envelope interface.
But as this pressure consists of two terms with different signs, it has a maximum
as a function of the total mass of the core MC. When this maximum is reached, by
constant addition of helium, hydrostatic equilibrium is no longer possible and the star
must seek a new stationary equilibrium because it is no longer possible to support
the envelope on top.
As a function of the composition, this condition can be written as
MC/M∗ = 0.37 (μE/μC)² , (4.34)
where μC and μE are the molecular weights in the core and envelope, respectively.
Numerically, the core can no longer support the envelope when MC ∼ 0.1M∗ , or
when about 13% of the star’s original hydrogen has been consumed. Therefore, there
is still a lot of hydrogen in the envelope of a low-mass star when the fusion stops, in
contrast to the high-mass case. This result is called the Schoenberg–Chandrasekhar
limit, not to be confused with the Chandrasekhar limit, which refers to something
else (see Chap. 6). It shows that the reason for low-mass stars exiting from the MS
is structural, due to the impossibility of remaining in hydrostatic equilibrium, and
4.5 Stellar Evolution: Low Mass Stars 77
is not due to the exhaustion of hydrogen in the star. In fact more than 80% of the
original hydrogen will still be present in it, although all outside the central core [4].
This contribution by the Brazilian scientist Mário Schoenberg is of great importance
in the context of Stellar Evolution, and is among the best results produced by the
then young University of São Paulo in 1942.
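A one-line evaluation of (4.34) recovers the quoted fraction, using standard illustrative molecular weights (μE ≈ 0.6 for an ionized solar-composition envelope and μC = 4/3 for a fully ionized helium core; these numbers are assumptions, not values from the text):

```python
# Schoenberg-Chandrasekhar fraction, Eq. (4.34).
mu_E, mu_C = 0.6, 4.0 / 3.0
q_SC = 0.37 * (mu_E / mu_C) ** 2
print(f"M_C/M_* = {q_SC:.3f}")    # ~0.07, consistent with M_C ~ 0.1 M_* above
```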
Taking into account the fact that the (thermal) energy content is still very large
inside the star, there must be a quasi-hydrostatic readjustment, that is, without col-
lapse, controlled by the Kelvin–Helmholtz timescale. In other words, the star is in
search of a new stable equilibrium, as we said before, since the inequalities (4.32) are
violated at the moment of reaching the Schoenberg–Chandrasekhar limit. The core
is now inert (with no nuclear reactions) and the p-p (or CNO) cycle does not really
stop, but occurs in a spherical shell around this inert core. At this point, the envelope
expands greatly, while the core contracts and heats up. What is the physical reason
for this behavior, and what explains the star’s displacement in the HR diagram?
A quantitative and consistent explanation can be found by considering that both
the Virial ratio (4.27) and the total energy must be conserved simultaneously. For
times τ ≫ τdyn, the Virial energy distribution leads to ⟨Ω⟩ + 2⟨U⟩ = 0, where ⟨·⟩
indicates the spatial average in the star composed of the core plus envelope. On the
other hand, conservation of the total energy requires
Ω + U − ∫₀ᵗ L dt + ∫₀ᵗ ∫ ε dV dt = constant . (4.35)
In a simple core-plus-envelope picture, the magnitude of the gravitational term can be
written, up to factors of order unity, as
|Ω| ≈ G MC²/RC + G MC Menv/R∗ = constant ,
to obtain the variation of the total stellar radius R∗ with the radius of the core RC , by
differentiating this last expression and rearranging the factors:
dR∗/dRC = −(MC/Menv)(R∗/RC)² . (4.36)
The stellar radius increases as the core radius shrinks (negative sign) amplified by
the large squared factor. As a result of this expansion the star moves, with little
change in luminosity, to lower effective temperatures, since if L = 4πR²σT⁴ is
practically constant and the radius R needs to grow as indicated in Fig. 4.16, the
effective temperature T needs to decrease to ensure the constancy of the product.
This happens until the surface is cold enough to allow the formation of the hydrogen
ion H⁻, technically an anion with two electrons and one proton. The key issue is that
the opacity of the H⁻ ion is very high, and this therefore amounts to “stopping”
the flow of radiation from the inside at the temperature at which it forms. As a
consequence, the star increases in luminosity L, rising through the giant branch
(Fig. 4.17, trajectory in red).
Eventually, this path takes the star to the top of the giant branch, marked “c” in
Fig. 4.17, where its radius is 100–200 times the radius it had in the Main Sequence,
and its luminosity is about 1000 times greater. At this point the material in the core
of a low-mass star, which has continued contracting and heating during this process,
generally becomes degenerate, i.e., it is no longer an ordinary gas: the density is so
high that the dominant component of the pressure is due to the electrons satisfying
Pauli’s exclusion principle. Although we shall discuss degenerate gases in more
detail in a later Chapter, we may establish its basic description now, since it is a very
important concept for the evolution of stars.
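The figures just quoted can be cross-checked against L = 4πR²σT⁴: taking the tip values R ≈ 150 R⊙ and L ≈ 1000 L⊙ (midpoints of the ranges above) gives a photosphere of a few thousand kelvin, as befits a red giant:

```python
# T_eff at the tip of the giant branch from L = 4 pi R^2 sigma T^4,
# rewritten as T_eff = T_sun * (L/L_sun)**(1/4) * (R_sun/R)**(1/2).
T_sun = 5772.0                      # K
L, R = 1000.0, 150.0                # tip values in solar units (from the text)
T_eff = T_sun * L**0.25 * R**-0.5
print(f"T_eff ~ {T_eff:.0f} K")     # ~2700 K
```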
We are accustomed to think that a gas has a pressure which increases with tem-
perature. This is true for a classical gas, where thermal agitation is responsible for
the pressure. However, if we consider matter of increasing densities, there will be a
moment where electrons can only occupy energy states at a maximum of two per
“little cube” in the phase space spanned by the position variable x and the
momentum variable p. This cube has a volume of the order of the cube of the
Planck constant, i.e., (Δx × Δp)³ ∼ ℏ³. The gas changes regime and is no longer gov-
Fig. 4.18 Left: A little cube of the phase space where only two electrons per state can fit, with
opposing spins. Right: A classical gas with a lot of accessible states gives way to a degenerate gas,
where only two electrons can occupy each energy state
erned by Classical Physics, but enters the (quantum) degenerate regime. Basically,
this means that, as we “squeeze” the electrons further, the principles determining the
new source of pressure are the uncertainty principle, which determines the dimen-
sions of the phase space, and Pauli’s principle, which states that there can be no more
than two electrons occupying one of the little cubes, and then only when they have
opposite spin numbers. Meanwhile, thermal agitation is no longer important, not
because the temperature is low (it is very high when measured in K), but because
degeneracy provides a much higher pressure. The situation is illustrated in Fig. 4.18.
We will soon see that the pressure can be calculated in a simple way in the new
degenerate regime, since the pressure is just the derivative of the energy per particle
with respect to the volume, that is, it describes how matter reacts if we try to compress
it. The results, for the cases of non-relativistic and ultra-relativistic electrons, are:
P → (ℏ²/mₑ) n^(5/3) (non-relativistic electrons) , (4.37)
P → ℏc n^(4/3) (ultra-relativistic electrons) , (4.38)
where n = N /V is the number of particles (electrons) per unit volume. From these
expressions it can be checked that this degenerate pressure is of purely quantum
origin: if the Planck constant were zero it would not exist. The passage from one
regime to the other is shown schematically in Fig. 4.19.
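An order-of-magnitude comparison makes the point. For a helium core at, say, ρ ~ 10⁶ g cm⁻³ and T ~ 10⁸ K (illustrative values, not from the text), the degeneracy pressure of (4.37) already exceeds the classical thermal pressure of the electrons:

```python
# Thermal vs. degeneracy pressure, using Eq. (4.37) with an order-unity
# coefficient, for illustrative helium-core conditions.
hbar, me, mH, kB = 1.055e-27, 9.109e-28, 1.673e-24, 1.381e-16
mu_e = 2.0                               # baryons per electron in helium
rho, T = 1e6, 1e8                        # g/cm^3, K (assumed)

n_e = rho / (mu_e * mH)                  # electron number density
P_deg = hbar**2 / me * n_e ** (5.0/3.0)  # Eq. (4.37)
P_th  = n_e * kB * T                     # classical electron-gas pressure
print(f"P_deg / P_thermal ~ {P_deg / P_th:.0f}")   # a factor of several
```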
Going back to our previous discussion, the most important fact is that the pressure
of a degenerate gas does not depend on the temperature. When helium ignition begins
in the core (above about 10⁸ K), the core will not “react”, in the sense that the pressure
will not increase by the release of helium fusion energy. This fusion, therefore, gets
out of control, since without an increase in pressure the star will not expand to bring
the temperature down. This phenomenon is called the helium flash, and releases an
internal luminosity many orders of magnitude greater than the initial one. However,
since this release is buried deep down and occurs very quickly, it does not produce
pronounced observable consequences. In fact there is a “hunt” for the stars that may
be undergoing the flash: those that are located at the top of the giant branch.
As a result of the thermal instability that leads to the flash, the star finally releases
so much energy internally that it comes out of the degenerate regime. The pressure
is once again dominated by the temperature and the star finally expands. This makes
it possible to control the helium nuclear reactions known as the triple-alpha process,
already shown on the right in Fig. 4.11. This helium fusion cycle also has “hidden
secrets”. To begin with, it is a cycle that could not happen if it were necessary for three
helium nuclei to meet simultaneously, since this would be almost impossible. As with
the reaction of two protons with decay in the p–p cycle, the initial reaction is
2 ⁴He → ⁸Be, and at any instant there is one beryllium nucleus for every billion helium nuclei.
However, the nucleus ⁸Be is highly unstable, and in fact there are no stable nuclei
with mass numbers A = 5 or A = 8. Thus, the beryllium nucleus decays
rather quickly, although it lives much longer than a simple random collision, and
there would therefore be time to capture a third helium, making ⁸Be + ⁴He, were it
not for the fact that this reaction is prohibited by the conservation of fundamental
quantities, in this case the parity of the states. It was F. Hoyle who reasoned that there
must be an excited carbon nucleus to complete the reaction, serving as a “doorway”
for the arrival and then decay into ordinary (ground state) carbon. Indeed, without a
doorway state the reaction would not occur and there would be no carbon from stars
(nor human beings made from it). This excited carbon state was found soon after.
When the decay is completed, we can write
3 ⁴He → ¹²C + γ , (4.39)
although it should be remembered that this is just a shorthand for something much
more complicated [6]. Performing the calculations to find the energy released by
each reaction, we arrive at the expression ε(ρ, T) = constant × ρ^α T^β, with α = 2
and β = 41. The dependence on temperature is extreme! We may thus expect helium
exhaustion to occur much faster than in the case of hydrogen. Stationary fusion by
means of the triple-α process causes the star to “descend” from the giant branch
and establish itself in the so-called horizontal branch if its metallicity is low (or
red clump, if its metallicity is high). Here, the core produces carbon from helium,
while the fusion of hydrogen into helium continues in the surrounding spherical shell
(Fig. 4.20).
Fig. 4.20 Trajectories of a 1 M⊙ star in the ascending and descending giant branch, to establish
itself in the “helium Main Sequence”, as indicated. Exit from the horizontal branch or red clump
occurs for the same physical reasons as those discussed earlier for the Main Sequence, now applied
to the helium fusion reaction
In much the same way as the star’s core reached the Schoenberg–Chandrasekhar
limit to leave the MS, there is an analogous limit for the carbon-rich inert core.
For the same physical reasons as already described, the star should now ascend to
the so-called asymptotic giant branch (AGB). This time thermal instability occurs
in the concentric shell, although it is not degenerate, and leads to thermal pulses
which end up expelling the envelope and producing the beautiful images that we
have of planetary nebulae (Fig. 4.21), leaving behind a core enriched in carbon and
also in oxygen: the capture of α particles by carbon is inevitable, and becomes more
important for higher masses, near the upper end of the range we are considering here,
viz., 8 M⊙.
This leftover C–O core, initially very hot, will cool down over several Gyr to
become a cold white dwarf, something we shall study in a following chapter. Finally,
we would like to point out that there is observational evidence of white dwarf pro-
duction by stars that had masses higher than 7.5 M⊙ on the Main Sequence. In the range
closest to this upper limit, the cores suffer more α captures and end up with a com-
position of C–O in approximately equal parts (and in a degenerate state, even before
the thermal pulses). Heavier compositions are possible for the most extreme WDs.
4.6 Stellar Evolution: High Mass Stars

We have discussed the evolution of low-mass stars, and we can now turn to the high-
mass ones, already defined as those that exceed roughly 8 M⊙. The dividing line
between the two groups is given by the mass where carbon cannot be fused because
the temperature in the central region does not reach the ignition value of around
8 × 10⁸ K. But it is important to emphasize that what we have here is a completely
new ingredient in stellar evolution, and one that effectively prevents this ignition:
the emission of neutrinos that is triggered at temperatures of this order, and cools
the region that could otherwise fuse very efficiently. Thus, in order to fuse carbon at
these temperatures, the reaction rate must exceed the core’s neutrino emission rate, a
condition that recurs for all the fusion reactions that follow. Moreover, thanks to the
neutrinos, the cores and envelopes now follow different paths, as though they can no
longer see each other. Technically, we speak of thermal decoupling. The thermally
decoupled core, supported by degenerate electrons, satisfies to a good approximation
the condition [6]
T³/ρ = constant , (4.40)
for core densities ρ ≈ 105 g cm−3 . Then, provided that the mass of the star is suffi-
cient, the “inert” carbon that has accumulated finally fuses according to the reaction
¹²C + ¹²C → ²⁴Mg* , (4.41)
where the excited magnesium nucleus ²⁴Mg* decays in many different ways
(remembering the case of the excited carbon proposed by Hoyle) which need to
be added together to obtain the final reaction rate, proportional to T²⁹. This cycle of
carbon fusion lasts substantially less than the triple-α, and when exhausted it makes
room for a mechanism of energy generation that is not exactly fusion, but rather a
rearrangement of “clusters”, referred to as photo-disintegration of Ne. In this pro-
cess, the neon nuclei are broken by photons according to 20 Ne + γ ↔ 16 O + α, and
the α particles are soon captured in reactions of the type 20 Ne + α → 24 Mg + γ . If
we look at the initial and final states, we can write effectively
“²⁰Ne + ²⁰Ne → ¹⁶O + ²⁴Mg” , (4.42)
where the quotation marks remind us that this is not really the fusion of two neon
nuclei. Shortly afterwards, for densities ρ > 5 × 10⁶ g cm⁻³ and T ∼ 2 × 10⁹ K,
oxygen can ignite, with an initial reaction
¹⁶O + ¹⁶O → ³²S* , (4.43)
and analogously to (4.41), the excited sulphur nucleus decays to a number of possible
final states whose integrated reaction rate is proportional to T³⁵. Very soon after that,
another photodisintegration reaction (and the last) takes place from ²⁸Si:
“²⁸Si + ²⁸Si → ⁵⁶Fe + γ” , (4.44)
where once again ²⁸Si denotes here a set of clusters with that mass number, but which
is far from constituting a true silicon nucleus. Similarly, ⁵⁶Fe is a way of writing a
series of elements with that mass number which immediately capture α particles and
decay to form a distribution known as the peak elements of the iron group. What
happens afterwards will be the subject of further study, in the collapse stage and the
subsequent explosion [6].
The result of the whole sequence of reactions in the star structure leads to the
onion structure shown in Fig. 4.22. It is important to understand that the time in
which the star is sustained by each of the cycles gets rapidly shorter due once again
to the vigorous expenditure of the available energy. These times are listed in Table 4.1
for a star of 20 M⊙. The survival of the star depends on the fuel reservoir that remains
in each case, but gets shorter and shorter, culminating in the supernova events to be
described in the following Chapter.
In summary, the high-mass stars follow quite different trajectories from those of
the low-mass stars in the HR diagram, as a consequence of some important physical
facts: initial energy production by the CNO cycle, convective cores, and tempera-
tures/densities high enough to continue using the product nuclei for energy produc-
tion. Some facts not discussed here, such as substantial mass loss, cause the tracks in
the HR diagram to shift to the left, and produce yellow or blue star explosions, instead
of the red supergiants, in the lower range up to about 15 M⊙ (Fig. 4.23). We will see later how,
in some well studied cases (the SN 1987A, for example), there is evidence for this
and other complex characteristics that are currently under review.
An important insight into the stellar Physics of the most massive stars can be
achieved by simultaneously imposing the Virial theorem and hydrostatic equilibrium,
which determine the existence of solutions for the simplest stellar models. Using
these concepts we can show how it is possible to characterize the end of the stellar
sequence by observing the behavior of the gas and radiation pressures, and we shall
see that the extracted value coincides, within uncertainties, with the maximum mass
directly inferred for actual stars.
As a starting point, the hydrostatic equilibrium equation (4.5) states physically
how gravity and pressure work against each other, showing that the long-lasting
Fig. 4.22 Onion structure of a massive star that has already ignited all possible cycles to maintain
itself and that has developed a core of “Fe” at the center. From [7]
Table 4.1 The different cycles of thermonuclear burning, together with the main products, ignition
temperatures, and lifetimes for a star of 20 M⊙

Fuel | Main product | Secondary products     | Temperature [10⁹ K] | Duration of the cycle [yr]
H    | He           | ¹⁴N                    | 0.02                | 2 × 10⁷
He   | C, O         | ¹⁸O, ²²Ne              | 0.2                 | 10⁶
C    | Ne, Mg       | Na                     | 0.8                 | 10³
Ne   | O, Mg        | Al, P                  | 1.5                 | 3
O    | Si, S        | Cl, Ar, K, Ca          | 2.0                 | 0.8
Si   | Fe           | Ti, V, Cr, Ni, Mn, Co  | 3.5                 | < 1 week
equilibrium solutions we call stars stem from an exact balance between the two. To
make the situation even more transparent, we can define a “gravitational pressure” (a
purely formal quantity) using the very general definition of pressure as the derivative
of energy with respect to volume, viz.,
PG = −∂EG/∂V ,
and writing the gravitational energy in the form
EG = −(3/5)(4π/3)^(1/3) G M²/V^(1/3) .
Fig. 4.23 Trajectories in the HR diagram for (non-rotating) stars of different masses and solar com-
position [8]. Note that due to the convective core the Schoenberg–Chandrasekhar limit is irrelevant
and there is no ascent to the giant branch or helium flash. The mass loss is substantial and produces
a return to the blue for higher-mass stars (around 18 M⊙ and beyond), until the moment of collapse and
explosion (Chap. 5), indicated by stars in the diagram for each case. Credit: J.H. Groh et al., Astron.
Astrophys. 558, A131 (2013), reproduced with permission © ESO
This procedure will be used again to address the question of the existence of neutron
stars (see Sect. 6.3.3).
The other fundamental ingredient is the Virial relation Ω + 2U = 0. To tell us
about the end of the high mass stellar solutions, we use the latter together with the
hydrostatic equilibrium equation written in the form
Pi = PG , where PG = C M^(2/3) ρ^(4/3) , C = (G/5)(4π/3)^(1/3) ,
obtained by taking the derivative of (6.36). To show this, we recognize that the
two physical ingredients in the pressure are the gas pressure and an increasingly
important radiation pressure Prad ∝ T 4 , that is Ptot = Pgas + Prad . When we consider
increasingly high star masses, the radiation pressure is initially small and can be
neglected. On the other hand, the temperature gradient of the star grows to the point
at which a convective adiabatic interior is achieved. In fact, at around 20 M⊙, detailed
calculations show that the convective structure occupies almost the whole star.
The problem of convective adiabatic equilibrium was addressed by Lord Kelvin
a century ago. His reasoning was that the trajectories of stellar gas elements must
at any point obey the relation P = Kρ^(5/3), and therefore, since the fluid must be in
equilibrium at all points inside, this mimics a polytropic relation of index 3/2, i.e., it
has the same functional form as the latter. The threshold of the adiabatic convective
condition means that the gradient reaches the adiabatic value
∇ad = (P/T)(dT/dP) ≡ ∂ ln T/∂ ln P . (4.45)
Thus, for a constant value of the exponent, we can integrate (4.45); for a monatomic
ideal gas (∇ad = 2/5) the result is P ∝ ρ^(5/3). In this way we see that the adiabatic
convective model due to Jeans is supported by a gas component that satisfies
Pgas = Kρ^(5/3). The existence of stellar models
is guaranteed as long as Ptot ≈ Pgas because the slopes of Ptot and “PG” ∝ ρ^(4/3)
are different, and Ptot = PG can be satisfied (Fig. 4.24). However, when the mass
grows, the radiation pressure grows as T⁴, which is very fast, and the slope of the
total pressure Ptot bends towards a slope of 4/3. The curves of Ptot and PG become
parallel and solutions (stars) cease to exist. This is expected to happen somewhere
in the range 100–200 M⊙ and shows exactly how radiation pressure destabilizes the
stellar structure when it dominates the total pressure.
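The crossover can be located explicitly: equating Prad = aT⁴/3 with Pgas = ρkB T/μmH gives the temperature above which radiation dominates, for an assumed central density and μ = 0.6 (both illustrative numbers):

```python
# Temperature at which P_rad = a*T**4/3 equals P_gas = rho*kB*T/(mu*mH).
a, kB, mH, mu = 7.566e-15, 1.381e-16, 1.673e-24, 0.6
rho = 10.0                                     # g/cm^3, illustrative
T_eq = (3.0 * rho * kB / (mu * mH * a)) ** (1.0 / 3.0)
print(f"P_rad = P_gas at T ~ {T_eq:.1e} K")    # ~8e7 K at this density
```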
As a corollary we believe that there is a maximum stellar mass determined by
this condition, and empirical evidence has been sought to confirm this. Figure 4.25
shows some of the highest reported values, which do indeed confirm the general idea
of there being an end to the stellar sequence. Accurate mass determinations become
difficult for the highest values, since the photosphere is not well located in wind
environments. However, it is fairly safe to say that there is a simple reason for the
maximum mass a star can have, even if its exact value remains somewhat uncertain.
Fig. 4.25 Some of the largest stellar masses determined today. R136a1 was at first believed to be a
single object of mass around 1000 M⊙, but is now known to be a triple system with a component of
320 M⊙. AG Car with mass around 100 M⊙ and WR102ka with mass 150 M⊙ are other examples
of very massive stars. The progenitor of SN 1987A, with an initial estimated mass of about 19 M⊙, is
also shown
References

1. S. Gregory, M. Zeilik, Introductory Astronomy and Astrophysics (Brooks Cole, London, 1997)
2. E. Böhm-Vitense, Introduction to Stellar Astrophysics, Vol. 1: Basic Stellar Observations and
Data (Cambridge University Press, Cambridge, 1989)
3. C.A. Bertulani, P. Danielewicz, Introduction to Nuclear Reactions (CRC Press, Boca Raton,
2004)
4. D. Ostlie, B.W. Carroll, An Introduction to Modern Stellar Astrophysics (Pearson, New York,
1995)
5. G. Marx, Life in the nuclear valley. Phys. Educ. 36, 375 (2001)
6. D. Clayton, Principles of Stellar Evolution and Nucleosynthesis (University of Chicago Press,
Chicago, 1984)
7. M. Spurio, Probes of Multimessenger Astrophysics: Charged Cosmic Rays, Neutrinos, γ-rays
and Gravitational Waves (Springer, Berlin, 2018)
8. J.H. Groh et al., Fundamental properties of core-collapse supernova and GRB progenitors:
predicting the look of massive stars before death. Astron. Astrophys. 558, A131 (2013)
Chapter 5
Supernovae
We review in this chapter the various supernova types, their basic Physics, and related issues.
The visual recognition of the first supernovae (now called “historical”) in the West
dates back to the Middle Ages. Previously, supernovae that exploded in the years 185
A.D. and 393 A.D. had been visible, but were not recorded in the chronicles of the
time. There is no clear evidence of these events in the West, although contemporary
research has associated some accounts of religious authors with these events. The
1006 A.D. supernova, well documented by Chinese astronomers, would have caused
bewilderment in Medieval Europe, then dominated by the Aristotelian dogma of the
immutability of the heavens, but the event was not clearly recorded. A few decades
later, the 1054 A.D. supernova (which today we know gave rise to the Crab Nebula)
was recorded and studied in both East and West. Finally, and largely due to the scientific
revolution, the supernovae observed and studied by Tycho (1572 A.D., Fig. 5.1) and
Kepler (1604 A.D.) were widely discussed and the first steps were taken toward an
understanding of this phenomenon [1], a task which is still ongoing, within a good
overall picture, as we shall see in a moment.
In search of the physical origin of these events, the modern pioneers were the
astronomers W. Baade and F. Zwicky. In a 1934 article [2] they realized that there
were very large differences between the observed energy scales of the “novas” and
the “super-novas”, as they called them in these original writings, until then included
in the same group. Baade and Zwicky were the authors of the first classification
proposal, based on the presence or absence of hydrogen lines. According to them,
the absence of hydrogen in the spectrum was a sign of an evolved star, possibly from
the so-called population II (old), while if hydrogen was present, this could imply a
population I star (young). Later it became clear that some “type I” events belonged
to the young disk population, even though the energy scale was very similar in both
cases (around 1051 erg). This forced a refinement of the classification. Soon, more
astrophysicists built models that had as protagonists a massive white dwarf in a
binary system (type Ia), or a massive star collapse, but where also He or Si could be
Fig. 5.1 Left: Tycho Brahe’s original notes, recording the evolution of the observed brightness
of the supernova in 1572 A.D., identified by the arrow at the top of the image. Right: False color
mosaic image of the Tycho remnant in X-ray (Chandra) and infrared (Spitzer), showing the different
chemical elements synthesized. It is now believed that the event was type Ia, of thermonuclear origin
(see text). Credit: X-ray NASA/CXC/SAO, infrared NASA/JPL-Caltech, optical MPIA, Calar Alto,
O. Krause et al.
Fig. 5.2 Supernova classification according to the presence or absence of lines in the spectra (left).
Envelope loss in the pre-supernova phases and the binary nature of the system are supposed to cause
massive stars to produce type I supernovae, although in fact these also correspond to gravitational
collapses. On the right, three spectra representing types Ia, Ic, and Ib from top to bottom, indicating
some important lines. Credit: Swinburne University of Technology
absent for evolutionary reasons (called types Ib and Ic, Fig. 5.2), or where the whole
envelope had to be ejected, i.e., with a massive presence of hydrogen (these were
eventually called type II).
A number of studies conducted throughout the 20th century have shown that
the explosion rates differ in different kinds of galaxy. For example, SNII are rarely
observed in elliptical galaxies, which is thought to be due to the fact that these galaxies
have no substantial stellar formation and contain few young stars. We can define
1 SNu (an arbitrary unit) as the number of supernovae observed per century and per
10¹⁰ L⊙. We could then obtain the total rate in the Milky Way by simply multiplying
the SNu observed in similar galaxies (those of Sb type) and noting that the Milky
The physical focus of collapse is on the behavior of the “Fe” core, where the
quotes serve as a reminder that a variety of nuclides with a mass number close
to 56 are included. As we have seen previously (Fig. 4.6), the binding energy is
maximum near this value. The production of the “Fe” core must end the possible
fusion reactions, since it is impossible to obtain fusion energy by fusing the elements
of the iron peak. The growth of the core mass also has an absolute limit: the source
of its pressure is degenerate electrons, which means that it can only grow to a value
known as the Chandrasekhar mass. Without considering Coulomb, relativistic, and
finite temperature corrections (the latter are quite important), this is given by
MCh = 1.46 (Ye/0.5)^2 M⊙ . (5.1)
Near this maximum, the central density and temperature are ρc ∼ 5 × 10^9 g cm^-3 and Tc ∼ 7 × 10^9 K ≈ 0.7 MeV. The stability of the core is possible until it reaches
this maximum. When this happens, two physical effects conspire to destabilize it.
The first is that the density has increased so much that electrons are captured by the
“Fe” nuclei in reactions of the type

e^- + (Z, A) → (Z − 1, A) + νe , (5.2)
that is, the number of particles sustaining the pressure decreases, so the pressure
decreases. The second effect is that the thermal contribution of the pressure also
decreases because energy is being used to break the “Fe” nuclei (photodisintegration)
in the reaction
^56Fe ←→ 13α + 4n , (5.3)
an effect that acts in the same direction, further decreasing the total pressure. As a consequence, the core collapses, the density increases, and this accelerates the e^- captures in an irreversible process [3].
When the density reaches values about 100 times higher than the initial one, a
unique phenomenon in the contemporary Universe occurs in the collapsing core,
related to the escape of neutrinos. In the early stages of collapse, neutrinos were able
to escape unimpeded, taking energy away from the central region. But despite their very small cross-section, of the order of 10^-44 cm^2, more than 20 orders of magnitude smaller than that of an electron, for example, their trajectories will be affected
by the increase in density. In fact, reactions of the type
ν + ν̄ → e+ + e− , ν + A → ν + A , (5.4)
occur more frequently and cause the neutrino mean free path to decrease to the
order of the radius of the collapsing core R. Numerically, the condition n × σ ×
R ≈ 1 yields, for R ≈ 10 km and the neutrino cross-section σ, a density ρT ≈ 4 × 10^11 g cm^-3 above which neutrinos are retained in the collapsing core, i.e., they go
from the free escape regime to the diffusive regime. This value is called the trapping
density. Trapping causes the total number of electrons + neutrinos per baryon (YL ) to
remain constant at the initial numerical value of YL ≈ 0.37. There is no more energy
flowing out of the core and the process continues adiabatically from the trapping
point onwards.
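As a quick consistency check, the trapping condition n × σ × R ≈ 1 can be evaluated numerically. In the sketch below, the effective cross-section (of order 10^-40 cm^2, typical of coherent scattering of ∼10 MeV neutrinos off heavy nuclei) and the 10 km core radius are illustrative assumptions:

```python
# Order-of-magnitude check of the neutrino trapping density, n * sigma * R ~ 1.
# SIGMA_NU is an assumed effective cross-section for ~10 MeV neutrinos
# scattering coherently off "Fe"-like nuclei (illustrative value).
M_U = 1.66e-24       # atomic mass unit [g]
A = 56               # mass number of the "Fe" nuclei
SIGMA_NU = 1.0e-40   # effective neutrino-nucleus cross-section [cm^2] (assumed)
R = 1.0e6            # radius of the collapsing core [cm] (10 km)

# Setting n_A * sigma * R = 1 with the number density of nuclei n_A = rho/(A*m_u):
rho_trap = A * M_U / (SIGMA_NU * R)
print(f"trapping density ~ {rho_trap:.1e} g/cm^3")
```

The result, ∼10^12 g cm^-3, agrees with the quoted ρT ≈ 4 × 10^11 g cm^-3 to within the order-of-magnitude accuracy of the estimate.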
Studies of the collapsing core agree that the (Newtonian) equation of motion
admits homologous solutions, in which the speed of the material elements is propor-
tional to the radius at which they are found. This behavior is observed in simulations.
If we call α(t) the constant of proportionality, a function of time, we have
u/r = α̇/α = constant . (5.5)
But of course the whole core cannot be in this homologous regime: as we consider
matter at greater distances from the center by increasing the r coordinate, the inward
speed of the matter u increases, and at some point it must satisfy
u + cs = 0 , (5.6)
where cs is the speed of sound in the collapsing matter—at this point, an appreciable
fraction of the speed of light. This means that matter in the inner region can maintain
causal communication and maintain homology, but in the outer regions this is not
possible. The point where the condition (5.6) is satisfied is called the sonic point,
illustrated in Fig. 5.3.
From these considerations we see that, when the matter in the outer core continues
to fall as a result of the strong gravity, a shock is formed at the sonic point (not at
the center!) which encloses a mass of about 0.6 M⊙. This interior core becomes very hard when the saturation density ρ0 ∼ 2.7 × 10^14 g cm^-3 is reached, which breaks up all nuclear structure. This means that the matter falling onto it from above bounces when it strikes the edge of the inner core (Fig. 5.4), thus ending the homology.
Fig. 5.3 The sonic point separates the collapsing core into an inner region (R < Rs) and an outer region (R > Rs). The homology relation can only be maintained in the inner region, while in the outer region the matter remains in almost free fall. This causal separation is important for the final destiny of the core
Fig. 5.4 Left: Analogy to visualize the effect of the sudden stiffening of matter in the inner core. The falling material forms a shock wave at the relevant “sonic point”, similar to the reversal of the momentum of a particle that collides with a wall. Right: Shock wave calculated for a model of 15 M⊙ with a core of mass about 1.5 M⊙. The shock wave starts with positive velocity (upper curve) and advances towards the surface, but loses intensity rapidly and reverses itself after about 10–20 ms (lower curves with negative velocity) due to the energy losses discussed in the text
Although no one doubts this sequence of events today, the most obvious expected
outcome, namely, that the shock manages to eject the envelope and that this is the
cause of the explosion, is not really what happens. As shown in Fig. 5.4 (right), the
shock wave loses intensity as it travels outwards, mainly because the falling matter
it meets on the way is still composed of nuclei: the outer core is no denser than ρ0 .
The point is that dissociating these infalling nuclei costs the shock 1.8 × 10^51 erg for every 0.1 M⊙ crossed [the process described in (5.3)]. Since the simplest energy balance shows that the initial energy of the shock corresponds to the binding energy of the inner core (IC), we have a paradoxical situation: the initial energy is more than enough to explode the star, but it is wasted in breaking the nuclei of the outer core, whose mass exceeds 0.6 M⊙ for almost any value of the progenitor mass.
After a series of studies that finally demonstrated the unfeasibility of the shock
as a successful mechanism for the explosion, attention was focused on the fate of
the core, now becoming a proto-neutron star, but at serious risk of collapsing to a
black hole unless something happens soon after the shock stops. The key to this
further evolution seems to be the energy that was released by compactification, and
residing (because of the very high temperatures) in a sea of neutrinos, produced much
more efficiently than photons under such conditions. A total of around 10^53 erg is retained in the whole collapsed core, and it leaks away with difficulty, as the condition n × σ × R > 1 is still satisfied: R is now quasi-stationary at 20–30 km, and the cross-sections of the neutrinos with matter grow strongly with the temperature, here T ∼ 1 MeV ∼ 10^10 K. The neutrinos diffuse out of the proto-neutron
star on a diffusive timescale of about 1 s. Under the above conditions, the situation
is analogous to the diffusion of photons in the solar interior, a regime that ends
when they reach the optical depth surface τ < 1, as described in Chap. 4. Thus,
neutrinos are absorbed, re-emitted, and scattered many times until they finally reach
the neutrinosphere, defined analogously to the photosphere, as if they were the Sun’s
photons. In the simplest hypothesis, the proto-NS emits like a black body, but for
neutrinos, with a luminosity
Lν = (7/4) (4π R^2) σ Tν^4 erg/s , (5.8)
a situation illustrated in Fig. 5.5. The factor 7/4 stems from the fermionic character
of the leaking neutrinos and expresses the difference with an ordinary black body.
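To get a feeling for the numbers, (5.8) can be evaluated for illustrative proto-neutron star parameters; the 20 km neutrinosphere radius and the 4 MeV temperature below are assumptions made only for this sketch:

```python
import math

SIGMA_SB = 5.670e-5    # Stefan-Boltzmann constant [erg cm^-2 s^-1 K^-4]
MEV_TO_K = 1.16e10     # 1 MeV expressed in kelvin

R = 2.0e6              # neutrinosphere radius [cm] (20 km, assumed)
T_nu = 4.0 * MEV_TO_K  # neutrinosphere temperature (4 MeV, assumed)

# Fermionic black body, Eq. (5.8): L = (7/4) * 4*pi*R^2 * sigma * T^4
L_nu = (7.0 / 4.0) * 4.0 * math.pi * R**2 * SIGMA_SB * T_nu**4
print(f"L_nu ~ {L_nu:.1e} erg/s")
# A few times 1e52 erg/s sustained over the ~1 s diffusion timescale releases
# an energy of order 1e53 erg, the total quoted above for the collapsed core.
```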
Although the density of the material between the neutrinosphere and the stationary shock is much lower (≤ 10^13 g cm^-3; this region is sometimes called a quasi-vacuum precisely because of this sharp density difference), the rate of capture of the neutrino
flux is not negligible. Because of the processes
νe + n → p + e − , (5.9)
ν̄e + p → n + e+ , (5.10)
there is a region of gain (before the shock) and loss (after the shock). The effective
capture rate can be estimated as
Q̇ = F⟨σ Eν⟩ = (Lν/⟨Eν⟩) ⟨Eν^3⟩ , (5.11)

and thus depends on the average energy of the neutrinos ⟨Eν⟩, the total luminosity Lν, and the detailed form of the spectrum (through the average of the cube of the energy, ⟨Eν^3⟩). These captures do not “push” the shock by transferring energy, but instead
create conditions for it to expand again. This is why the mechanism is known as
the neutrino revival. This would appear to be the cause of the expansion at the base
of the envelope, with the explosion as its final outcome. We must insist that the momentum transferred to the shock by the captured neutrinos is very small, but its dynamical effect is to allow a hydrodynamic wind solution (i.e., one with finite pressure at infinity) [4]. Core-collapse supernovae could quite properly be called neutrino bombs.
In the last 400 years, the nearest supernova of this type was SN1987A, in the Large
Magellanic Cloud. This event is shown in Fig. 5.6. The supernova progenitor was
identified in pre-existing images, and had a mass of the order of 18–19 M⊙ according
to its position in the HR diagram. A complete reconstruction of its evolutionary
history suggests that the progenitor lived about 11 Myr, left the MS about 700 000 yr
ago, turning into a red supergiant soon afterwards, with a radius about three times
the orbit of the Earth around the Sun, exhausted its helium, and then lit its carbon
about 10 000 yr ago (when humankind was just taking up agriculture), burnt neon
from 1971 until 1983, oxygen from 1983 until February 1987, silicon for about 10
days in 1987, and finally exploded on 23 February of that year. All these periods
are subject to some uncertainty, but the basics are believed to be correct. The event
allowed us to begin the observation of neutrinos as a new discipline, and the details
that were reconstructed will be presented in Chap. 9.
We must stress that, although we believe that the shock revives as indicated,
the influence of “new” Physics (e.g., deconfinement of quarks subject to very high
pressure in the center) cannot yet be discarded [5]. In any case, nature does not care
much about this, and continues to make high-mass stars explode, as happened in
SN1987A and other frequently recorded collapse events. There are strong indications
that hydrodynamical instabilities in 3D are crucial for the success of the explosions.
Finally, it should be pointed out that the processes described here are essentially
the same as those that operate in the case of type Ib (without hydrogen) and type Ic
(without hydrogen or helium) supernovae. That is, the explosion mechanism is the
same even though the envelopes of the progenitors have been diminished or even
eliminated by the binary partner or stellar winds that are commonly observed in
high-mass stars. We will see later that there are concrete cases of supernovae that
reveal very high mass loss, and light curves indicative of these phenomena. A second
important issue is that there is evidence to suggest that collapsing stars at the lower
end of the mass range do not develop an “iron” core. The reason is that in the 8–10 M⊙
range, nuclear reactions cannot take the star beyond the formation of lighter cores,
composed of O, Mg, and Ne in degenerate conditions. Thus, when they reach their
corresponding Chandrasekhar mass, electron capture causes them to collapse with a practically invariant mass of about 1.38 M⊙. Throughout this collapse, the ignition of oxygen at T ∼ 2 × 10^9 K results in events that look like a thermonuclear supernova (see below), but with the formation of neutron stars with a low and fixed mass (around 1.25 M⊙), which results from the loss of around 10% of the original 1.38 M⊙ through the radiated neutrinos. Since there are many stars in this mass range,
there must be several examples in the samples we have. A number of papers have
suggested that the explosion that gave rise to the Crab pulsar in 1054 A.D. was of this type, with low luminosity and the production of the eponymous pulsar. As a corollary, a “peak” is expected in the neutron star mass distribution at 1.25 M⊙, so
establishing its presence should provide important support for these ideas.
Finally, not all core-collapse explosions have given rise to a pulsar. There are hot
sources (Central Compact Objects, or CCOs) in some of them which do not pulse,
and this is attributed to a combination of low magnetic field and low rotation rate.
The “classic” explanation that the pulsar beams may point away from us does not
seem to match the statistics. In other words, the neutron stars “should be there”, but
are not really detected.
5.3 Thermonuclear Supernovae

In the pioneering studies of Baade and Zwicky in the 1930s, all “super-novas”
belonged to the same class, although there was a difference that would become
important for the refined classification: the presence or absence of hydrogen in the
spectrum, as previously pointed out. As work progressed, it was observed that the
occurrence of supernovae without hydrogen (and without silicon, corresponding to
what we know as type Ia) pointed to an old population, not only because of the
above-mentioned absence of hydrogen (characteristic of an evolved progenitor), but
also because they were not, on average, located in the plane of the galaxies like
the young stars (so-called Population I stars). Thus, astronomers began to think about what kind of evolved progenitor (a Population II star) could suddenly release an enormous amount of energy (the roughly 10^51 erg observed) and what process would allow this release.
The analysis of possible energy sources led them to consider the uncontrolled fusion of carbon in a white dwarf as the most viable mechanism satisfying the given conditions. But still, there were (and there are) two scenarios for the fusion of the carbon: the white dwarf could ignite the carbon after long-term accretion from a “normal” companion, or ignition could also happen in binary systems, in the final stage when two white dwarfs merge and matter is compressed and heated. The first scenario
is described as single-degenerate and the second as double-degenerate, names that correspond to the number of (degenerate) white dwarfs involved. Figure 5.7 illustrates these two scenarios.
Fig. 5.7 The two possible scenarios that would allow carbon ignition in a white dwarf. On the left, the single-degenerate case, where a white dwarf accretes matter from a normal post-MS companion, and on the right, the double-degenerate case, where two white dwarfs end up merging after a long time, when their orbit decays. From [6]
Fig. 5.8 Combustion front δ, supposedly locally flat and of small thickness compared with the dimensions of the system. Region 1 contains the fuel (in this case, carbon) and region 2 the fusion products. u is the speed of propagation of the front in the unburnt medium of region 1
Studies of carbon ignition in the single-degenerate case still present several uncer-
tainties. It is a common assumption that a single white dwarf will not be able to reach
these conditions unless it is near the Chandrasekhar limit. The ignition temperature
for densities around 10^9 g cm^-3 (appropriate for the center under these conditions) is above 5 × 10^8 K. Since oxygen has an ignition temperature about five times higher,
only the carbon component of the white dwarf can fuse under these conditions (see
Chap. 6). Once the carbon is ignited, the combustion front should spread more or less
spherically to the surface. This type of phenomenon can be described by considering
the conservation of physical quantities, as we will see below (Fig. 5.8).
To formulate the problem, we can write down the equations of conservation of mass flux, momentum, and energy through the surface δ, with the result [7]

ρ1 v1x = ρ2 v2x ≡ j , (5.12)

P1 + ρ1 v1x^2 = P2 + ρ2 v2x^2 , (5.13)

ω1 + v1x^2/2 = ω2 + v2x^2/2 , (5.14)

where ω = ε + P/ρ is the enthalpy function per unit mass (ε being the specific internal energy), v1x and v2x are the normal speeds at the surface, and P and ρ are the pressures and densities of the two fluids.
Equations (5.12)–(5.14) are written in the system moving with the front δ in the
Newtonian approach, and since they merely express conservation laws, they must be
satisfied by any form of combustion.
Using the first two equations above and defining the specific volume V = 1/ρ, we immediately arrive at

j^2 = (P2 − P1)/(V1 − V2) , (5.15)
which relates the mass flow to the quantities on each side of the front δ. Finally,
substituting in (5.12)–(5.14) and arranging, we have
ω1 − ω2 + (1/2)(P2 − P1)(V1 + V2) = 0 , (5.16)
known as the Chapman–Jouguet (CJ) adiabat. It differs from Hugoniot’s adiabat—
which describes a discontinuity without combustion and without chemical change of
the gas, i.e., a shock—because the enthalpy function ω is different on each side. If the gas did not change state we would have ω1 = ω2. However, for combustion ω1 ≠ ω2, because the fluid behind the front δ, in the burnt region 2, has in general a different enthalpy function. The CJ adiabats are curves specified by two parameters: if we know the
state of the gas before burning, specified by P1 , V1 , we can determine P2 , V2 given
the mass flux j (Fig. 5.9).
As the mass flux j must be real and nonzero, (5.15) requires either P2 > P1 and V1 > V2, or alternatively P2 < P1 and V1 < V2. The first solution leads to the upper branch of the CJ adiabat
and the combustions are called detonations; the second possibility is achieved on
the lower (quasi-horizontal) branch, between points A and B in Fig. 5.9, where the
combustions are called deflagrations (ordinary combustions). In the intermediate
gray region the mass flux is imaginary, and therefore these solutions do not exist in
nature.
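A minimal numerical sketch makes the two branches explicit. Assuming an ideal gas with the same adiabatic index on both sides of the front and a heat release q per unit mass stored in the fuel (all quantities dimensionless and illustrative), (5.16) can be solved for P2 at a given V2, and (5.15) then tells us which branch, if any, the solution belongs to:

```python
# Chapman-Jouguet adiabat, Eq. (5.16), for an ideal gas with adiabatic index
# GAMMA on both sides and heat release q per unit mass in the fuel (region 1).
GAMMA = 5.0 / 3.0
G = GAMMA / (GAMMA - 1.0)        # enthalpy coefficient: w = G*P*V (+ q in fuel)

def p2_on_adiabat(P1, V1, V2, q):
    """Solve w1 - w2 + 0.5*(P2 - P1)*(V1 + V2) = 0 for P2 at a given V2."""
    return (G * P1 * V1 + q - 0.5 * P1 * (V1 + V2)) / (G * V2 - 0.5 * (V1 + V2))

P1, V1, q = 1.0, 1.0, 5.0        # dimensionless upstream state and heat release
for V2 in (0.5, 0.7, 2.0, 4.0):
    P2 = p2_on_adiabat(P1, V1, V2, q)
    j2 = (P2 - P1) / (V1 - V2)   # Eq. (5.15): j^2 < 0 means no physical front
    if j2 < 0.0:
        branch = "forbidden (gray region)"
    elif P2 > P1:
        branch = "detonation (upper branch)"
    else:
        branch = "deflagration (lower branch)"
    print(f"V2 = {V2:.1f}: P2 = {P2:5.2f}, j^2 = {j2:+6.2f} -> {branch}")
```

Intermediate volumes yield j^2 < 0, reproducing the forbidden region between points A and B of Fig. 5.9.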
The question is now: what is the mode of combustion in type Ia supernovae?
Physically, the two paths are quite different. Ordinary deflagrations happen when the
heat released in the reaction zone inside the front δ diffuses and helps burn the fuel
ahead (think, for example, of setting fire to a sheet of paper). The basic quantity that
determines the spreading is the thermometric conductivity, given by χ = κth /CP ρ,
where κth is the thermal conductivity and CP is the heat capacity at constant pressure.
Using this, we can estimate the width of the flame by δ = (χ τ )1/2 , where τ is the
characteristic reaction time and the speed is of the order of u def ∼ δ/τ ≈ (χ /τ )1/2 .
Although the latter is slow in everyday life (again, think of the sheet of paper), it can be very fast in a white dwarf, though always subsonic. In contrast, the detonations
of the upper branch of Fig. 5.9 are ultimately mediated by a shock which “burns”
the particles behind it. The shocks are always supersonic in region 1 ahead of the
flame—think what happens when you are standing waiting for the subway, and a blast
of wind suddenly hits you: this is the shock produced by the train when it travels
through the tunnel; people standing at the platform are not “warned” because it is
supersonic, and it arrives even before the mechanical sound of the train. Thus, the
medium in region 1 cannot expand before the flame reaches it, and the combustion
is always total, never intermediate or partial [6, 8].
The key to determining which mode occurs in a star lies once again in the obser-
vations. SNIa events have light curves compatible with the production (synthesis) of at least 0.6 M⊙ of nickel, but also 0.2–0.3 M⊙ of Si, Ar, Ca, and S, i.e., elements of intermediate mass that are “partially burned ashes”, as happens with embers that did not completely burn in a barbecue. Less common elements such as ^54Fe and ^58Ni, which would spoil the observed nucleosynthesis, cannot be produced in any quantity, since their observed abundances are very small.
Detonations take place in conditions of nuclear statistical equilibrium, where the number of protons and neutrons does not change in the reactions (there is no time for this); only when this equilibrium is violated can combustion leave intermediate elements as a result. We therefore deduce that there must be at least one stage of the combustion that proceeds in the deflagration mode. Thus, the propagating front can
“warn” the matter ahead (since it is subsonic). The front therefore expands because
of the waves that travel ahead, and a partial combustion is possible. But on the other
hand, there is the need to have abundant nickel, easily produced by detonations. This
gives rise to the idea that the combustions may begin as deflagrations, then “jump” to
the detonation branch when the instabilities of the flame deform it and increase the
combustion rate. These instabilities can be seen in any film where the protagonists
explode gasoline or something similar: the calculation of the front (using Euler’s
equations with reaction terms) is shown in Fig. 5.10 for a thermonuclear supernova.
The mechanism that transforms a deflagration into a detonation is not well identified in this problem, but these changes have already been observed and studied in laboratories. In supernovae, this model is called the deflagration-to-detonation transition (DDT), illustrated in Fig. 5.11.
Fig. 5.10 Calculation showing the action of instabilities (Landau–Darrieus, Rayleigh–Taylor, and others) on the carbon combustion front at four different moments: 0.3 s, 0.9 s, 1.2 s, and 1.4 s after the beginning of the combustion in the center of the star, on the left [9]. Credit: Alan Calder and Dean Townsley
Fig. 5.11 DDT model. At the top of the figure, the white dwarf lights carbon and the front spreads
like a deflagration until the instabilities force it to “jump” to the detonation branch. Initially, inter-
mediate mass elements are produced, followed by a large amount of nickel [10]
Fig. 5.12 Light curve of a SNIa with the main characteristics that determine its shape identified
explicitly. Without energy sources, the temporal decay would be very fast and incompatible with
observations [8]
Fig. 5.13 Left: Search for the surviving companion star [11]. Right: The Kepler remnant (still
disputed as an SNIa explosion). No candidates have been identified within the circles marking the
central region of the explosion [12]. © AAS. Reproduced with permission. Credit: P. Ruiz-Lapuente
The moment when the diffusion time of the photons through the expanding envelope equals the time scale of the expansion, and the subsequent escape of the gamma photons from the remnant, are indicated in Fig. 5.12.
One last point worth highlighting is the series of recent attempts to directly deter-
mine whether supernovae correspond to the single-degenerate or double-degenerate
scenario. There are some ways to investigate this possibility by observing the histor-
ical type Ia supernova regions for which there is no doubt as to their nature (SN1006
and Tycho, see Table 5.1). The simplest way is to look, close to the center of the
explosion, for some star altered by the shock’s passage. Such a star should have been partially swept by the ejecta and, if found, would be suspected of being the companion that transferred mass until the carbon ignited (Fig. 5.7 left). The search in Tycho’s remnant revealed only one candidate, possibly a halo star, unrelated to the explosion region and located there by chance (a background object). In the Kepler remnant no candidate was found (Fig. 5.13). Thus we could
conclude that there is no direct evidence of the single-degenerate scenario and that
the explosions must have been produced by a merging of two white dwarfs. However,
analysis of a third remnant, 3C 397, from some 2000 years ago, by the Suzaku satellite showed that it must have been produced by the explosion of a single white dwarf. This follows from the measured amounts of nickel, magnesium, iron, and chromium, compared with model predictions that are quite different in the two cases, which excludes a binary system of two white dwarfs as the progenitor of 3C 397. Thus emerges the idea that the two scenarios could both produce SNIa, although most of them would be due to WD–WD binaries.
5.4 Type Ia Supernovae and Cosmology

The fact that supernovae reach large negative absolute magnitudes (around −19 at maximum) indicates that, if a standardization of the light curve can be found, they would be excellent “rulers” for measuring very large distances, since they are seen up
to cosmological scales. This observation is the basis of the work that sought to
systematize the observations and convert supernovae into tools for Cosmology. For
example, with a large telescope that can detect a visual magnitude around 25 it would
be possible to see a supernova up to z ∼ 3, that is, whose light was emitted when the
Universe was just a quarter of its current scale. The record belongs to the detection
of a supernova that exploded when the Universe was less than 3 Gyr old [14].
If we collect the light curves of type Ia supernovae, identified as such by their
spectra, there is a variety that seems indisputable (Fig. 5.14). But since they are
cosmological sources, we must apply the corrections of the cosmological model to both the observed duration and the luminosity. When this procedure is carried
out, the curves converge to a universal form (Fig. 5.14), and astrophysicists speak
of a calibration (known as the Hamuy–Phillips calibration, [15]). The result is that
the maximum luminosity is only a function of the temporal width: the wider the
curve, the brighter the object.
Fig. 5.14 Light curves of several SNIa for different redshifts (left), and the shape obtained after applying the corrections (right). This procedure is known as the Hamuy–Phillips calibration. Credit: Calan–Tololo SNIa Survey
This is interpreted as showing the universality of the light curves of the explosions: all differences are due to Cosmology. Thus, if we observe
more distant supernovae, we can know what they are like intrinsically (by applying
the calibration) and with it test the cosmological model. Supernovae thus became
“standard candles” for Cosmology.
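As an illustration of how the standardization is used, the sketch below converts an apparent peak magnitude into a luminosity distance through a linear width–luminosity (Phillips-type) relation. The coefficients and input values are illustrative assumptions, not those of any published calibration:

```python
# Toy SNIa standardization: absolute magnitude from the decline rate dm15
# via M = A_CAL + B_CAL*(dm15 - 1.1), then distance from the modulus m - M.
# A_CAL and B_CAL are illustrative stand-ins for a real calibration.
A_CAL, B_CAL = -19.3, 0.78

def luminosity_distance_mpc(m_obs, dm15):
    M_abs = A_CAL + B_CAL * (dm15 - 1.1)
    mu = m_obs - M_abs                  # distance modulus, mu = 5 log10(D/10 pc)
    return 10.0 ** (mu / 5.0 - 5.0)     # distance in Mpc

print(f"D ~ {luminosity_distance_mpc(24.0, 1.1):.0f} Mpc")  # ~4600 Mpc
```

Comparing such distances with those predicted by a decelerating cosmological model is, in essence, the test that revealed the acceleration discussed below.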
It was precisely the implementation of these ideas that led two independent
teams—the Supernova Cosmology Project and the High-Z Supernova Search Team—
to announce that the data favored a model in which the expansion of the Universe is
accelerating. The reasoning is easy to follow: the supernova data “calibrated” with
the correction of the “standard” cosmological model used until then show that the
supernovae are systematically fainter for greater distances. Thus, the simplest solu-
tion is to think that the Universe has expanded faster and dragged supernovae to
greater distances. If the model of the Universe does not have this extra expansion,
supernovae cannot be placed in it (Fig. 5.15).
Note that so far we have not questioned the cause of the acceleration, which is
a separate problem. It has only been stated that the expectation of the magnitudes
of the SNIa according to the decelerated models does not match the observed data.
For an everyday analogy, we can imagine that we have identical lanterns and several
carriers at different distances: by measuring the flux and knowing how much they
emit intrinsically, we can calculate the distances for each one. This is exactly what
the research teams did with the measurements announced some 20 years ago, and
which are even firmer now that a much larger and better studied sample has become
available.
To end this discussion, we know that the simplest cosmological hypothesis to
explain this acceleration is to introduce a cosmological constant that acts dynami-
cally as a repulsion imposing a positive acceleration on the Universe. But other possi-
bilities are being considered, for example, modified gravitation and extra dimensions,
among the best known. There have also been attempts to challenge the interpretation
of the observations, suggesting that there could be an extinction of the magnitudes
over such distances. This is a very complicated hypothesis because the known extinc-
tion always depends on the wavelength, while the measurements indicate that it is
universal (independent of λ) in the case of SNIa (one of the teams made this working
hypothesis public at the time of the announcement).
Finally, we should point out that thermonuclear supernova modeling now intro-
duces a somewhat worrying element into the discussion: if the two scenarios (single-
degenerate and double-degenerate) can produce events, and furthermore WD–WD
binaries are the majority, why would the light curves be identical? Should we not expect a large dispersion between them, visible in the calibration on the right of Fig. 5.14 even after the cosmological correction has been applied? Much theoretical and observa-
tional work will be required to answer this question.
5.5 Superluminous Supernovae

The recognition of the existence of supernovae with energies much higher than the “standard” value of around 10^51 erg, and even higher than the “hypernovae” (type Ic supernovae) that reach up to 10^52 erg, dates back to the first decade of the 21st
century [16]. Until that moment, the classification already discussed worked very
well, with the exception of having to include SNIc with broad lines, associated with the occurrence of a gamma-ray burst (see Chap. 11). But progenitors of mass greater than 10 M⊙ seemed to explode according to the given classification, while the lighter
ones were supposed to lead to the electron-capture events mentioned at the end of
Sect. 5.2, and there were no major problems on the horizon, at least from this point
of view (Fig. 5.16).
However, it became clear that some events with “anomalous” light curves should
correspond to explosion energies of up to 10^53 erg, without it being clear how this
happened. These were referred to as superluminous supernovae (SLSN), defined
empirically as ones where the absolute optical magnitude is less than −21, i.e., more
luminous, as indicated by their more negative magnitude. We will soon examine our
present theoretical understanding of these events.
There are three basic models in the literature for SLSN explosions: pair instability supernovae, the collision of the ejecta with material in the circumstellar medium, and the injection of energy by a magnetar—a highly magnetized neutron star, discussed in Chap. 6. We will give a brief description of each of them below.
Pair instability is an expected phenomenon in the extreme circumstances present
in very high mass stars. Motivated by studies of stars with masses of several hundred
solar masses at very low metallicity (called population III by astronomers), possibly
associated with the first star generations when the first structures formed in the Uni-
verse, there was interest in understanding their evolution and corresponding nucle-
osynthesis. At the most advanced stage, a star with a helium core and a mass in the range 80–100 M⊙ or above can convert photon energy into e^+ e^- pairs, and taking into account that radiation makes a substantial contribution to the total pressure, this can cause the star to collapse. Throughout the collapse, there is explosive fusion of carbon and oxygen, and the release of energy is more than enough to unbind the star, producing an enormous amount of ^56Ni. The observations of some SLSN require at
least 10 M⊙ of nickel, a result that seems possible for “zero metallicity” stars between 150 M⊙ and 250 M⊙ [19]. It could also happen at higher metallicities if the mass loss is suppressed, for example, by high magnetic fields. A variant of this scenario is that, depending on the mass, pair instability can lead to mass ejection pulses of around 10 M⊙, whence one would observe a supernova that lasts several years rather than exploding in one go. Supernovae inside planetary nebulas (SNIPs) have been associated with the “slow decay” events of the SLSN Ic light curves in Fig. 5.17.
Fig. 5.16 Spectra of collapse events (lower part) and thermonuclear supernovae (upper part), showing the structure of the progenitors (right). Electron-capture events do not appear in this figure [17]. Credit: M. Modjaz
Fig. 5.17 Light curves of several SLSN. The curves decay over a few months or even longer, in
some cases even of order 1 yr [18]. Credit: M. Fraser
In the magnetar model, the rotational energy injected by the newborn neutron star depends on the initial period P0 (of the order of 1 ms, otherwise the magnetic field B could not grow enough) and is released on the injection timescale

τ0 = 0.6 (10^14 G/B)^2 (P0/1 ms)^2 d .
is an injection timescale. We see that, if the field grows, there will be enough injected
energy to explain the light curves. A more detailed comparison is shown in Fig. 5.18.
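The energy budget behind this statement is easy to check: taking a fiducial neutron star moment of inertia I = 10^45 g cm^2 (an assumption of this sketch) together with P0 = 1 ms and B = 10^14 G,

```python
import math

# Rotational energy and injection timescale of a newborn magnetar.
I_NS = 1.0e45     # moment of inertia [g cm^2] (fiducial value, assumed)
P0 = 1.0e-3       # initial spin period [s]
B = 1.0e14        # dipole magnetic field [G]

E_rot = 0.5 * I_NS * (2.0 * math.pi / P0) ** 2
tau0_days = 0.6 * (1.0e14 / B) ** 2 * (P0 / 1.0e-3) ** 2
print(f"E_rot ~ {E_rot:.1e} erg, tau0 ~ {tau0_days:.1f} d")
# ~2e52 erg released over ~1 day can easily dominate the optical light curve.
```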
5.6 Expansion of Supernova Remnants in the Interstellar Medium

In a first approximation, all supernovae are essentially point explosions that later expand into the surrounding interstellar medium (ISM), regardless of the specific type. We can thus formulate a general description of the expansion, provided that some simplifications are admitted. One of these hypotheses is that the density of the interstellar medium can be described as a power law of the distance, ρISM = ρ0 r^-k, which is quite realistic in cases, for example, of mass loss in the pre-supernova stages, and includes the case of ρISM = ρ0 = constant, when k = 0. Thus, the total mass of the remnant MSNR grows as it sweeps up the ISM according to the formula
MSNR = Mej + ∫_0^{RSNR} 4π r^2 ρISM dr = Mej + [4π/(3 − k)] ρ0 RSNR^{3−k} , (5.18)
for k < 3. From (5.18) we see immediately that, when the remnant reaches a radius
of approximately 1 pc, the swept mass is comparable to the mass ejected from the
SN, denoted Mej, almost independently of the value of k. Observations indicate, on the other hand, the injection of a “standard” energy from the explosion of Eexp = 10^51 erg, which will be our reference below. In the early stages of the explosion, MSNR ≈ Mej, the energy losses of the remnant due to radiation are small (i.e., the expansion is essentially adiabatic), and the solution of the equations of motion in this phase of free expansion is simply

ṘSNR = (2Eexp/Mej)^{1/2} ≡ vSNR ,   RSNR = (2Eexp/Mej)^{1/2} t .
As time goes by, we have already said that the swept mass grows until it equals
Mej and the approximations used so far are no longer valid. The radius and instant at which MSNR ≈ Mej occur are

RSNR = 4.8 (Mej/10 M⊙)^{1/3} (nISM/1 cm^-3)^{-1/3} pc , (5.21)

t = 1.4 × 10^3 (Mej/10 M⊙)^{5/6} (nISM/1 cm^-3)^{-1/3} (Eexp/10^51 erg)^{-1/2} yr , (5.22)
hence between 1 and 2 millennia after the explosion. In this new stage after free
expansion, the internal energy of the gas increases at the expense of the kinetic
energy of the SNR, and one should thus consider the internal pressure
Pint = (γint − 1) Uint (4π RSNR^3/3)^{-1} ,
where Uint is the internal energy, the difference between Eexp and the kinetic energy MSNR vSNR^2/2 of the SNR. One usually makes the thin shell approximation here, assuming that the whole mass is concentrated in one thin shell, and assumes strong shock conditions, where the “density jump” across the shock is maximum, i.e., ρSNR = 4ρISM [7]. The equations of motion
d(Mv)/dt = 4π R^2 Pint , (5.23)

Uint = E − (1/2)Mv^2 + L0 (τ0^{-1} + t^{-1})^{-1} , (5.24)
can be solved. The last term of (5.24) represents an injection of energy from the inside, for example, by a rapidly rotating and strongly magnetized neutron star, as we saw in the last section. Without this last term, the system can be solved by proposing
a power law for the radius of the remnant, and then determining the exponent. These
solutions were already obtained in the classical works of Sedov and Taylor in the
1950s [23]. In this Sedov–Taylor phase, the radius increases as
RSNR = (50 Eexp/9π ρISM)^{1/5} t^{2/5} .
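The numbers in (5.21) and (5.22), and the subsequent Sedov–Taylor growth, can be reproduced with a few lines of Python, using the same fiducial values as in the text (Mej = 10 M⊙, nISM = 1 cm^-3, Eexp = 10^51 erg):

```python
import math

M_SUN, PC, YR = 1.989e33, 3.086e18, 3.156e7

E_exp = 1.0e51             # "standard" explosion energy [erg]
M_ej = 10.0 * M_SUN        # ejected mass (fiducial value from the text)
rho_ism = 1.0 * 1.67e-24   # ISM mass density for n_ISM = 1 cm^-3 [g/cm^3]

# Free-expansion speed, and the radius/time where swept mass equals M_ej (k=0):
v_free = math.sqrt(2.0 * E_exp / M_ej)
R_sw = (3.0 * M_ej / (4.0 * math.pi * rho_ism)) ** (1.0 / 3.0)
t_sw = R_sw / v_free

# Sedov-Taylor radius from the thin-shell solution derived above:
def r_sedov(t):
    return (50.0 * E_exp / (9.0 * math.pi * rho_ism)) ** 0.2 * t ** 0.4

print(f"v_free ~ {v_free/1e5:.0f} km/s")
print(f"R_sw ~ {R_sw/PC:.1f} pc, t_sw ~ {t_sw/YR:.0f} yr")   # cf. (5.21)-(5.22)
print(f"R(10^4 yr) ~ {r_sedov(1.0e4*YR)/PC:.0f} pc")
```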
Physically, we find that the thin shell decelerates relative to the gas that comes just behind it, generating an internal shock that propagates inwards, called the reverse shock (Fig. 5.19). The reverse shock heats up the interior gas, and this energy ends up being radiated away.
Finally, after the snow-plow stage, which lasts about 10^6 yr, the shell breaks into
fragments that mix with the ISM gas, which has an observed root mean square (rms)
speed of about 20 km/s. Clearly the remnant ceases to be visible here, and as we have
pointed out before, it is even possible that this disappearance happens long before
this kinematic fusion. Almost all the remnants detected so far (more than 200) are
in the S–T stage, although those that gave rise to strongly magnetized neutron stars
(magnetars) may receive additional energy and appear to be older than they really
are because of the last term of (5.24). Figure 5.20 presents a scheme of successive
exponents in each phase of expansion that summarizes this discussion [24].
Fig. 5.20 Exponents of the supernova remnant evolution through the free expansion, Sedov–Taylor,
and snow-plow stages. The approximate values for the transition radius between these stages are
indicated on the vertical axis for the standard ISM density nISM = 1 cm^-3
References
1. D.H. Clark, F.R. Stephenson, The Historical Supernovae (Pergamon, London, 1977)
2. W. Baade, F. Zwicky, On super-novae. Proc. Natl. Acad. Sci. USA 20, 254 (1934)
3. H.A. Bethe, Supernova Theory (World Scientific, Singapore, 1994)
4. J.W. Murphy, A. Burrows, Criteria for core-collapse supernova explosions by the neutrino
mechanism. Astrophys. J. 688, 1159 (2008)
5. O.G. Benvenuto, J.E. Horvath, Evidence for strange matter in supernovae? Phys. Rev. Lett. 63,
716 (1989)
6. P. Hoeflich, Explosion Physics of thermonuclear supernovae and their signatures, in Handbook
of Supernovae, eds. A.W. Alsabti and P. Murdin (Springer, Berlin, 2017), p. 1151
7. L.D. Landau, E.M. Lifshitz, Fluid Mechanics (Pergamon Press, Oxford, 2013)
8. D. Branch, J.C. Wheeler, Supernova Explosions (Springer, Berlin, 2017)
9. A.C. Calder, B.K. Krueger, A.P. Jackson, D.M. Townsley, The influence of chemical compo-
sition on models of Type Ia supernovae. Frontiers in Physics 8(2), 168–188 (2013)
10. V. Gamezo, A. Khokhlov, E. Oran, Deflagrations and detonations in thermonuclear supernovae.
Phys. Rev. Lett. 92, 211102 (2004)
11. P. Ruiz-Lapuente et al., The binary progenitor of Tycho Brahe’s 1572 supernova. Nature 431,
1069 (2004)
12. P. Ruiz-Lapuente et al., No surviving companion in Kepler’s supernova. Astrophys. J. 862, 124
(2018). https://doi.org/10.3847/1538-4357/aac9c4
13. R.P. Kirshner, Supernovae, an accelerating Universe and the cosmological constant. Proc. Natl.
Acad. Sci. 96(8), 4224–4227 (1999)
14. M. Smith et al., Studying the ultraviolet spectrum of the first spectroscopically confirmed
supernova at redshift two. Astrophys. J. 854, 37 (2018)
15. M. Hamuy, Low-z type Ia supernova calibration, in Handbook of Supernovae, eds. A.W. Alsabti and P. Murdin (Springer, Berlin, 2017), p. 1
16. T.J. Moriya, E.I. Sorokina, R. Chevalier, Superluminous supernovae. Space Sci. Rev. 214, 59
(2018)
17. M. Modjaz, Stellar forensics with the supernova-GRB connection. Astronomische Nachrichten
332(5), 434–447 (2011)
18. M. Fraser, Supernovae and transients with circumstellar interaction. R. Soc. Open Sci. 7,
200467 (2020)
19. S.E. Woosley, Bright supernovae from Magnetar birth. Astrophys. J. Lett. 719, L204 (2010)
20. M.P. Allen, J.E. Horvath, Influence of an internal magnetar on supernova remnant expansion.
Astrophys. J. 616, 346 (2004)
21. D. Kasen, L. Bildsten, Supernova light curves powered by young magnetars. Astrophys. J. 717,
245 (2010)
22. M. Bersten, O.G. Benvenuto, M. Orellana, K. Nomoto, The unusual super-luminous supernovae
SN 2011kl and ASASSN-15lh. Astrophys. J. Lett. 817, L8 (2016). https://doi.org/10.3847/
2041-8205/817/1/L8
23. L.I. Sedov, Similarity and Dimensional Methods in Mechanics (CRC Press, Boca Raton, 1993)
24. E.A. Dorfi, Evolution of supernova remnants including particle acceleration. Astron. Astrophys.
234, 419 (1990)
Chapter 6
Astrophysics of Compact Objects
The theory of Stellar Evolution discussed in Chap. 4 has given us the elements we
need to understand the problem that now occupies us: compact stellar remnants.
We discussed the evolution of low- and intermediate-mass stars (solar type) and
the transition to so-called “high-mass” stars, which proceed to explode after a rapid
final evolution. It is important to note that the existence of the two types separated by the mass 8 M⊙ should also be complemented with an evaluation of the relative number of stars that produce the corresponding compact objects (white dwarfs and
neutron stars/black holes). We revisit the whole issue of the theoretical status and
observational evidence for compact objects in this Chapter. Figure 6.1 shows the so-
called initial mass function (IMF), i.e., the number of stars per unit mass (logarithmic)
as a function of the mass determined in various studies of the local environment,
clusters, and other systems.
The number of progenitors is generally expressed as being proportional to (M/M⊙)^-α, and since the pioneering work of E. Salpeter in 1955 [1], the value of the appropriate exponent has been found to be around 2.3. This means that the
number of stars that produce white dwarfs is at least 50 times greater than those
that explode. Thus, more than 95% of visible stars should form white dwarfs at the
end of their evolution. And taking into account a number of complex factors in the
evolution of the galaxy, we have come to the conclusion that there may be up to 1
billion white dwarfs available for study.
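The factor quoted above can be checked by integrating the IMF. In the sketch below, the mass ranges (0.8–8 M⊙ for white dwarf progenitors that complete their evolution within the age of the galaxy, and 8–100 M⊙ for exploding stars) are assumptions made for illustration; the ratio is quite sensitive to the lower cutoff adopted:

```python
# Ratio of white-dwarf progenitors to supernova progenitors for an IMF
# dN/dM ~ M^(-ALPHA). The mass limits below are illustrative assumptions.
ALPHA = 2.3

def imf_count(m_lo, m_hi, alpha=ALPHA):
    """Integral of M^-alpha dM between m_lo and m_hi (arbitrary normalization)."""
    e = 1.0 - alpha
    return (m_hi**e - m_lo**e) / e

n_wd = imf_count(0.8, 8.0)     # progenitors of white dwarfs (assumed range)
n_sn = imf_count(8.0, 100.0)   # progenitors that explode
print(f"ratio ~ {n_wd/n_sn:.0f}")
# ~20 with these limits; lowering the white-dwarf cutoff raises the ratio
# toward the factor of ~50 quoted in the text.
```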
The statistics of the relative fraction of neutron stars and black holes is much
more uncertain. The number of stars that should explode is quite well known, but it
is not clear if there is a minimum value from which the production of black holes is
inevitable. This stems from the fact that the Physics of explosions for each case does
not offer a clear answer. To make matters worse, the initial angular momentum of the
collapsing core may turn out to be very important, even fundamental in determining
the explosion. There is a certain vague consensus that black holes would form in
explosions of progenitors above about 25 M⊙, by the collapse of the core after the ejection, when the matter that failed to escape falls back on top of it (in the process called fallback), or directly from imploding stars of around 40 M⊙ or more [3]. The fact
is that in known X-ray binaries (see below and Chap. 7) there is no evidence for
black holes of more than around 15 M⊙, nor for the “very light” ones immediately above the maximum neutron star mass and below around 5 M⊙, an observation which has been suggested to be determined by the very mechanism of the explosions. In the
case of explosions forming neutron stars, it is not clear what exactly the formation
channels would be. For example, the accretion-induced collapse of a white dwarf (AIC) appears as a recurring possibility, but there is no proof of its effectiveness. All
this makes it very difficult to evaluate populations, although we usually find 10^7 as an indication of the number of neutron stars in the galaxy (pulsars and others) and something like 10^6 for the black holes produced by stellar evolution [4]. We will
have a more accurate picture of this and other issues when we begin to analyze each
type of remnant below.
6.2 Theory and Observations of White Dwarfs

The long history of the study of white dwarfs began with an observation by F. Bessel in 1844. By carefully determining the orbits of Sirius and Procyon, Bessel found that there were systematic periodic deviations, and proposed the existence of undetected “dark companions”. In the following decades, some candidates for “dark companions” were finally detected, even at visual magnitudes ≤ 10. In particular, 40 Eridani B was the object of in-depth study and, to general surprise, Russell, Pickering, and Fleming showed in 1910 that this star was of spectral type A, i.e., it had an effective temperature in the range 7500–10000 K, considered to be very “white”. This did not correspond at all to the expectation for a star of very low brightness [6].
Fig. 6.2 Contemporary images of the Sirius A and B system in the optical range (left) and X-ray (right). Credit: NASA, ESA, H. Bond (STScI) and M. Barstow (University of Leicester) (left); NASA/SAO/CXC (right)
The most obvious conclusion was that these stars were enormously dense, with
estimated densities of thousands of times the density of water. Only then could
a very low luminosity (remember that L ∝ R^2 T^4) and a very high temperature be
compatible, at the expense of greatly decreasing the radius R. In 1927 A.S. Eddington
expressed this strangeness in his characteristically humorous style:
[…] the message of the companion of Sirius when decoded, runs: “I am composed of matter
3000 times denser than anything you have come across. A ton of my material would be a
small nugget you could put in a matchbox”. What reply can one make to such a message?
The reply which most of us made in 1914 was: “Shut up. Do not talk nonsense”.
Eddington implicitly recognizes in this paragraph the need to apply new ideas to
study the behavior of matter at these densities. Evidently, the classical gas approach
cannot work in this situation, and it was the work of R.H. Fowler in 1926 which
laid the foundations for the treatment of the problem of the structure of Sirius B
(Fig. 6.2) and other white dwarfs, a name suggested by the necessary temperature
and radius. It is important to point out that modern Quantum Mechanics had been completely formulated only a couple of years earlier. Thus we have a perspective on
the revolutionary nature of these initial studies of white dwarfs, which are “natural”
laboratories of dense matter and led to one of the great physical achievements of the
new quantum approach.
As we said before, the inference of very high densities, where an ideal gas model
would not be viable, forced astrophysicists to consider the behavior of matter in
the regime already presented in Figs. 4.17 and 4.18. We will now see how it is
possible to obtain and justify a valid equation of state for that regime from elementary
considerations.
Let us consider once again the situation of having N electrons confined in a
volume V. The physical space available for each of them is, in one dimension, of the order of Δx ∼ (V/2N)^{1/3}. The hypothesis that the electrons are in the quantum regime is equivalent to saying that they are now subject to the Uncertainty Principle ΔxΔp ≥ ℏ, so their typical momentum will be of the order of

p ≥ ℏ/2Δx ≈ ℏ N^{1/3}/(2^{2/3} V^{1/3}) . (6.1)
The mean kinetic energy ⟨EK⟩ would be

⟨EK⟩ = ⟨p^2⟩/2m ≈ ℏ^2 N^{2/3}/(2^{7/3} V^{2/3} m) . (6.2)
Therefore, the internal energy U is simply

U = N⟨EK⟩ ≈ ℏ^2 N^{5/3}/(2^{7/3} V^{2/3} m) . (6.3)
This last relation is important for the following reason: in a totally general way,
Thermodynamics allows us to find the pressure (gas state variable) by differen-
tiating the internal energy with respect to the volume at constant entropy, since
the internal energy is one of the thermodynamic potentials of the system. Hence, P = −∂U/∂V|S=const. Thus, we have

P = ℏ^2 N^{5/3}/(3 × 2^{4/3} V^{5/3} m) . (6.4)
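A direct evaluation shows that this simple estimate already gives white-dwarf-like pressures. The density and mean molecular weight per electron used below are illustrative values:

```python
# Pressure of a degenerate electron gas from Eq. (6.4),
# P = hbar^2 n^(5/3) / (3 * 2^(4/3) * m), at a typical white-dwarf density.
HBAR = 1.055e-27   # [erg s]
M_E = 9.109e-28    # electron mass [g]
M_U = 1.661e-24    # atomic mass unit [g]

rho, mu_e = 1.0e6, 2.0        # density [g/cm^3], molecular weight per electron
n_e = rho / (mu_e * M_U)      # electron number density [cm^-3]

P = HBAR**2 * n_e**(5.0/3.0) / (3.0 * 2.0**(4.0/3.0) * M_E)
print(f"n_e ~ {n_e:.1e} cm^-3, P ~ {P:.1e} dyn/cm^2")
# The uncertainty-principle estimate reproduces the exact Fermi-gas pressure
# to within a numerical factor of order ten.
```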
As discussed in Chap. 4, and assuming that the degenerate matter that constitutes the
white dwarf does not produce energy by means of nuclear reactions, the structure of
these stars can be found by simultaneously integrating the equations of continuity
of mass and hydrostatic balance. The energy transport equation is also dispensable,
since degenerate electrons have a very high conductivity and thus dT /dr = 0. It
is assumed that the interior temperature is constant for this reason, except for the
outer layers where degeneracy ends and the gas is “normal” again, and the interior
temperature drops until it reaches the value of the surface, where there is black body
emission.
As in any system of two first order differential equations, we can combine dM/dr
and dP/dr to obtain an equivalent second order equation. This unique differential
equation is
(1/r^2) d/dr [(r^2/ρ) dP/dr] = −4π Gρ . (6.5)
We see that, as in more general cases, we require a relationship between P and ρ (the
equation of state, see Chap. 4), just like the one obtained for a degenerate electron
gas. For the purpose of a general treatment, a polytropic form P = Kρ^Γ is usually introduced, a general case comprising the limits P ∝ n^{5/3} and P ∝ n^{4/3} relevant to our case. Certain mathematical manipulations can render the problem more tractable. For example, the exponent Γ of the polytropic equation of state can be replaced by another by writing Γ = 1 + 1/n, where n is called the polytropic index.
We define a change of variables in (6.5) by

ρ = ρc θ^n , r = aξ , a = [(n + 1)K ρc^{(1−n)/n}/(4π G)]^{1/2} . (6.6)
With this change we can now put (6.5) in a dimensionless form, viz.,

(1/ξ^2) d/dξ (ξ^2 dθ/dξ) = −θ^n . (6.7)
This is known as the Lane–Emden equation in honor of the scientists who studied
it. Besides the formal problem of finding solutions, we should not forget that we are
looking for a description of white dwarfs. Thus, the boundary conditions imposed by the Physics of the problem on the solution function θ(ξ) are quite simple:

θ(ξ = 0) = 1 , θ′(ξ = 0) = 0 . (6.8)

The first comes from the fact that ρ(r = 0) = ρc, and the second describes the fact that dP/dr = 0 at the center, otherwise we would have pressure gradients (forces) where M ≈ 0.
In general, the solutions of (6.7) decrease from the central value to a point ξ1 where the curve crosses the horizontal axis, i.e., θ(ξ1) = 0 (Fig. 6.3). This point is of interest to us because we identify it with the stellar radius R, since it is where P = 0. The stellar radius R can be expressed in general in terms of ξ1 by

R = aξ1 = [(n + 1)K/(4π G)]^{1/2} ρc^{(1−n)/2n} ξ1 . (6.9)
When physical units are restored, the results for low-density white dwarfs with Γ = 5/3 (n = 3/2) are

R = 1.22 × 10^4 (ρc/10^6 g cm^-3)^{-1/6} (μe/2)^{-5/6} km , (6.14)

M = 0.4964 (ρc/10^6 g cm^-3)^{1/2} (μe/2)^{-5/2} M⊙ = 0.7 (R/10^4 km)^{-3} (μe/2)^{-5} M⊙ . (6.15)
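The values of ξ1 entering these expressions are obtained numerically. The sketch below integrates (6.7) with a fourth-order Runge–Kutta scheme; the step size and the series expansion used to start away from ξ = 0 are implementation choices:

```python
# Integration of the Lane-Emden equation (6.7) as the first-order system
# theta' = z, z' = -theta^n - (2/xi) z, from a series start near the center,
# stopping at the first zero theta(xi_1) = 0.
def lane_emden_xi1(n, h=1.0e-4):
    xi, theta, dtheta = h, 1.0 - h*h/6.0, -h/3.0   # theta ~ 1 - xi^2/6 near 0
    def f(x, y, z):
        return z, -(max(y, 0.0) ** n) - 2.0 * z / x
    while theta > 0.0:                             # classic RK4 step
        k1y, k1z = f(xi, theta, dtheta)
        k2y, k2z = f(xi + h/2, theta + h/2*k1y, dtheta + h/2*k1z)
        k3y, k3z = f(xi + h/2, theta + h/2*k2y, dtheta + h/2*k2z)
        k4y, k4z = f(xi + h, theta + h*k3y, dtheta + h*k3z)
        theta += h/6 * (k1y + 2*k2y + 2*k3y + k4y)
        dtheta += h/6 * (k1z + 2*k2z + 2*k3z + k4z)
        xi += h
    return xi

print(f"n = 3/2: xi_1 = {lane_emden_xi1(1.5):.4f}")   # ~3.654, P ~ n^(5/3)
print(f"n = 3:   xi_1 = {lane_emden_xi1(3.0):.4f}")   # ~6.897, P ~ n^(4/3)
```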
Fig. 6.4 Stellar sequences obtained by integration of the Lane–Emden equation. In blue, the sequence of stellar models built with the non-relativistic limit of the Fermi gas of electrons, which satisfies P ∝ n^{5/3}. As expected, deviations are increasingly important as the mass increases. At some intermediate point (M ∼ 0.6–0.7 M⊙), one must shift to the description with the ultra-relativistic limit, where P ∝ n^{4/3} (green curve), whose results are more and more accurate until it reaches the value where its derivative becomes vertical (red dotted line). There are no stable models beyond this value, the Chandrasekhar limit
For its multiple applications, the so-called Chandrasekhar limit (again, not to be con-
fused with the Schoenberg–Chandrasekhar limit presented in Chap. 4, which refers
to another physical situation) is one of the most important results obtained in the
20th century for the theory of Stellar Evolution. It is also highly significant that it
depends in a fundamental way on the ideas of Quantum Mechanics, very new at the
time of Chandrasekhar’s original work. We have already seen in Chap. 4 that the con-
cept of degeneracy is fundamental for the evolution of solar-type star cores. Without
this state, there would be no helium flash, for example. The theoretical physicist Lev
Landau reasoned that, since Chandrasekhar’s mass limit is so fundamental, he should
be able to demonstrate it with very simple arguments (he would be amazed by the
3D numerical simulations, etc., that are made today precisely with the intention of
discovering fundamental results). We will now present Landau’s argument, because
it will allow us to understand Chandrasekhar’s mass limit in a qualitative and simple
way [4].
In the relativistic regime, the total energy per fermion in a star of radius R containing N baryons is the sum of the Fermi energy and the gravitational energy per baryon:

E = ℏc N^{1/3}/R − G mB^2 N/R . (6.18)
However, there is an important feature regarding the existence of an equilibrium, related to the number N of fermions. If N is small, the first term dominates (the term ∝ N^{1/3} exceeds the term ∝ N for small enough N) and E is positive. Thus, one can decrease the energy by increasing R. When the star expands to decrease the energy, the fermions will at some point become non-relativistic (EF → pF^2/2m ∝ 1/R^2) and the second term will then dominate, making E → 0^-, i.e., E will tend to zero from negative values as R → ∞. Therefore, there must be a point of equilibrium at a finite value of the radius R (that is, a star). But if we consider N large enough from the start, E will be negative and will tend to −∞ as R → 0, that is, the configuration will collapse, because it can then decrease its energy indefinitely, and there is no possible equilibrium. The boundary between “small” N and “large” N that separates these two regimes corresponds to a maximum number of fermions Nmax, precisely determined by the condition E = 0 in (6.18), and which is easily calculated to be
Nmax ≈ (ℏc/G mB^2)^{3/2} ∼ 2 × 10^57 , (6.19)

corresponding to a maximum mass

Mmax ≈ Nmax mB ∼ 1.5 M⊙ . (6.20)
Note that Nmax and Mmax depend essentially on universal constants, not on the
composition, since that was never needed in the argument. We can also show that
the equilibrium radius is determined by the condition of the onset of relativistic degeneracy, viz.,

EF ≈ mc^2 , (6.21)

where m is the mass of the particle whose pressure supports the star. Substituting Nmax in EF ≈ ℏc Nmax^{1/3}/Rmax and using the condition (6.19), we obtain
Rmax ≈ (ℏ/mc)(ℏc/G mB^2)^{1/2} . (6.22)
For electrons, (6.22) gives Rmax ≈ 5 × 10^8 cm, and hence

ρc ≈ 3M/(4π R^3) = 4.5 × (2 × 10^33) g/(4π × 125 × 10^24 cm^3) ∼ 10^8 g/cm^3 ,

giving an order of magnitude for the maximum density of white dwarfs, while the corresponding neutron-supported configuration becomes unstable above ρc ≈ 10^15 g/cm^3, giving an order of magnitude for the maximum density of neutron stars. However, we must remember that in
the latter the effects of General Relativity and the interactions between particles are
important, and our basic Newtonian estimate presented here will not be very reliable.
In fact, it should be noted that we obtained as a result the existence of a maximum
mass without using GR concepts, while at high densities we will have to deal with
relativistic instability, the true cause of the maximum mass of a compact object in the
neutron matter regime. On the other hand, we can say that the maximum mass should
be approximately the same for both regimes of (6.18), within a small numerical factor
that cannot be determined by this simple calculation.
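Landau's estimate can be evaluated directly from the universal constants, for both the electron-supported and the neutron-supported cases:

```python
# Evaluation of Eqs. (6.19)-(6.22) from universal constants (cgs units).
HBAR, C, G = 1.055e-27, 2.998e10, 6.674e-8
M_B, M_E = 1.673e-24, 9.109e-28   # baryon and electron masses [g]
M_SUN = 1.989e33

N_max = (HBAR * C / (G * M_B**2)) ** 1.5
M_max = N_max * M_B
R_wd = (HBAR / (M_E * C)) * (HBAR * C / (G * M_B**2)) ** 0.5  # electron support
R_ns = (HBAR / (M_B * C)) * (HBAR * C / (G * M_B**2)) ** 0.5  # neutron support

print(f"N_max ~ {N_max:.1e}, M_max ~ {M_max/M_SUN:.1f} M_sun")
print(f"R_max ~ {R_wd/1e5:.0f} km (white dwarf), ~ {R_ns/1e5:.1f} km (neutron star)")
```

Both radii and the common maximum mass come out with the right orders of magnitude, in line with the discussion above.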
The fact that the white dwarfs were detected at the beginning of the 20th century is
another proof of the statement made at the beginning of the Chapter regarding their
abundance in the galaxy. Sirius B, 40 Eridani B, and other white dwarfs in binaries are examples of the presence of these objects in the vicinity of the Earth. There are many
others, most of them isolated, and some with magnitudes less than 12, accessible to
any amateur telescope. It is not difficult to find and observe white dwarfs.
But of course the systematic study of white dwarfs requires samples that are as large and complete as possible. Thus, in addition to neighboring and “field” white dwarfs, there are studies of old stellar populations, each of approximately the same age, responsible for the production of white dwarfs: the star clusters, which are especially suitable laboratories [10].
Figure 6.6 shows the case of the NGC 6791 cluster, where colors and luminosities
are used to identify the white dwarfs born from solar type progenitors that have
already completed their evolution. With samples of this type it is possible to study the white dwarfs and related problems, such as the determination of the very age of the cluster through its sample of white dwarfs.
Fig. 6.6 White dwarfs in the globular cluster NGC 6791. With high quality images the identification is quite simple (the candidates are the dots in the circles), and one can extend the study by obtaining complementary spectra. Credit: NASA, ESA, and L. Bedin (STScI)
All these properties, like colors and spectra, still require a detailed treatment of the atmospheres of the white dwarfs, a region totally ignored in our discussion of the structure because it represents such an insignificant fraction of the total mass. After all, it is responsible for the radiation that is ultimately emitted by these objects.
Figure 6.7 shows the situation graphically.
Observation of white dwarfs has the potential to determine several important
characteristics of their structure, for example, the theoretically calculated star radius.
If we call the observed luminous flux F(D), the basic equation L = 4π R^2 σ Teff^4 can in principle be used to obtain the stellar radius:

F(D) = L/(4π D^2) =⇒ R^2 = F D^2/(σ Teff^4) . (6.23)
We see from (6.23) that, besides the distance D, we must determine the effective
temperature Teff . Although this is not impossible, there are several complications, as
exemplified in Fig. 6.8.

Fig. 6.7 Complete structure of a white dwarf. The polytropic treatment presented above is valid for most of the matter in the white dwarf, but not for the atmosphere, which hardly contributes to the mass but is where the degeneracy of the electrons ends and there is a transition to a classical gas. Besides confirming the most common composition of the atmospheres (H/He), we will see that more unusual situations are also observed.

The spectra of many white dwarfs have pronounced absorption lines that distort the spectrum with respect to the ideal, and thus make it difficult to calculate the value of $T_{\rm eff}$. Here we see another advantage of the study of clusters: the distance D is the same for all objects.
With the construction of increasingly complete databases, it has been possible to
classify white dwarfs using their spectra. This classification is shown in Table 6.1.
There are complicated evolutionary mechanisms that result in the transformation of
some types into others, but they will not be discussed here, since they involve the
Physics of the diffusion of chemical elements and other problems that go well beyond
the scope of this text.
However, we would like to highlight some novelties and an important recent
contribution to this problem: the DQ class with carbon lines was only recently discovered, although its existence was expected. But finding a white dwarf with oxygen
and without hydrogen or helium was not expected at all. This is the case of SDSS
J124043.01+671034.68, discovered by Kepler, Koester, and Ourique [11], and which
should result from the most massive progenitors that nevertheless fail to explode. In fact, the
detection of neon and magnesium in the oxygen-rich atmosphere points to the white
dwarf coming from this type of core, very close to those that will produce electron-capture supernovas (Chap. 5). There is still no spectral denomination for this unusual object.
Fig. 6.9 Theoretical mass–radius diagram for nearby white dwarfs. The upper curve corresponds
to a carbon composition, and the lower one to iron. According to the theory of Stellar Evolution, it
is impossible to have iron white dwarfs, although this would appear empirically to be the indicated
solution. For this reason, a re-evaluation of distances is indispensable, since the observed points are
expected to migrate vertically and correspond to theoretical expectations [12]. © AAS. Reproduced
with permission
Table 6.1 Empirical spectral classification of white dwarfs. The scheme is conceptually similar
to the spectral type classification created for “normal” stars, and reflects the previous evolutionary
history of each object
Spectral type Features
DA Only H, no He I or metals
DB Only He I, no H or metals
DC Continuous spectrum, no lines
DO Strong He II, He I or H may appear
DZ Only metals, no H or He I
DQ C lines of any type
The location of several white dwarfs in the M–R plane is displayed in Fig. 6.9.
One result of the above-mentioned studies is confirmation of the masses of the progenitors that produce white dwarfs. NGC 2751 is an open
cluster with a white dwarf as a member. This membership is quite reasonable for a
white dwarf with a hydrogen atmosphere. The important fact is that the cluster has
quite massive stars that are still in the Main Sequence. Thus, the progenitor of the
white dwarf should be more massive than those that have still to begin their final
evolution (Fig. 6.10).
This work indicates that the basic ideas of Stellar Evolution are not terribly wrong.
Other cases of (multiple) white dwarfs in clusters have been published, and imply
somewhat lower limits to the progenitor mass in the Main Sequence, probably as
the result of a different metallicity (but certainly all above 6 M⊙). A great source
of uncertainty in this problem is the mass loss in the giant branch and/or the AGB, a factor that could cause stars that should otherwise explode to form white dwarfs instead (maybe even stars of around 10 M⊙). On the other hand, if the largest mass that forms white dwarfs
were too low, there would be a very serious conflict with the number of supernovas
observed.
Another important issue in the study of white dwarfs is their mass distribution.
The mass, and the atomic weight of the internal composition, are expected to increase as we consider white dwarfs that descend from more massive progenitors. However, in the low-mass range, it is not possible for single stars to produce helium white dwarfs, since their cores fuse this helium into carbon (Chap. 4). There is a consensus that helium white dwarfs are produced, but in binary systems only. And, as we have already said, those of greater mass should be composed of oxygen with fractions of neon and magnesium, and have masses close to the Chandrasekhar mass.
Figure 6.11 shows the mass distribution obtained by Kepler et al. [14]. The maximum around 0.57 M⊙ is very similar to the one obtained in other studies. There are secondary maxima, tentatively associated with several formation channels, such as light helium white dwarfs formed in binary systems, as already mentioned. At the right-hand end of the histogram, one can see a white dwarf with mass 1.33 M⊙, quite close to the Chandrasekhar limit. There are other cases of even higher mass, but they
are subject to confirmation.
The absence of nuclear reactions in the white dwarfs indicates that, from the moment
they come into existence, they can only shed their thermal energy content. Therefore,
a cooling theory must be formulated to study the population of the Galaxy as a whole.
The first term in brackets in (6.24) is actually the specific heat of the reservoir. Although we
have seen that it is the electrons that maintain the structure, their contribution to
the thermal energy is very small. The thermal reservoir is largely dominated by the
classical ions, so we write $c_V^{\rm ion} = (3/2)N_A k_B/A$. Hence, we have

\[ L = -6.4\times 10^{7}\,\frac{1}{A}\,\frac{M}{M_\odot}\,\frac{\partial T_c}{\partial t} . \tag{6.25} \]
Now the luminosity is a function of the central temperature variation Tc , which needs
to be evaluated. For this we will consider the white dwarf envelope, which contains
a very small mass but is the region where the temperature falls from the inner value
Tc to the final value Teff at the photosphere. With the hypothesis that the envelope
mass is Menv ≈ 0, that is, that it does not contribute to the total mass, we can divide
the transport equation by the hydrostatic balance equation to obtain
\[ \frac{dT_c}{dP} = \frac{3}{4ac}\,\frac{L\,\bar\kappa}{4\pi G M\,T^3} . \tag{6.26} \]
In the envelope, in the interface region we mentioned, the matter ceases to be degenerate and its opacity is dominated by processes that have a Kramers form (Chap. 4):

\[ \bar\kappa = \kappa_0\,\rho\,T^{-7/2} . \tag{6.27} \]
Here, we assume that the degenerate pressure and the normal gas pressure are the same, because we want to find the conditions for the transition. This yields a relationship between pressure and temperature, which results in $P_c \propto T_c^{5/2}$. Inserting the latter and the opacity (6.27) in (6.26), we can separate variables and integrate both sides, with the result

\[ \frac{L}{L_\odot} = 1.7\times 10^{-3}\,\frac{M}{M_\odot}\,\frac{\mu}{\mu_e^2}\,\frac{4\times 10^{33}}{\kappa_0}\left(\frac{T_c}{10^7\ \mathrm{K}}\right)^{7/2} . \tag{6.28} \]
The last step is to substitute (6.28) into (6.25) and integrate over time to obtain the time required to achieve a given luminosity:

\[ t_{\rm cool} = 9\times 10^{6}\,\mu^{-2/7}\left(\frac{A}{12}\right)^{-1}\left(\frac{M}{M_\odot}\right)^{5/7}\left(\frac{\mu_e}{2}\right)^{4/3}\left(\frac{L}{L_\odot}\right)^{-5/7}\ \mathrm{yr} . \tag{6.29} \]
This result is due to L. Mestel [15] and constitutes the simplest cooling theory. We
can observe two very interesting characteristics of the expression obtained. The first
is that the cooling time $t_{\rm cool}$ is inversely proportional to the mass number A of the ions, i.e., for a given mass, white dwarfs of lighter composition cool more slowly. But if
we consider white dwarfs of higher mass, they will cool down more slowly since they
have higher thermal content and are more compact for a given temperature, whence
their emission surface is smaller. Note that effects that may become important, such
as the emission of neutrinos from the interior in addition to the luminosity of photons
from the surface, have not been included. There is an uncertainty of about 20% due
to these factors and other simplifications used to obtain Mestel’s law (6.29).
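To see the orders of magnitude implied by Mestel's law, one can evaluate (6.29) directly; a minimal sketch, assuming a carbon composition (A = 12) and representative values μ = 1.4, μ_e = 2:

```python
def mestel_tcool_yr(L_over_Lsun, M_over_Msun=1.0, A=12.0, mu=1.4, mu_e=2.0):
    """Cooling time from Eq. (6.29) in years; composition values are assumptions."""
    return (9e6 * mu**(-2.0/7.0) / (A / 12.0)
            * M_over_Msun**(5.0/7.0) * (mu_e / 2.0)**(4.0/3.0)
            * L_over_Lsun**(-5.0/7.0))

# Time to fade to L = 1e-4 Lsun, the order of the faintest observed white dwarfs:
print(f"t_cool ~ {mestel_tcool_yr(1e-4):.1e} yr")   # a few Gyr
```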
There are several possible tests of cooling. One of the more interesting ones is to
find the cooling sequence in a cluster, since the stars are believed to have essentially the same age. Figure 6.12 shows the data for the M4 cluster. The white dwarfs
are clearly separated, below the Main Sequence. The faintest magnitudes observed correspond to luminosities of order 10⁻⁴ L⊙. An important point is that there could be no fainter white dwarfs, since the galactic disk is not old enough for this to happen. Thus, a number of papers have suggested calculating the age of the galactic disk using precisely the cooling of white dwarfs. The results are varied, but oscillate around 6–8 Gyr, which is consistent with other, independent arguments.

Fig. 6.13 Regions in the T–ρ plane where the state of matter changes as cooling progresses. Since its birth, the white dwarf core has been at high temperatures, and this results in a “gas” (with corrections to the ideal expressions, but still fluid). Below the dashed line, the parameter Γ of (6.30) exceeds the critical value, and carbon, oxygen, or even magnesium crystallize. It is still unclear whether the geometry of the crystal is analogous to that of terrestrial diamond (a cubic lattice) or something much more exotic (a triangular lattice, never seen in the laboratory)
To conclude, we highlight an aspect of cooling that has not entered the discussion
above, but for which there is substantial evidence: it is the so-called crystallization of
the core material, expected at low temperatures. This crystallization is due to the fact
that, at high temperatures, thermal agitation keeps the ions in a fluid state (referred
to as a “gas” in Fig. 6.13). But if the temperature is low, Coulomb interactions of
the charged ions can locate the ions in the sites of a crystal lattice. A quantitative
criterion, obtained by studying numerical simulations of crystallization, is that the
quotient of two quantities, called Γ, reaches a value of about 180, i.e.,

\[ \Gamma = \frac{(Ze)^2/r}{k_B T_c} = 2.3\,\frac{Z^2}{A^{1/3}}\,\frac{\left(\rho/10^6\ \mathrm{g\,cm^{-3}}\right)^{1/3}}{T_c/10^7\ \mathrm{K}} \approx 180 . \tag{6.30} \]
When this condition is reached, the cooling regime changes, since crystallization
releases latent heat [4]. This latent heat makes the time tcool increase, since it con-
tributes to E th in (6.24). Thus, white dwarfs in the process of crystallization (from
the inside out) almost cease to cool down.
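The criterion (6.30) is straightforward to use; a minimal sketch that asks at what central temperature a carbon core (Z = 6, A = 12, density assumed for illustration) reaches Γ = 180:

```python
def gamma_coulomb(Z, A, rho, T_c):
    """Coulomb coupling parameter, Eq. (6.30); rho in g/cm^3, T_c in K."""
    return 2.3 * Z**2 / A**(1.0/3.0) * (rho / 1e6)**(1.0/3.0) / (T_c / 1e7)

# Since Gamma scales as 1/T_c, the crystallization temperature follows from
# Gamma evaluated at any reference temperature:
rho, T_ref = 1e7, 1e7                       # assumed carbon-core density
T_melt = T_ref * gamma_coulomb(6, 12, rho, T_ref) / 180.0
print(f"Gamma = 180 at T_c ~ {T_melt:.1e} K")   # a few 1e6 K
```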
It is worth asking what evidence there is for such crystallization. Studies of the
oscillations of white dwarfs allow us indirectly to explore their interior (as is done
with terrestrial seismology). In particular, the oscillations of the white dwarf BPM
37093 of only 4 millimagnitudes (!) have been used, after a fit to theoretical calculations, to argue that at least 50% of its interior is crystallized [17]. There are other
examples of this extreme phenomenon taking place in one of the most hidden places
in the Universe.
This statement was published shortly after Landau refined his view of this subject,
showing that the densities involved must exceed those of the atomic nucleus and that
the maximum mass would be limited, in the case of a degenerate free neutron gas,
by the value 0.7 M⊙, a factor of about 2 lower than originally predicted. Like white
dwarfs, neutron stars provide an extraordinary example of the role played by Quantum
Mechanics in contemporary Astrophysics, now in the high-density regime (and just
before producing black holes). These calculations were repeated later with the help
of the relativistic structure equations of Tolman, Oppenheimer, and Volkoff (or TOV,
obtained in 1939) and were confirmed to a large extent. Following the example of
the hydrostatic equilibrium equation (4.4), these researchers [21] obtained a version
that includes the effects of General Relativity:
\[ \frac{dP}{dr} = -\,\frac{G\left(\rho + \dfrac{P}{c^2}\right)\left(M + 4\pi r^3\,\dfrac{P}{c^2}\right)}{r^2\left(1 - \dfrac{2GM}{rc^2}\right)} . \tag{6.31} \]
This reduces to the Newtonian version provided that the terms in P/c² (which do not contribute to the gravitational field in Newtonian gravity, but are present in GR) are discarded.
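To illustrate how (6.31) is used in practice, here is a toy numerical integration with an (unrealistic) constant-density “EoS”; all numbers are assumptions chosen only to show the procedure:

```python
import math

G, C = 6.674e-8, 2.998e10      # cgs units
RHO0 = 5e14                    # assumed constant density [g/cm^3]

def tov_dPdr(r, P, m):
    """Right-hand side of the TOV equation (6.31)."""
    num = G * (RHO0 + P / C**2) * (m + 4 * math.pi * r**3 * P / C**2)
    den = r**2 * (1.0 - 2.0 * G * m / (r * C**2))
    return -num / den

# Crude outward Euler integration from an assumed central pressure:
P, m, r, dr = 1e34, 0.0, 1.0, 1e3          # P in erg/cm^3, step = 10 m
while P > 0.0:
    P += tov_dPdr(r, P, m) * dr
    m += 4 * math.pi * r**2 * RHO0 * dr
    r += dr

print(f"R ~ {r/1e5:.1f} km, M ~ {m/1.989e33:.2f} Msun")
```

Realistic models replace the constant density by a tabulated EoS ρ(P), which is precisely where the nuclear physics discussed next enters.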
\[ Mc^2 \equiv E = -\epsilon_0 A + \epsilon_S A^{2/3} + \epsilon_C Z^2 A^{-1/3} . \]
In this approach the nucleus is treated as a “little drop” of matter and its energy (mass
times c2 ) is assumed to be composed of a volume term (the first, proportional to the
number of nucleons A), another associated with the surface (the second, proportional
to A2/3 ), Coulomb corrections (third term), and other effects of minor importance
not shown here. The task here is to adjust the expression to reproduce masses of
known nuclei and then obtain the coefficients $\epsilon_0$, $\epsilon_S$, and $\epsilon_C$. The next step is to calculate
what happens at the high densities including these nuclei, the neutron gas, etc. The
equation of state (EoS) is obtained by first minimizing with respect to A and Z ,
and imposing chemical and mechanical equilibrium between the neutron gas and
the nuclei. These four conditions allow us to express the total energy density as
a function of a single variable (usually the baryon density n B ), and then obtain the
pressure in the form
\[ P = n_B^2\,\frac{\partial}{\partial n_B}\!\left(\frac{\epsilon}{n_B}\right) = P_n + P_e + P_L . \tag{6.32} \]
\[ V_{\rm BJ} = \sum_j C_j\,\frac{e^{-j\mu r}}{\mu r} + V_T , \tag{6.33} \]
where μ is related to the reciprocal of the mass of the particle (generically called
a meson) that is exchanged between the nucleons (Chap. 1), and VT are additional
(tensor) terms in the potential. In their classic work, Bethe and Johnson [26] besides
the attractive interactions due to the exchange of pions, considered the dominant
effect of the vector meson ω, largely responsible for the repulsive core, to contribute
a term $V_\omega = g_\omega^2\,e^{-\mu_\omega r}/r$, where $g_\omega^2/\hbar c \sim 29$, as derived from laboratory scattering
data. The Bethe–Johnson EoS, which they called model I, is obtained by combining these ingredients, with the result (for $n_B$ in fm⁻³)

\[ \frac{\epsilon}{n_B} = 236\,n_B^{1.54}\ \mathrm{MeV/particle} + m_p c^2 , \tag{6.34} \]

\[ P = n_B^2\,\frac{\partial}{\partial n_B}\!\left(\frac{\epsilon}{n_B}\right) = 364\,n_B^{2.54}\ \mathrm{MeV/fm^3} . \tag{6.35} \]
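The consistency of (6.34) and (6.35) is easy to verify numerically, differentiating the energy per particle as in (6.32) (a simple sketch; n_B in fm⁻³):

```python
def eps_per_nB(n):                        # Eq. (6.34), MeV per particle
    return 236.0 * n**1.54 + 938.272      # m_p c^2 = 938.272 MeV

def pressure(n, h=1e-6):
    """P = n_B^2 d(eps/n_B)/dn_B, evaluated with a centered difference."""
    return n**2 * (eps_per_nB(n + h) - eps_per_nB(n - h)) / (2.0 * h)

n = 0.5                                   # baryon density [fm^-3]
print(pressure(n), 364.0 * n**2.54)       # both ~ 62.5 MeV/fm^3
```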
Fig. 6.14 Example of the difference between equations of state calculated in the ultra-dense regime. The upper curve is derived by considering only n, p, e⁻, and μ⁻. The bottom curve includes the Λ, which behaves as a kind of massive neutron, and the heavy hyperon Σ⁻. For the same density, the second produces much less pressure. The saturation density is indicated by an arrow [5]. Credit:
Isaac Vidaña (CFisUC, Department of Physics, University of Coimbra)
For even higher densities (2–3ρ0 ), the very idea of “potential” fails (since it is a
classical concept), and one must calculate the energy per nucleon using sophisticated
techniques to obtain $\epsilon/n_B$ and then the pressure from

\[ P = n_B^2\,\frac{\partial}{\partial n_B}\!\left(\frac{\epsilon}{n_B}\right) . \]
This difficulty means that, including more massive species such as the hyperons
and many others known from the laboratory, and depending on the type of treatment
performed, there are substantial differences in the equations of state in the densest
regime, which holds for over 90% of the mass of the star. This situation is illustrated
in Fig. 6.14.
Finally, at the relevant densities, it may be that the degrees of freedom are not
those we know from conventional nuclear Physics. There is strong evidence, both
theoretical and experimental, of a phase transition where the nucleons release their
fundamental constituents, quarks and gluons, under extreme conditions of tempera-
ture and pressure. While the RHIC and LHC experiments mainly explore the “hot”
region of high temperature and low chemical potential (and consequently low den-
sity, since to a first approximation ρ ∝ μ4 for relativistic matter), the domain of
Astrophysics is a “cold” region close to the axis of the chemical potential μ, due to
the fact that even the highest possible temperatures (tens of MeV) in supernova col-
lapses are still very small when compared to the Fermi energy μ. For several decades
there were improvements in the experiments to reach the quark–gluon plasma (QGP)
region, and finally this phase was apparently detected in heavy ion collisions. We do
not know to what extent these quarks are necessary to explain the interiors of neutron
stars [27, 28].
Fig. 6.15 The existence of equilibrium solutions is only possible when the slopes of the cold EoS
and the “gravitational EoS” are different. The passage from the non-relativistic electron gas to
the ultra-relativistic electron gas brings about the end of the stable model sequence. It is not until
saturation density is reached that matter, now dominated by neutrons, is able to change the EoS slope again and stabilize neutron stars
solutions useful for modeling stars has been discussed by Lake [24], and contains
the constant density solutions, Tolman V, and seven other cases.
A particular case of great interest leads to the so-called Rhoades–Ruffini limit
[29], directly derived from a constant density approach. The reasoning behind this
calculation is as follows: the greatest possible mass that can be supported by dense
matter should occur when it is as “hard” as possible, that is, when dP/dρ = c2 , and
in fact a variational calculation confirms this expectation. If the EoS is considered
known below a certain transition density ρT (described, for example, by the BBP
EoS or similar), and also dP/dρ ≥ 0 locally to avoid any instability and collapse,
then the maximum mass of the cold star sequence is
\[ M_{\rm RR} = 3.2\left(\frac{4\times 10^{14}\ \mathrm{g\,cm^{-3}}}{\rho_T}\right)^{1/2} M_\odot . \tag{6.37} \]
This value can then be taken as an absolute limit, since there is no way to introduce
any physical ingredient that makes the EoS violate causality, provided one assumes
spherical symmetry, the absence of additional effects, and the validity of General
Relativity. Obviously, the effects of rotation, for example, can slightly increase the
value of the maximum, but only by about 20% or less. We will see that realistic
neutron star models effectively keep the maximum mass of the sequence below this
value. The general diagram of the mass-radius relations is shown in Fig. 6.16.
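Evaluating (6.37) for a few matching densities shows how weakly the limit depends on ρ_T (a trivial sketch with illustrative values):

```python
import math

def m_rr_msun(rho_T):
    """Rhoades-Ruffini maximum mass, Eq. (6.37), in solar masses."""
    return 3.2 * math.sqrt(4e14 / rho_T)

for rho_T in (4e14, 1e15):   # illustrative transition densities [g/cm^3]
    print(f"rho_T = {rho_T:.0e} g/cm^3  ->  M_RR = {m_rr_msun(rho_T):.2f} Msun")
```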
Realistic neutron star models need to go far beyond this initial simplicity, as there
are numerous developments in hadronic Physics to be incorporated, and they must
also treat in detail the outer layers, where the magnetic field is anchored (see below)
and the surface emits photons. A cross-section from a typical model is shown in
Fig. 6.17.
In general, all models have an “atmosphere” from ρ = 0 to some 106 g cm−3 ,
composed of non-relativistic electrons and nuclei whose composition may depend
on the fall of material in the supernova at the moment of birth. This characteristic is
important since virtually all observable features including spectra are determined on
this surface, where strong magnetic fields can affect the fundamental state, producing
highly deformed nuclei and influencing these observed quantities [30].
Above about 106 g cm−3 , and up to the value 4 × 1011 g cm−3 , the matter consists
of relativistic electrons and a solid lattice of nuclei governed by Coulomb electrostatic
forces. This is the so-called outer crust. We have already seen that, beyond the neutron drip point at 4 × 10¹¹ g cm⁻³, a free neutron gas coexists with the nuclear lattice, up to the nuclear saturation density ρ₀ ∼ 2.4 × 10¹⁴ g cm⁻³. The neutrons (without
electric charge) are superfluid in this condition, i.e., they pair with each other and
move without resistance.
Above the saturation density ρ0 and up to the highest densities reached in the
center, the uniform matter can at first be described with an EoS of the potential type,
but then becomes uncertain, since the possible compositions range from hyperons to meson and/or quark condensates. This liquid core remains one of the main unknowns of the structure, and contains more than 90% of the star’s mass [27].
Star models are constructed by numerically integrating the structure equations
for each chosen supranuclear EoS, and for each range of densities as the transition
density points are reached, and this produces model sequences like those represented
in Fig. 6.18. The models are more compact the higher their mass, and those that
exceed the causal limit must be excluded, that is, they cannot enter the diagonal gray region in Fig. 6.18.
Fig. 6.18 Mass–radius diagram for neutron stars. Horizontal stripes indicate the highest measured
masses, above about M = 2 M⊙. The stiffest equations of state are on the right, where the maximum
masses are higher and the radius greater. The opposite happens with the softest equations of state on
the left [31]. The three equations of state that contain quarks behave differently: the smaller masses
have the smaller radii, because one assumes that the quarks are absolutely stable in the sense that,
once released, they do not produce hadrons again. Some of the models in this figure are called
hybrids (normal matter + quark core), because the quarks appear only at high pressure
This happens because, close to the maximum mass limit, the speed of
sound in matter can exceed the speed of light, whence the calculation results become
inconsistent. As the EoS is “harder” or “softer”, i.e., it produces more or less pressure
for a given energy density, the masses of the maximum mass models are higher or
lower, respectively, while the corresponding radii follow the reverse trend. Thus one
can construct the mass–radius diagram presented here.
To compare the models and the actual masses, the most widely used and successful
method to obtain the latter has been the application of Kepler’s third law in binary
systems which contain at least one NS. The observations directly provide the so-
called mass function:

\[ f(M_1, M_2, i) = \frac{(M_2 \sin i)^3}{(M_1 + M_2)^2} = \frac{T\,v^3}{2\pi G} , \tag{6.38} \]
since the binary period T and the projection of the orbital velocity of M1 along
the line of sight v can be measured. The angle of inclination i can sometimes be
estimated, e.g., from the observation of eclipses or by combining data in other bands
of the visible star, and then the masses M1 and M2 can be determined. The best
results to date correspond to systems where the two components are neutron stars,
but there are others where the companion M1 is a white dwarf, an evolved star, or even a normal MS star. The measurement of orbital features plus the so-called Shapiro
delay, the delay of the pulses when they “fall” in the potential well of the secondary,
has produced very accurate values for a few pulsars (see, for instance, [31]) and
is considered a “clean” observation because it is based purely on gravitation. The
Shapiro delay is detectable only when the effect is large enough, which is why white
dwarfs and a favorable geometry of the orbital plane are needed. The search continues
for other systems in which such measurements would be feasible.
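A minimal sketch of the use of (6.38): the measured period and projected velocity give f directly, and since sin i ≤ 1 and M₁ ≥ 0, f is a strict lower bound on the companion mass M₂ (the orbital values below are purely illustrative):

```python
import math

G_CGS = 6.674e-8
MSUN = 1.989e33

def mass_function_msun(T_days, v_kms):
    """f = T v^3 / (2 pi G), Eq. (6.38), in solar masses."""
    T = T_days * 86400.0      # period in s
    v = v_kms * 1e5           # projected velocity in cm/s
    return T * v**3 / (2.0 * math.pi * G_CGS) / MSUN

# Hypothetical binary: 1-day period, 100 km/s projected orbital velocity
f = mass_function_msun(1.0, 100.0)
print(f"f = {f:.2f} Msun  (so M2 >= {f:.2f} Msun)")
```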
Another possible way to obtain information is to extract the effective temperature
of a neutron star from observations, and with a distance evaluation, calculate the
radius using the relation (4.1). But even though the spectra seem really thermal,
with associated temperatures of 1–10 keV, the radii obtained are very small (around
3–5 km). The consensus is that the temperature obtained is not really the temperature
of the whole star, but only of a “hot spot” (for example, polar caps) and that for
this reason nothing can be said about the spherical structure, although there have
been studies that have argued in favor of a quark star with a radius much smaller
than 10 km. A variety of techniques are in progress to try to obtain the radius and
mass simultaneously, but with somewhat conflicting results. A clear example of
precise and reliable measurements of neutron star radii, essential to our understanding of cold, catalyzed matter beyond nuclear saturation density, has been provided recently by NASA’s Neutron Star Interior Composition Explorer (NICER). High-quality data sets have yielded measurements of the mass (M = 1.44 ± 0.15 M⊙) and radius ($R = 13^{+1.2}_{-1.0}$ km) of the 206 Hz pulsar PSR J0030+0451, and of the radius
Fig. 6.19 Neutron star masses determined in binary systems. The best determinations are those corresponding to double neutron star systems, with more than one example of a “binary pulsar”. Note that it can be proved that the distribution is not consistent with a single mass of 1.4 M⊙, as stated in the literature before the 21st century. There are systems where the neutron star must have accreted ΔM ≥ 0.3 M⊙, and the distribution is at least bimodal, with one peak around 1.4 M⊙ and another at around 1.8 M⊙. A “third peak” (or rather, the “zeroth” peak) must be present at 1.25 M⊙ with high significance in the current data, possibly associated with neutron star production by “light” progenitors (8–10 M⊙ in the MS) which develop degenerate O–Mg–Ne cores that collapse by electron capture. Compilation by L.S. Rocha [32]. See also [33]
($R = 13.7^{+2.6}_{-1.5}$ km) of the M = 2.08 ± 0.07 M⊙, 346 Hz pulsar PSR J0740+6620 [34]. These numbers suggest that the radius is very similar for objects whose masses differ by about 0.5 M⊙, a feature indicative of a stiff equation of state. The compilation of masses updated to 2021 is shown in Fig. 6.19.
Zwicky [20] about the compact remnants of supernovas, when the direct detection of the Crab pulsar was announced shortly afterwards, with a period impossible to sustain for a pulsating white dwarf.
The basic idea of Pacini and Gold was to assign the radio pulses to the passage
of an emission beam through the line of sight of the observer. In this way, the period
of the pulses results directly from the rotation, but this mechanism also requires a
magnetic field. Far from the object only the dipole component should be important. A rotating dipole radiates, and the emitted radiation exerts a torque that brakes the star. As the available energy source is the rotation itself, the losses should be equal to the change in the rotation energy, leading to the dynamic equation of a “spinning” magnetized star, with the rotation and magnetic axes at 90 degrees, so that the factor $\sin^2\alpha = 1$:
\[ I\Omega\dot\Omega = -\frac{2}{3c^3}\,B^2 R^6 \Omega^4 , \tag{6.39} \]

where the coefficient $2/3c^3$ corresponds to the radiation emitted by a rotating dipole
in vacuum. Further research has shown that the electric field induced by this rotating
dipole is so gigantic that a vacuum is not possible around the neutron star: electrons
and protons are pulled from the surface by the induced electric field, and form a
region around the pulsar where the dynamics of the particles is dominated by the
magnetic field, which is thus called the magnetosphere.
After fifty years of research, this (classical) problem of the rotating dipole and
the induced currents has not yet been fully solved. Although there are approximate
solutions, the coefficient in (6.39) and other relevant quantities still cannot be calcu-
lated accurately [38]. In particular, we can calculate neither the detailed structure of
the magnetic field, nor the flow of particles (or radiation) escaping along the open
lines of the magnetosphere in the form of a relativistic wind, clearly visible in X-rays
(Fig. 6.20).
However, assuming that only the dipole emission contributes to braking the rota-
tion, and that the magnetic field does not change throughout the life of the star, we
can integrate (6.39) with respect to time to give
\[ t = -\frac{\Omega}{(n-1)\dot\Omega}\left[1 - \left(\frac{\Omega}{\Omega_i}\right)^{n-1}\right] , \tag{6.40} \]
where $\Omega_i$ is the initial rotation speed of the pulsar and n is the so-called braking index, which measures the braking of the object as the radiation flows. If the pulsar was born rotating much faster than today, the term in brackets is approximately unity and we can define the characteristic age τ = P/2Ṗ as the typical timescale for the rotation to decrease. A pure dipole leads to the value n = 3, but the definition $n = \Omega\ddot\Omega/\dot\Omega^2$ can be formulated as a directly observable quantity if the speed of rotation and its first two derivatives are determined [4]. Table 6.2 displays the values for 6 pulsars where it has been possible to determine the 3 quantities (especially the tiny second derivative $\ddot\Omega$), and all of them differ from the expected value, in some cases substantially. This
difference indicates that the energy loss is not purely from dipole radiation, and that
other factors are involved. This discrepancy is not new. The Crab pulsar, for example,
Fig. 6.20 Left: X-ray image of the Crab pulsar, oriented in a similar way to the schematic figure in
the sky plane. Credit: NASA/CXC/SAO/F. Seward et al. Right: Schematic of a pulsar showing the
co-rotation of the particles with the neutron star right out to the light cylinder (line indicated with
the arrow) and the emission in the direction of the observer. Credit: NRAO
emits around 10³¹ erg/s in pulsed radiation, while $I\Omega\dot\Omega > 6\times 10^{38}$ erg/s, i.e., although dipole radiation is assumed to be the main braking agent, the observed pulsed emission takes no more than a small fraction of the total energy. The presence of other energy fluxes is necessary, and in
particular there is the unequivocal detection of a particle flow (wind), which collides
with the circumstellar material to produce X-rays, as observed in Fig. 6.20.
Despite these uncertainties and reservations, the dynamic equation (6.39) is widely used in the form

\[ B = 10^{13}\left(\frac{P}{1\ \mathrm{s}}\right)^{1/2}\left(\frac{\dot P}{10^{-13}\ \mathrm{s\,s^{-1}}}\right)^{1/2}\ \mathrm{G} , \tag{6.41} \]
Fig. 6.21 The log Ṗ–log P diagram for pulsars and similar objects [45, Fig. 2]. Lines B = constant
are shown explicitly. Ordinary pulsars are grouped in the central region. Other neutron stars populate
this fundamental diagram: the magnetars in the upper right corner (AXP-SGR, in red) and the
millisecond pulsars (recycled) in the lower left corner. The blue diagonal line is called the “death
line”, and marks where the pulsars are too slow or too demagnetized to produce emission. Note the
set of pulsars associated with a supernova remnant, marked with an ellipse. Credit: V.M. Kaspi
assuming $\sin^2\alpha \sim 1$. With (6.41) and the characteristic age τ of (6.40), we can calculate the
paths of the pulsars in the log Ṗ–log P diagram, which yields straight lines for each
set B = constant (Fig. 6.21). Although the field was expected to decay due to Ohmic
dissipation in the crust, the existence of a large group of pulsars with very high values
of τ and very intense fields has somewhat discredited the field decay theory, but not
completely ruled it out.
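In practice, (6.41) and the characteristic age τ = P/2Ṗ are read directly off the timing data; a minimal sketch using approximately Crab-like values:

```python
YEAR_S = 3.156e7

def b_dipole_gauss(P, Pdot):
    """Dipole field estimate from Eq. (6.41)."""
    return 1e13 * (P / 1.0)**0.5 * (Pdot / 1e-13)**0.5

def tau_char_yr(P, Pdot):
    """Characteristic age tau = P / (2 Pdot), in years."""
    return P / (2.0 * Pdot) / YEAR_S

P, Pdot = 0.033, 4.2e-13    # roughly Crab-like timing values [s, s/s]
print(f"B ~ {b_dipole_gauss(P, Pdot):.1e} G, tau ~ {tau_char_yr(P, Pdot):.0f} yr")
```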
The presence of pulsars detected only in the high-energy bands and not in radio
deserves a comment. While it has been suggested that pulsed radio emission could be incoherent (although the standard picture points to a coherent nature), the optical, X, and γ emissions are surely incoherent, and their intensity is thus proportional to the density of the emitting particles. There are several mechanisms that can explain these emissions.
The most interesting is the presence of thermal radiation from the surface, a residue of the heat content from birth, since it provides information about the cooling processes and therefore the state of the interior. The charged particles in the magnetosphere
and winds can produce incoherent emission. In particular, there are a number of
important detections in the γ bands that should help to explain the classification of
pulsars, but the content of the emission at the higher energies remains open.
In relation to the “other” types of neutron stars shown in Fig. 6.21, the most
extreme is the magnetar. In the decades following their discovery, some sources were
identified which, besides presenting an emission in X-rays much greater than $I\Omega\dot\Omega$ (that is, the energy of the observed emission cannot be obtained from their rotational
energy, which is insufficient), have long periods, longer than 1 s, and values of the
derivatives well above those of ordinary pulsars (of order 10⁻¹⁰ s s⁻¹). Thus, (6.41) shows that their magnetic fields should be 10¹⁴–10¹⁵ G, and for this reason they
became known as magnetars. This class of sources is also observed in γ rays, often
in the form of bursts and intense activity (Fig. 6.22). The idea of the magnetar model
is that the sudden dissipation of magnetic energy is responsible for this phenomenon.
The model was applied to the group called soft-gamma repeaters and anomalous X-
ray pulsars (SGR-AXP), considered to be different manifestations of neutron stars
with extreme magnetic fields [39]. However, there are recent detections such as
SGR 0418+5729, with an estimated magnetic field of 7.5 × 1012 G, much lower
than the others, and they are sometimes detected in radio. This requires a rethink of
the magnetar scenario, since with such a low field it may not be possible to extract
enough rotational energy to explain the observed X-ray emission.
The association of pulsars with supernova remnants is today a well-established
idea, although it has been confirmed in less than 20% of the more than 200 SNR
known in the galaxy. These associations are an important problem. Most remnants
should not be associated with a pulsar, since type Ia supernovas do not produce pulsars
and there should also be cases where the product was a stellar black hole (Chap. 5).
Another important factor is that pulsars are born (on average) with high proper motion
due to the birth process, and reach speeds of the order of 400 km s−1 . Thus, pulsars
often “punch” the edge of the young remnant and escape. Finally, the emission beam
can point away from the Earth (in 70–80% of cases), and identification of the SNR
itself becomes almost impossible after more than 105 yr (Chap. 5). Several remnants
have been associated with magnetars, but there are major problems in confirming
these associations. The attempt to associate magnetars with massive star clusters
that may have been their progenitors also presents problems. Although it has been
suggested that a number of cases indicate progenitors of more than 40 M⊙ (raising the question as to why such massive stars did not form black holes), there is at least
one case in which the cluster still has stars of mass of the order of 17 M⊙ on the MS, casting doubt on the higher masses [40] (unless NS/BH formation is actually an intermittent function of the mass of the progenitor, going from one to the other and back).

Fig. 6.23 The globular cluster 47 Tuc in the optical (left) and X-ray (right). Some of the pulsars belonging to this system are marked with red stars. In the X-ray image we can see these pulsars and other high-energy systems that emit intensely [43]
Finally, Fig. 6.23 also shows the millisecond pulsars, a class that includes the
fastest rotating object known today, PSR J1748-2446ad in the Terzan 5 cluster, with
P = 1.4 ms (or a frequency of 716 Hz). This object is another representative of a class
detected more than 30 years ago, and which contains an important number of pulsars
in globular clusters. As the clusters have not suffered many core-collapse supernovas, an alternative channel of formation by the accretion of matter on top of a white dwarf has
been postulated, known as accretion induced collapse (AIC), where electron capture
should be faster than carbon ignition (see Chap. 5). It is believed that the ultrashort
periods of millisecond pulsars may be due to recycling [41], that is, to the process in which matter accreted from the “normal” companion transfers angular momentum and thus accelerates the rotation. Conditions for the existence of these systems are highly
favorable in clusters, and indeed most of the pulsars detected in them are in the
millisecond class, although there may be some with the original rotation, without
having undergone recycling.
The latest discovery of an entirely new class of compact stars is the so-called Rotating Radio Transients (RRATs), which emit sporadic radio pulses in phase, separated by several hours, and may constitute the dominant population of the disk, given their characteristic ages and detection difficulties. In other words, RRATs would be the overwhelming, almost totally silent population of neutron stars. Neutron stars will continue to be a frontier research topic for decades to come, even more so after the first detection of the merger of two of them in gravitational waves and in radiation across the entire electromagnetic spectrum (Chap. 10).
6.4 Physics and Observational Manifestations of Black Holes
The long history of the black hole idea has two illustrious precursors in the late 18th
century (!). Within a few years of each other, Englishman J. Michell and Frenchman
Pierre-Simon de Laplace (Fig. 6.24) discussed the possibility of dark stars based on
Newtonian ideas about the escape velocity of corpuscles of light from the star surface.
Note that these arguments are based on the Newtonian concept of the corpuscular
nature of light, otherwise there would be no way Newton’s force of gravity could
attract it. But despite this, Michell and Laplace’s reasoning opened the way to the
modern study of black holes and deserves a discussion [44].
As is well known, one uses energy conservation to obtain the critical condition
for a particle to escape from the surface of a body of mass M and radius R :
\[ \frac{1}{2}mv^2 = \frac{GMm}{R} . \tag{6.42} \]
If we set v = c, applicable to light in this Newtonian corpuscular approach, we find
that when the radius reaches the critical value R = 2G M/c2 , the gravitational field
will not allow the particle to escape. Thus, the compact object that reaches this
condition will appear to be “dark”, and invisible to outside observers.
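Setting v = c in (6.42) gives the critical radius R = 2GM/c²; a one-line numerical check (sketch):

```python
G, C, MSUN = 6.674e-8, 2.998e10, 1.989e33   # cgs

def r_crit_km(m_msun):
    """Critical (Schwarzschild) radius R = 2GM/c^2, in km."""
    return 2.0 * G * m_msun * MSUN / C**2 / 1e5

print(r_crit_km(1.0))     # ~3 km for one solar mass
print(r_crit_km(3.5e6))   # ~1e7 km for a Galactic-center-like mass
```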
Two centuries later, when the General Theory of Relativity was formulated, formal
solutions were discovered in which, instead of considering a physical surface for
the emission of light, there is an imaginary surface called the event horizon from
which no point outside can receive any signal from the interior. The exterior and
interior of the horizon are causally disconnected. This is a consequence of the strong
curvature induced by the concentration of mass. Moreover, right in the center there is
Fig. 6.26 Carter diagram [47]. The grey region is the black hole domain, that is, mass compressed
beyond the density inversely proportional to RS3 forms the so-called event horizon (see text). Note
that as the mass increases, the effective density decreases. The Michell–Laplace black hole is
indicated, since they imagined that the density was kept constant at 1 g cm−3 , as in the Sun. The
Universe itself could enter its Schwarzschild radius and form a black hole without us noticing
critical value for R = 2G M/c2 ≡ RS , which by chance is exactly the value of the
so-called Schwarzschild radius obtained rigorously in General Relativity, and which
marks the position of the horizon. We can gain some understanding of RS by noting
that the bracketed term in the denominator of (6.31) is 1 − 2G M/r c2 , and while
2G M/r c2 ≈ 0.1 in neutron stars, black holes with RS = 2G M/c2 make this term
zero, and the TOV description no longer makes sense. Physically, we can think of
neutron stars as tightly “packed”, but the pressure still resists gravitation. On the
other hand, black holes reach a critical “packing” level and all the matter disappears behind the horizon. Thus, we will not need to impose any equation of state, since
black holes result in “pure gravitation”.
Carter’s diagram [47] has on the vertical axis the “density” of the objects. This
may seem a bit strange, since we have just said that matter is not present and has col-
lapsed within the Schwarzschild radius. However, it is always possible to define a for-
mal density ρBH = 3M/4π RS3 which, combined with the definition RS = 2G M/c2 ,
implies that $\rho_{\rm BH} \propto 1/M^2$. Black holes with mass much greater than 10⁶ M⊙ are called supermassive. There is one in the center of our galaxy, in Sgr A*. They are much less dense than water, while a miniature black hole with mass much less than M⊙ is much denser than a neutron star.
Until the second half of the 20th century, isolated black holes were not expected
to be very interesting. On the other hand, those that accrete matter from a companion
(stellar case) or the circumstellar medium (supermassive case) are of great interest,
as we will see below. But even those that do not have a companion were studied
and a very interesting result was obtained: the very intense gravitational field near the horizon acts on the vacuum fluctuations (see Fig. 1.2), and one of the particles in the resulting particle–antiparticle pairs is sometimes absorbed behind the horizon, while the “orphan” companion escapes the system to infinity.
Adding up all the contributions, the total spectrum is thermal, with a temperature
(called the Hawking temperature) inversely proportional to the mass of the black
hole (Fig. 6.27). This is a heuristic justification of the Hawking radiation, expected
theoretically from an otherwise inert black hole.
Using the discussion in Chap. 1, we can substantiate these statements and obtain
the Hawking temperature. The emission of a photon near the horizon implies that
the uncertainty in its position is of the order of the radius RS , that is,
\[ \Delta x \approx R_S = \frac{2GM}{c^2} . \tag{6.43} \]
From this we can immediately calculate the uncertainty Δp in the momentum as

\[ \Delta p \approx \frac{\hbar}{2\Delta x} = \frac{\hbar c^2}{4GM} . \tag{6.44} \]

The typical photon energy is thus $E_\gamma = \Delta p\,c = \hbar c^3/4GM$. Associating a temperature $T_H$ with this characteristic energy and introducing a numerical factor 2π that cannot be easily obtained with this simple estimate, we find
Fig. 6.27 Vacuum fluctuations as the source of Hawking radiation. Most of the time the particle–
antiparticle pair will annihilate without great consequences, as happens in the absence of gravitation.
Sometimes the pair will fall behind the horizon, but it also happens that only one of the particles
is captured while the other escapes. When one sums over the latter particles, one finds that black
holes emit radiation with the spectrum of a black body [48]
\[ T_H = \frac{\hbar c^3}{8\pi G M k_B} . \tag{6.45} \]
That is, the temperature of the black body emission goes as the reciprocal of the
black hole mass. Numerically, we can express this in kelvin as

\[ T_H = 10^{-7}\,\frac{M_\odot}{M}\ \mathrm{K} . \tag{6.46} \]
From (6.46) it is evident that Hawking’s radiation is very weak and totally unob-
servable, unless the evaporating black hole is close to total disappearance (M → 0).
One of Hawking’s suggestions was to monitor very brief gamma ray bursts as a sign
of the end of evaporation.
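The scaling (6.46) follows from evaluating (6.45) numerically; a minimal sketch:

```python
import math

HBAR, G, C, KB = 1.055e-27, 6.674e-8, 2.998e10, 1.381e-16   # cgs
MSUN = 1.989e33

def t_hawking_K(mass_g):
    """Hawking temperature, Eq. (6.45): T_H = hbar c^3 / (8 pi G M k_B)."""
    return HBAR * C**3 / (8.0 * math.pi * G * mass_g * KB)

print(f"{t_hawking_K(MSUN):.1e} K")   # ~6e-8 K, i.e., ~1e-7 K as in (6.46)
print(f"{t_hawking_K(5e14):.1e} K")   # ~2e11 K for the mass of Eq. (6.50)
```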
However, there is another context in which Hawking radiation can be very impor-
tant: for the fate of primordial black holes, produced very early in the history of
the Universe. There are several possible mechanisms for black holes to form. In the
simplest (collapse of large density fluctuations), the candidate mechanism must be
able to create fluctuations ρ/ρ ≥ 1/3 in the nearly homogeneous primordial mat-
ter. The fluctuations detected today on a variety of scales by monitoring the cosmic
background radiation (i.e., temperature fluctuations that reflect fluctuations in den-
sity at the time) are of the order of 10−5 , and it is possible that there are large enough
fluctuations, as required, but that they remain “hidden” at small scales. Regardless
of this, we can still discuss what the evaporation of black holes would be like in the
cosmological context. The Hawking emission, identified with that of a black body with temperature $T_H$, implies a loss of energy from the black hole at a rate

\[ \frac{dE}{dt} = -4\pi R_S^2\,\sigma T_H^4 . \tag{6.47} \]
As we have seen before, RS ∝ M and TH ∝ 1/M, and the rate of loss of mass (energy)
of the black hole is then
\[ \frac{dM}{dt} = -\frac{A}{M^2} . \tag{6.48} \]
On the other hand, primordial black holes are immersed in a very energetic environ-
ment and absorb particles and radiation from the environment. A complete calcula-
tion of the cross-section (which takes into account the fact that black hole gravitation
increases the geometric section) results in σ = (27π/4)RS2 , larger than the geometric
cross-section because of the attractive effect of gravitation. This absorption effect
can only be important in the so-called radiation-dominated era, because when mat-
ter begins to dominate the expansion, the flux of energy falling into the black hole
becomes insignificant. Using the cross-section and the fact that $F \propto c\rho_{\rm rad} \propto T_{\rm rad}^4$,
and assuming that the black holes constitute a dilute gas and do not contribute much
to the dynamics of the Universe, we find
\[ \frac{dM}{dt} = -\frac{A}{M^2} + B\,M^2\,T_{\rm rad}^4 , \tag{6.49} \]
for the evolution of the hole mass with time, where A and B are calculable constants
depending on the time considered and Trad = Trad (t) is the temperature function
derived from Friedmann’s equations for the evolution of the Universe. Note that,
keeping only the first Hawking term, we can integrate to obtain the initial mass of
a black hole that would be evaporating today, corresponding to a point object, but
with a mass similar to that of an asteroid:
\[ M \equiv M_H \approx 5\times 10^{14}\ \mathrm{g} . \tag{6.50} \]
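Keeping only the Hawking term of (6.49), dM/dt = −A/M² integrates to a lifetime t = M_i³/3A; a sketch that infers A from the quoted result (6.50), assuming an age of the Universe of 13.8 Gyr:

```python
T_UNIVERSE_S = 4.35e17     # ~13.8 Gyr in seconds (assumed)
M_H = 5e14                 # initial mass evaporating today, Eq. (6.50) [g]
YEAR_S = 3.156e7

# Lifetime t = M_i^3 / (3A); invert using (6.50) to expose the constant A:
A = M_H**3 / (3.0 * T_UNIVERSE_S)
print(f"A ~ {A:.1e} g^3/s")

def lifetime_yr(m_i):
    """Evaporation time of an initial mass m_i [g] with the same A."""
    return m_i**3 / (3.0 * A) / YEAR_S

print(f"{lifetime_yr(1e15):.1e} yr")   # (1e15/5e14)^3 = 8 times the current age
```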
Inclusion of the absorbed energy leads us to define a curve called the critical mass
MC (t) that separates the regions where the black hole absorbs energy or evaporates
(Fig. 6.28). This stems from the condition dM/dt = 0 in (6.49) [49]. The critical
mass is a property of the environment and, in the radiation-dominated era, has the
value
\[ M_C(t) = 10^{26}\,\frac{T_0}{T_{\rm rad}(t)}\ \mathrm{g} , \tag{6.51} \]
where T0 is the cosmic temperature at which the primordial black holes form. These
developments continue today, and serve, for example, to identify which times and
mechanisms would have given rise to primordial black holes that would evaporate
today or contribute to the observed IR, radio, etc., radiation backgrounds. A complete
overview can be found in [50].
Fig. 6.28 Evolution of a black hole in the early Universe. If formed above the critical mass at the
time, the black hole has almost constant mass (horizontal path) until it reaches the instantaneous
value of MC in a future time, and will start evaporating from this instant. Zel’dovich and others were
concerned about the possibility of black holes growing explosively, thus swallowing a substantial
fraction of the Universe, something that does not happen in practice. The delay before the black
holes begin to evaporate may be substantial [51]
Until the middle of the 20th century, research on black holes had a very different
character from today’s. Mathematical and formal aspects of the solutions, alternative
theories, imaginary experiments, and other problems of this type were discussed.
However, the observation of black holes was never considered, and neither was the
modern idea of the production of black holes in massive star collapses, while the
concept of a supermassive black hole appearing on the right of Carter’s diagram was
never formulated in connection with any observation.
What really boosted the empirical study of black holes was the discovery of
quasars in the 1960s (more details in Chap. 8), since the energy source pointed
to a highly efficient emission mechanism, and giant mass black holes were then
seriously considered for the first time. It was around this time that the physicist John
Archibald Wheeler pulled off a major “publicity stunt” when, in a 1967 lecture, he
called solutions with an event horizon black holes (although there were antecedents
to this name), rather than “frozen stars” as they had been called in Russia. The name
totally changed our view of these objects, and the consideration of quasars brought
them into the realm of reality, while they remained mathematically interesting as well.
A particular aspect of the study of “real” black holes is the behavior of light
when emitted by sources near the horizon, since we know that curvature produces
significant distortions with respect to the usual propagation. Simulated images of the effects of a black hole on light have been produced since then (Fig. 6.29), though for a long time they were not directly observed.
Explorations of the distortion of light by the gravitational field have prompted an
ambitious initiative to image the neighborhood of a black hole’s event horizon. It
Fig. 6.29 Gravitational lensing: the distortion of images by gravitation. Upper: Deflection of
light by the gravitational field of a large object, such as a galaxy or cluster of galaxies, produces
multiple images of a source depending on the exact geometry (rLS , rL , etc.). Credit: Bill Saxton,
NRAO/AUI/NSF. Lower: Example of a nearly complete arc image of a galaxy with a background
distorted by a cluster situated in front of it. The passage of light near the event horizon of a black
hole is an extreme case of this phenomenon, already demonstrated in other extragalactic systems.
Credit: ESA/Hubble & NASA
Fig. 6.30 Left: The Event Horizon Telescope showing some of the instruments [52]. Right: The
aim is to produce images with μarcsec resolution, enough to “see” the supermassive black hole in
Sgr A∗ at the center of our galaxy. Credit: EHT Collaboration
should be possible to observe image distortions directly and even compare various
possibilities that arise from different theories of gravitation against the prediction of
GR. It is clear that this requires a huge angular resolution, since a black hole occupies
little more than a point. These observations are conducted by the network called the
Event Horizon Telescope, using the largest possible baseline, on the order of the
Earth’s diameter (Fig. 6.30). An angular resolution of the order of a μarcsec is required to produce images of the type shown.
In April 2019 the EHT collaboration announced its first concrete result [53], i.e.,
the first image of a black hole horizon, for the black hole at the center of the M87
galaxy (Fig. 6.31). The image shows a bright ring caused by the distortion of light
through the gravitational field of the black hole, and the dark region that is produced
by capturing photons through the horizon, also known as the “shadow” of the black
hole. The horizon is in fact much smaller than the dark region, but the “shadow” is at
the limit of what can be imaged. Comparison with numerical simulations indicates
that General Relativity reproduces the measured distortion well, and there is little or
no evidence for an alternative theory, a result that should be confirmed in the future
for this and other cases when the resolution is improved and can constrain alternative
theories more tightly.
We now describe the two categories of black holes for which we have a certain
amount of information (the first category, already discussed, comprised the primor-
dial black holes, but it remains unconfirmed). The closest and most abundant category
are the black holes produced by high-mass stars, possibly starting out with masses
of 25 M⊙ or more in a prompt fashion, and alternating with neutron stars below this
range. Although this lower limit is uncertain, there is a strong consensus that, above
a certain threshold value, the iron core and the resulting explosion dynamics will not
be able to produce a neutron star, whence the result will be a black hole of stellar
mass. From an observational point of view, the very nature of black holes suggests
that there will only be possibilities of observing them successfully when they are
members of binary systems. These may or may not be in states of accretion, depend-
ing on the evolution of the companion and the orbit. But for those systems in which
it is possible to determine a mass for the compact object, the Rhoades–Ruffini limit
in (6.37) provides a very reliable test to distinguish between a neutron star and a
black hole. Figure 6.32 shows a diagram of a binary sample where this determina-
tion was possible. The objects listed are black hole candidates because they exceed
the Rhoades–Ruffini limit (explicitly indicated). The existence of low-mass black holes and massive neutron stars, which would eliminate the so-called mass gap, is an important topic in compact object Astrophysics.
One of the notable examples of the identification of binary systems in Fig. 6.32 is the extragalactic X-ray binary M33 X-7 (Fig. 6.33). Every 3.45 days the companion, a star of mass around 70 M⊙, is eclipsed by the disk that passes through the line of sight.
Fig. 6.32 Black hole candidates in binary systems. All systems have inferred masses higher than
the Rhoades–Ruffini limit (blue vertical line), allowing their identification. Note the absence of
low-mass candidates, and the maximum value of order 15–20 M⊙ for galactic objects. From [54]
Fig. 6.33 The extragalactic binary M33 X-7 in X-rays. On the left, normal binary emission. On the
right, the eclipse by the disk (the flux does not drop fully to zero because some of the X radiation is scattered out of the disk). Note that, to reproduce the light curve, the semi-axis of the orbit is less
than 1/2 that of the orbit of Mercury [55]. Credit: NASA/CXC/CFA/P. Plucinsky et al. (X-ray),
NASA/STScI/SDSU/J. Orosz et al. (optical)/Science Photo Library
This makes it possible to determine with great accuracy the inclination factor sin i in (6.38), and with it the mass of the primary dark object, giving a result of 15.65 ± 1.45 M⊙. Thus,
this result identifies the primary as a black hole. This corresponds to the last line in
Fig. 6.32 and is one of the most reliable determinations, thanks to the presence of
eclipses.
There are other ways to determine the presence of black holes in systems that
go beyond the use of Kepler’s third law. For example, several X-ray binaries show X-ray bursts, where the “hard” photon count rises very quickly and decreases in a few
Fig. 6.34 X-ray outbursts as a test for the presence of black holes. Left: Typical burst, where the
count increases by at least a factor of 10 and returns to the initial state after about 2 minutes. Right:
L–Porb diagram for a set of sources. The separation into two groups of different luminosity is evident, and happens independently of Porb. Those at the bottom are identified as black holes
[57, Fig. 3]. © AAS. Reproduced with permission
minutes (Fig. 6.34 left). The most widely accepted interpretation is that hydrogen
accumulates on the surface of the star until it reaches the density and temperature
of thermonuclear ignition. X-ray binaries can be analyzed in various ways, but one
revealing diagram is that of luminosity vs. orbital period. When placed in this plane,
a gap is observed between two groups, one more luminous and one much less lumi-
nous in the stationary states. The key observation is that only the brightest group (Fig. 6.34 top right) presents X-ray bursts, while the least bright group never does.
This has led to the interpretation that, since the bursts need the existence of a surface on which hydrogen can accumulate, the brightest group contains neutron stars, while the less luminous group is composed of black holes, which have no surface, so that no bursts can occur. This hypothesis would also explain the difference in stationary
luminosity: the innermost part of the accretion disk would fall within the event hori-
zon and that is precisely where most of the energy is radiated [56]. We see here how
tests can be proposed to discover the presence of black holes in real stellar systems.
Besides the class of stellar mass black holes, we now have evidence for the pres-
ence in the Universe of supermassive black holes, with masses above 106 M . This
class was not considered before the 1960s, and it was precisely the discovery of
quasars that led to its study. These developments will be analyzed in Chap. 8, but we
will discuss one of these cosmic monsters briefly here, of particular interest because
it is located in the center of our own galaxy.
The study of the center of our galaxy is not at all easy, since the region is strongly
obscured by dust and gas. There are some “windows” of low extinction in which
observation is somewhat easier, and of course infrared wavelengths and radio waves
can be used for this research. With the accumulation of information over time, it
became clear that the central parsec contains a very interesting stellar population, in
addition to compact objects and supernova remnants. Studies of the motion of the
most central stars attested to the presence of a compact supermassive object using the method based on Kepler's third law already explained. Figure 6.35 shows the orbits of two particularly interesting stars (S0-2 and S0-102). These have relatively short periods and were well determined after more than 15 years of observations [58]. The direct application of Kepler's law shows that there is an object of mass calculated to be 3.5 × 10⁶ M⊙ sitting almost at the focus of the ellipses (marked with an arrow).
There is no signal at any wavelength that reveals the presence of this object, and for
this reason it is believed to be a giant black hole. There are other proposals that are
more exotic than the black hole, but as there are known to be supermassive black holes
in a huge number of galaxies, there is a strong consensus in favor of this hypothesis
based on kinematic observations.
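To make the method concrete, here is a minimal sketch (ours, with round, S0-2-like numbers rather than the measured orbital elements) of the Kepler mass estimate: given a period P and semi-major axis a, the enclosed mass is M = 4π²a³/GP².

```python
# Enclosed mass from Kepler's third law: M = 4 pi^2 a^3 / (G P^2).
# Illustrative, order-of-magnitude inputs for an S0-2-like orbit.
import math

G = 6.674e-8        # gravitational constant [cm^3 g^-1 s^-2]
MSUN = 1.989e33     # solar mass [g]
AU = 1.496e13       # astronomical unit [cm]
YR = 3.156e7        # year [s]

P = 16.0 * YR       # orbital period, ~16 yr (S0-2-like)
a = 1000.0 * AU     # semi-major axis, ~10^3 AU (illustrative)

M = 4.0 * math.pi**2 * a**3 / (G * P**2)
print(f"Enclosed mass ~ {M / MSUN:.1e} M_sun")   # a few x 10^6 M_sun
```

With these inputs the enclosed mass comes out at a few × 10⁶ M⊙, the order of magnitude quoted above.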
These objects in galactic centers are often active, giving rise to the so-called AGNs
(Chap. 8). But of course the center of the Milky Way is not an active galactic nucleus.
This is because the fall of matter onto the central object is only sporadic, in contrast
with its cosmological relatives. However, it has been possible to directly image
the centers of other galaxies and to check that black holes are present, in some
cases even forming multiple systems (Fig. 6.36).
Other cases where effects directly associated with the presence of a supermassive object have been observed include the so-called Kα iron line, identified with the fluorescence of the K-shell. This emission line is present in X-rays, with an energy of 6.4 keV. Its profile has been studied and is well known. But in at least two extragalactic systems, the line is strongly and asymmetrically distorted (Fig. 6.37). The interpretation is that the emitting material is accreting onto a large-mass black hole, so that the gravitational redshift towards the center appears clearly in the spectrum. It is important to point out that, despite several campaigns to look for more examples, these remain scarce. It is not clear why there should be this rarity, since the accretion phenomenon is very common and must be associated with galactic activity (Chap. 8).
Finally, we will discuss a class of sources discovered in 1994 by L.F. Rodríguez (UNAM, México) and I.F. Mirabel (IAFE, Argentina) that brought a completely different perspective on the nearest black holes. Mirabel and Rodríguez observed 1E 1740.7−2942, an object near the Galactic Center that showed the presence of relativistic jets and radio lobes similar to those observed in the AGNs (Fig. 6.38 left), but on a much smaller scale; the distance inferred for this source is about 8 kpc. Soon after, the same researchers were able to show that a second source (GRS 1915+105) displays so-called superluminal jet motion (Fig. 6.38 right), in which the structures of the jet seem to move away at speeds greater than c due to a projection effect [60]. It was thus demonstrated that the jets were relativistic and that, in general, objects of stellar mass (GRS 1915+105 contains a black hole of estimated mass around 33 M⊙) behave to a large extent like their gigantic "cousins" in the AGNs. Hence the name microquasars, by which they are still known today.
Fig. 6.39 Quasar and microquasar. Left: Comparison identifying the main elements of each system. Credit: I.F. Mirabel. Right: Biological analogy (not to scale) that emphasizes the identity of the structure and processes, but on very different scales
Figure 6.39 compares the morphologies of quasars and microquasars. While in
the case of AGNs the accretion is thought to be “fossil”, from the surrounding envi-
ronment, microquasars are fed by a donor companion. It is important to point out
that, due to the difference in spatial and temporal scales, we can see phenomena that
would otherwise be very slow or distant. For example, the superluminal expansion
analogous to Fig. 6.38 takes many years in the case of an AGN. Thus we have nearby
systems that behave very much like the more remote ones, and hence the possibility
of learning more about the central objects that are identified with black holes.
References
1. E.E. Salpeter, The luminosity function and Stellar Evolution. Astrophys. J. 121, 161 (1955)
2. https://www.wikiwand.com/en/Initial_mass_function
3. S.E. Woosley, A. Heger, T.A. Weaver, The evolution and explosion of massive stars. Rev. Mod.
Phys. 74, 1015 (2002)
4. S.L. Shapiro, S.A. Teukolsky, Black Holes, White Dwarfs and Neutron Stars: The Physics of
Compact Objects (Wiley-VCH, Weinheim, 1991)
5. https://compstar.uni-frankfurt.de/outreach/short-articles/the-hyperon-puzzle/
6. J.B. Holberg, The discovery of the existence of white dwarf stars: 1862 to 1930. J. Hist. Astron.
40, 137 (2009)
7. R.H. Fowler, On dense matter. MNRAS 87, 114 (1926)
8. S. Chandrasekhar, An Introduction to the Study of Stellar Structure (Dover, New York, 2010)
9. R.F. Tooper, Stability of massive stars in General Relativity. Astrophys. J. 140, 434 (1964)
10. S.O. Kepler et al., White dwarf stars. Int. J. Mod. Phys. Conf. Series 45, 1760023 (2017)
11. S.O. Kepler, D. Koester, G. Ourique, A white dwarf with an oxygen atmosphere. Science 352,
6281 (2016)
12. I.-S. Suh, G. Mathews, Mass-radius relation for magnetic white dwarfs. Astrophys. J. 530, 949
(2000). https://doi.org/10.1086/308403
13. E.E. Giorgi et al., NGC 2571: An intermediate-age open cluster with a white dwarf candidate.
Astron. Astrophys. 381, 884 (2002)
14. S.O. Kepler et al., White dwarf mass distribution in the SDSS. MNRAS 375, 1315 (2007)
15. L. Mestel, On the theory of white dwarf stars. I. The energy sources of white dwarfs. MNRAS
112, 583 (1952)
16. H.B. Richer et al., White dwarfs in globular clusters: Hubble Space Telescope observations of
M4. Astrophys. J. 484, 741 (1997). https://doi.org/10.1086/304379
17. A. Kanaan et al., Whole Earth Telescope observations of BPM 37093: A seismological test of
crystallization theory in white dwarfs. Astron. Astrophys. 432, 219 (2005)
18. L.D. Landau, On the theory of stars. Phys. Z. Sowjet. 1, 285 (1932)
19. D.G. Yakovlev et al., Lev Landau and the conception of neutron stars. Phys. Uspekhi 56, 289
(2013)
20. W. Baade and F. Zwicky, op. cit. Chap. 5
21. R. Oppenheimer, G. Volkoff, On massive neutron cores. Phys. Rev. 55, 374 (1939)
22. B.F. Schutz, A First Course in General Relativity (Cambridge University Press, Cambridge,
UK, 2009)
23. R.C. Tolman, Relativity, Thermodynamics and Cosmology (Dover, New York, 2011)
24. M.S. Delgaty, K. Lake, Physical acceptability of isolated, static, spherically symmetric, perfect
fluid solutions of Einstein’s equations. Comp. Phys. Comm. 115, 395 (1998)
25. G. Baym, H.A. Bethe, C. Pethick, Neutron star matter. Nucl. Phys. A 175, 225 (1971)
26. H.A. Bethe, M. Johnson, Dense baryon matter calculations with realistic potentials. Nucl. Phys.
A 230, 1 (1974)
27. F. Weber, Pulsars as Laboratories for Nuclear and Particle Physics (IoP Publishing, Bristol,
UK, 1999)
28. N.U. Bastian et al., Towards a unified quark-hadron-matter equation of state for applications
in Astrophysics and heavy-ion collisions. Universe 4, 67 (2018)
29. C.E. Rhoades Jr., R. Ruffini, Maximum mass of a neutron star. Phys. Rev. Lett. 32, 324 (1974)
30. A.Y. Potekhin, W.C.G. Ho, G. Chabrier, Atmospheres and radiating surfaces of neu-
tron stars with strong magnetic fields. Proceedings of Science (MPCS2015). Preprint at
arXiv:1605.01281
31. P.B. Demorest et al., A two-solar-mass neutron star measured using Shapiro delay. Nature 467,
1081 (2010)
32. arXiv:2011.08157
33. R. Valentim, E. Rangel, J.E. Horvath, On the mass distribution of neutron stars. MNRAS 414,
1427 (2011)
34. M.C. Miller et al., PSR J0030+0451 mass and radius from NICER data and implications for
the properties of neutron star matter. Astrophys. J. Lett. 887, L24 (2019)
35. A. Hewish et al., Observation of a rapidly pulsating radio source. Nature 217, 709 (1968)
36. T. Gold, Rotating neutron stars as the origin of the pulsating radio sources. Nature 218, 731
(1968)
37. F. Pacini, Rotating neutron stars, pulsars and supernova remnants. Nature 219, 145 (1968)
38. F.C. Michel, Theory of Neutron Star Magnetospheres (University Chicago Press, Chicago,
1990)
39. P.M. Woods, C. Thompson, Compact Stellar X-ray Sources (Cambridge University Press,
Cambridge, UK, 2006)
40. B. Davies et al., The progenitor mass of the magnetar SGR1900+14. Astrophys. J. 707, 844
(2009)
41. G.S. Bisnovatyi-Kogan, B.V. Komberg, Pulsars and close binary systems. Sov. Astron. 18, 217
(1974)
42. K. Hurley et al., An exceptionally bright flare from SGR 1806–20 and the origins of short-
duration gamma-ray bursts. Nature 434, 1098 (2005)
43. J.E. Grindlay et al., High-resolution X-ray imaging of a globular cluster core: Compact binaries
in 47Tuc. Science 292, 2290 (2001)
44. C. Montgomery, W. Orchiston, I. Whittingham, Michell, Laplace and the origin of the black
hole concept. Jour. Astron. Hist. Heritage 12, 90 (2009)
45. V.M. Kaspi, Grand unification of neutron stars. PNAS 107(16), 7147–7152 (2010)
46. V.M. Kaspi, Diversity in Neutron Stars: X-Ray Observations of High-Magnetic-Field Radio
Pulsars (American Astronomical Society, 2011)
47. B. Carter, Half century of black-hole theory: from physicists’ purgatory to mathematicians’
paradise, in AIP Conference Series, Vol. 841, Eds L. Mornas and J. Diaz Alonso, p. 29.
arXiv:gr-qc/0604064
48. S.W. Hawking, Black hole explosions? Nature 248, 30 (1974)
49. P.S. Custódio, J.E. Horvath, Evolution of a primordial black hole population. Phys. Rev. D 58,
023504 (1998)
50. B. Carr, F. Kuhnel, M. Sandstad, Primordial black holes as dark matter. Phys. Rev. D 94, 083504
(2016)
51. P.S. Custódio, J.E. Horvath, The evolution of primordial black hole masses in the radiation-
dominated era. Gen. Rel. Grav. 34, 1895 (2002)
52. https://eventhorizontelescope.org/
53. The Event Horizon Telescope Collaboration, First M87 Event Horizon Telescope results. I.
The shadow of the supermassive black hole. Astrophys. J. Lett. 875, L1 (2019)
54. https://stellarcollapse.org/bhmasses
55. J.A. Orosz et al., A 15.65-solar-mass black hole in an eclipsing binary in the nearby spiral
galaxy M 33. Nature 449, 872 (2007)
56. R. Narayan, J. Heyl, On the lack of type I X-ray bursts in black hole X-ray binaries: Evidence
for the event horizon? Astrophys. J. Lett. 574, L139 (2002)
57. R. Narayan, Black holes in Astrophysics. New J. Phys. 7, 199 (2005). https://doi.org/10.1088/
1367-2630/7/1/199
58. A.M. Ghez et al., Measuring distance and properties of the Milky Way’s central supermassive
black hole with stellar orbits. Astrophys. J. 689, 1044 (2008)
59. Y. Tanaka et al., Gravitationally redshifted emission implying an accretion disk and massive
black hole in the active galaxy MCG-6-30-15. Nature 375, 659 (1995)
60. I.F. Mirabel, L.F. Rodríguez, A superluminal source in the Galaxy. Nature 371, 46 (1994)
Chapter 7
Accretion in Astrophysics
where the last term is the "centrifugal" term induced by rotation. Thus, using the relationships arising from Kepler's law we can make the potential dimensionless by replacing x → x/a to give

$$\Phi_{\rm eff}(x, y, z) = -\frac{2}{(1+q)\,r_1} - \frac{2q}{(1+q)\,r_2} - \left[\left(x - \frac{q}{1+q}\right)^{2} + y^{2}\right] , \qquad (7.2)$$
which is independent of the individual masses and orbit size, since q is the only
parameter that appears here. The solution of any problem involving (7.2) will be
universal, because there is no reference to the dimensions in it. Once solved in terms
of dimensionless quantities, it will be enough to restore the necessary dimensions
for the specific problem at hand.
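To illustrate this universality, a minimal sketch (ours, not part of the original treatment) of (7.2), with the primary at the origin, the secondary at x = 1, and all lengths in units of the separation a:

```python
# Dimensionless Roche potential, Eq. (7.2): primary M1 at the origin,
# secondary M2 at (1, 0, 0); q = M2/M1 is the only free parameter.
import math

def phi_eff(x, y, z, q):
    """Effective potential in the dimensionless units of Eq. (7.2)."""
    r1 = math.sqrt(x**2 + y**2 + z**2)          # distance to M1
    r2 = math.sqrt((x - 1.0)**2 + y**2 + z**2)  # distance to M2
    return (-2.0 / ((1.0 + q) * r1)
            - 2.0 * q / ((1.0 + q) * r2)
            - ((x - q / (1.0 + q))**2 + y**2))  # centrifugal term

# The same dimensionless solution serves any binary with this q:
print(phi_eff(0.5, 0.0, 0.0, q=0.5))
```

Restoring physical units for a specific binary then amounts to multiplying lengths by a and the potential by the normalization G(M1 + M2)/2a implicit in (7.2).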
A cross-section of the potential is shown in Fig. 7.2. When placed in this potential, any particle with enough energy to reach the "top" will inevitably fall to the other side. This means that when the deformation of the star is too big, matter (gas) will escape from it (M2) and fall into the potential well of the other. Stellar evolution will naturally bring the secondary M2 into this situation, and thus trigger the mass transfer.
A complementary perspective on this problem is given by calculating the points where a particle is subject to a balance of forces, i.e., the solutions of ∇Φ_eff = 0. In
the (x, y) plane, the points that satisfy this equation are called Lagrangian points. The most important for our discussion is L1, where the competing pulls of the two masses and the centrifugal force balance. But the other Lagrangian points are also relevant in practice. For
example, the WMAP satellite mission and the Herschel Observatory were placed
in orbits corresponding to point L 2 of the Sun–Earth system in order to maintain
their orientation towards the Sun and to observe pointing in the opposite direction.
Points L 4 and L 5 are a direct consequence of the rotation in the effective potential,
without which there would be no forces to compensate gravitation. The so-called
Trojan asteroids in the Sun–Jupiter system orbit at precisely these points.
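Along the x-axis joining the stars, ∇Φ_eff = 0 reduces to a single equation, so L1 can be located by bisection. A short, self-contained sketch under the same conventions as (7.2) (primary at the origin, secondary at x = 1):

```python
# Locate the inner Lagrangian point L1 by solving d(Phi_eff)/dx = 0
# on the segment between the two stars (dimensionless units of Eq. 7.2).
def dphi_dx(x, q):
    """x-derivative of the potential (7.2) on the axis, 0 < x < 1."""
    return (2.0 / ((1.0 + q) * x**2)
            - 2.0 * q / ((1.0 + q) * (1.0 - x)**2)
            - 2.0 * (x - q / (1.0 + q)))

def find_L1(q, a=1e-6, b=1.0 - 1e-6, tol=1e-12):
    # dphi_dx -> +inf near M1 and -inf near M2, so the sign change
    # brackets L1 and plain bisection converges.
    while b - a > tol:
        m = 0.5 * (a + b)
        if dphi_dx(m, q) > 0.0:
            a = m
        else:
            b = m
    return 0.5 * (a + b)

print(find_L1(q=1.0))   # symmetric binary: L1 exactly at x = 0.5
print(find_L1(q=0.1))   # L1 shifts toward the less massive star
```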
In general, it is the joint existence of Lagrangian points and stellar evolution that leads to the phenomenon of accretion. We have already seen in Chap. 4 that solar-type stars "swell" considerably when they leave the Main Sequence, since they need to satisfy Virial equilibrium and energy conservation simultaneously. Thus, the gas in the atmosphere of the star at some point reaches L1 and one says that the secondary "fills its Roche lobe".
This is the moment where accretion begins in the binary system.
Of course, this is only possible if the deformation of the secondary star is very large. In fact, the atmosphere fills an equipotential like those marked in light blue in Fig. 7.3. The exact calculation of the shape of these equipotentials is difficult, but there is a numerical fit due to Eggleton [2] for the quotient of the Roche lobe radius RL and the semi-major axis a as a function of the mass ratio q, which turns out to be very precise in any situation and is used for a variety of calculations:

$$\frac{R_L}{a} \approx \frac{0.49\, q^{2/3}}{0.6\, q^{2/3} + \ln\left(1 + q^{1/3}\right)} . \qquad (7.3)$$
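In practice (7.3) is a one-line function; a minimal sketch:

```python
# Eggleton's fit (7.3): Roche-lobe radius over orbital separation,
# as a function of the mass ratio q of the lobe-filling star.
import math

def roche_lobe(q):
    q23 = q**(2.0 / 3.0)
    return 0.49 * q23 / (0.6 * q23 + math.log(1.0 + q**(1.0 / 3.0)))

for q in (0.1, 0.5, 1.0, 2.0):
    print(f"q = {q:3.1f}  ->  R_L/a = {roche_lobe(q):.3f}")
```

For equal masses (q = 1) the fit gives R_L/a ≈ 0.38, so each lobe spans a little over a third of the separation.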
Depending on the nature of each component (compact object or non-degenerate star), there are several possibilities for filling. Thus, binaries can be classified as follows by the way they fill their respective Roche lobes (Fig. 7.5):
1. Separated (detached). Neither member fills its Roche lobe. This type of binary is
frequent for pairs with a long period, but can be converted to another type later
depending on the evolution of the members.
2. Semi-detached. The secondary fills its Roche lobe and there is transfer of mass
from it to the primary. Typical case of a binary with a compact object as primary.
The prototype is Algol (β Persei), one of the first variable stars identified in the
history of Astronomy.
3. Contact. The two stars fill their Roche lobes and the gas from one or the other flows through the point L1. The prototypical example is W Ursae Majoris.
4. Common envelope. The two stars overflow their respective Roche lobes and thus
share the atmosphere. This type of binary has a very short duration, a few years
at most, before expelling the envelope or merging.
At this point it is instructive to take a look at Fig. 7.13, where the secondary Roche lobes in the systems containing a black hole are shown to scale. The physical dimensions of the disks are also important because they determine the existence of eclipses, as we discussed in the case of M33 X-7.
In general, the fall of matter on the primary leads to one of the most common
phenomena in Astrophysics: the so-called accretion disks. It is not so obvious that
matter should form a disk. In fact, it could fall in an isotropic (spherical) manner. But
the main factor determining each possibility is the angular momentum of the gas in the
potential: spherically symmetrical accretion requires the gas to have zero total angular
momentum, which is very unlikely to happen. With this caveat, we first present the
elementary treatment of spherical accretion (zero total angular momentum) before
discussing the disks.
In the simplest form of spherical accretion, called Bondi–Hoyle accretion, an
object of mass M travels through the interstellar medium with a velocity v, in a
subsonic (v < cs ) or supersonic (v > cs ) way. The isotropic fall of particles onto the
object causes an addition of mass that obeys the simple expression

$$\dot{M} \approx \pi R^2 \rho v , \qquad (7.4)$$

where the right-hand side is just the mass flux ρv multiplied by the geometrical cross-section πR². Equation (7.4) uses the speed of the object v if the motion is supersonic,
while the speed of sound c_s must be inserted in place of v if it is subsonic. Note that the radius R in (7.4) is not exactly the radius of the object, but rather an effective radius (the Bondi radius), determined by the condition that the escape velocity there equals the speed of sound, √(2GM/R) = c_s. This defines the Bondi radius R_B = 2GM/c_s², and substituting it into (7.4) we obtain

$$\dot{M} \approx \frac{4\pi \rho\, G^2 M^2}{c_s^3} . \qquad (7.5)$$
This formula shows that the faster the object moves (or the hotter the medium), the less mass will be accreted, while the rate depends quadratically on the central mass. The full problem is much more complicated and needs a detailed treatment in the neighborhood of the object, where the accreted gas suffers discontinuities in density that appear because the supersonic flow has to "fit" the boundary conditions, but we will not address these complications in our discussion.
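A minimal numerical sketch of (7.5) for a compact object immersed in typical interstellar gas (the values are illustrative, and prefactors of order unity differ between treatments):

```python
# Bondi accretion rate, Eq. (7.5): Mdot ~ 4 pi rho G^2 M^2 / cs^3.
# Replace cs by v for supersonic motion. Illustrative ISM values.
import math

G = 6.674e-8      # [cm^3 g^-1 s^-2]
MSUN = 1.989e33   # [g]
YR = 3.156e7      # [s]

rho = 1.67e-24    # ~1 hydrogen atom per cm^3 [g cm^-3]
cs = 1.0e6        # sound speed ~10 km/s [cm s^-1]
M = 1.4 * MSUN    # e.g., a neutron star

mdot = 4.0 * math.pi * rho * G**2 * M**2 / cs**3
print(f"Mdot ~ {mdot:.1e} g/s ~ {mdot * YR / MSUN:.1e} M_sun/yr")
```

The result, ~10⁻¹⁴ M⊙ yr⁻¹, shows why isolated accretion from the interstellar medium is so inefficient compared with Roche-lobe overflow in binaries.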
The more general case of non-spherical accretion begins with the observation that,
when the angular momentum is not zero, its (approximate) conservation will cause a
flattening of the gas flow near the attracting mass M. As the gas must be able to cool
down faster than it loses angular momentum (this is a precondition for it to fall), a
flattened rotating structure is formed which we call a disk [3].
The example shown in Fig. 7.6 is the disk of the object Herbig–Haro 30. These
systems correspond to stars that are just forming and where the central core is
accreting matter from the cloud that builds the star. The disk is in the equatorial
plane, perpendicular to the proto-stellar jets, as expected.
Being a little more specific, we can study when the disk will form in terms of
the angular momentum J , equating the gravitational force to the centrifugal force
resulting from it:
$$\frac{G M m}{R^2} = m \omega^2 R . \qquad (7.6)$$
This conclusion follows from the fact that, since λ_mol ≈ 10⁻² cm and the speed of sound in the medium is c_s ≈ 10⁶ cm s⁻¹, the molecular viscosity ν_mol ≈ λ_mol c_s implies a characteristic time for changes in the disk

$$\tau_{\rm mol} = \frac{R^2}{\nu_{\rm mol}} \sim 10^8\ {\rm yr} . \qquad (7.7)$$
On the other hand, observations of real systems with accretion disks show important variations on the order of weeks(!). Thus, it is very likely that there are viscosity sources orders of magnitude stronger acting in the disk, in such a way as to produce these variations. The exact physical origin of this viscosity remains uncertain.
To solve this problem and construct a model of the disk, Shakura and Syunyaev [4] suggested dealing with the low value of the molecular viscosity as follows: such a low viscosity ν_mol will necessarily cause the flow regime to change from laminar to turbulent. The reason is that the Reynolds number Re = vL/ν_mol ≈ 10¹⁰ is huge, far above the threshold beyond which a flow can no longer remain laminar. Thus, they deduced that the main effect on the disk would be due to a turbulent viscosity ν_turb. However, the theory of turbulence is complicated and the viscosity cannot be calculated from first principles. Shakura and Syunyaev [4] therefore parametrized the turbulent viscosity as ν_turb = α c_s H, where H is a height scale for the pressure in the disk, i.e., the distance over which the pressure in the transverse direction drops appreciably. Using the hydrostatic equilibrium equation, the continuity equation, the energy balance, etc., giving 6 differential and algebraic equations in all, the disk problem can be solved with α ≲ 1 as a parameter. The physically important hypotheses for these so-called α-disks are:
these so-called α-disks are:
• The gravitational field is due to the central object alone, i.e., the disk is not “self-
gravitating”.
• The disk is geometrically thin, but optically thick.
• Hydrostatic balance determines the vertical structure, and in particular, the height
scale H .
• There are no external winds or torques on the axisymmetric disk.
Such α-disks are widely used in the absence of a more precise model for the sources of turbulence and other effects. In fact, the initial proposal was later replaced by the hypothesis that the viscosity is proportional to the gas pressure alone, since if it were proportional to the total pressure, which includes radiation, the disk would be unstable.
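To see why a turbulent viscosity resolves the timescale problem, one can compare (7.7) with the α-prescription; the numbers below are illustrative (R, H, and α are assumed values, not fits to any source):

```python
# Molecular vs. alpha-parametrized viscous timescales, tau = R^2 / nu.
YR = 3.156e7          # [s]

R = 1.0e10            # disk radius [cm]
cs = 1.0e6            # sound speed [cm s^-1]
lam = 1.0e-2          # molecular mean free path [cm]
alpha, H = 0.1, 0.1 * R

nu_mol = lam * cs                # kinetic-theory estimate
nu_turb = alpha * cs * H         # Shakura-Syunyaev prescription

print(f"tau_mol  ~ {R**2 / nu_mol / YR:.1e} yr")          # ~10^8 yr
print(f"tau_turb ~ {R**2 / nu_turb / 86400.0:.0f} days")  # ~weeks
```

The α-viscosity shortens the characteristic time from ~10⁸ yr to days or weeks, precisely the range of the observed variations.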
Before proceeding, it is important to note that accretion cannot be arbitrarily intense, since the matter being accreted feels the pressure of the emitted radiation. If the radiation pressure that results from the accretion itself becomes too high, the accretion stops. There is therefore a maximum luminosity for any object that accretes mass, which can be obtained by equating the rate of momentum transferred by the radiation through the (Thomson) cross-section σ_T with the inward "pull" of gravitation, i.e.,

$$\frac{dp}{dt} = \sigma_T \times \frac{L}{4\pi c r^2} . \qquad (7.8)$$
Equating this to the gravitational attraction GMm_p/r² acting on each accreted proton–electron pair and solving for L, we obtain

$$L_E = \frac{4\pi G M m_p c}{\sigma_T} \approx 1.3 \times 10^{38} \left(\frac{M}{M_\odot}\right) {\rm erg\ s^{-1}} , \qquad (7.9)$$

known as the Eddington luminosity, the maximum possible value allowing the disk to remain bound. As a corollary of this idea, we see that every explosive phenomenon (for example, X-ray bursts, etc.) must be super-Eddington [5].
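A short sketch evaluating (7.9) for representative accretors:

```python
# Eddington luminosity, Eq. (7.9): L_E = 4 pi G M m_p c / sigma_T.
import math

G = 6.674e-8          # [cm^3 g^-1 s^-2]
MSUN = 1.989e33       # [g]
MP = 1.673e-24        # proton mass [g]
C = 2.998e10          # [cm s^-1]
SIGMA_T = 6.652e-25   # Thomson cross-section [cm^2]

def l_edd(m_msun):
    return 4.0 * math.pi * G * (m_msun * MSUN) * MP * C / SIGMA_T

for m in (1.4, 10.0, 1.0e8):   # neutron star, stellar BH, AGN
    print(f"M = {m:.1e} M_sun -> L_E ~ {l_edd(m):.1e} erg/s")
```

The linear scaling L_E ≈ 1.3 × 10³⁸ (M/M⊙) erg s⁻¹ is worth memorizing: it sets the ceiling for steady accretion at every mass scale.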
As listed above, the second hypothesis put forward by Shakura and Syunyaev, that the disk should be optically thick, implies that it will emit as a whole like a black body, with peak temperature given by L_disk = 4πR²σT⁴_peak. If we suppose that this temperature corresponds to the matter emitting in the last stable orbit with R = 3R_S (Chap. 6), and that the maximum luminosity is the Eddington luminosity L_E, we can invert to find T_peak as a function of the mass of the central object:

$$T_{\rm peak} = 2 \times 10^{7} \left(\frac{M}{M_\odot}\right)^{-1/4} {\rm K} . \qquad (7.10)$$
Thus, we see that the highest temperatures are reached for the less massive objects.
For example, a supermassive black hole disk will emit in the UV, but a microquasar
disk will do so mainly in X-rays. Note that this statement applies only to the emission
of the disk, while other components such as a corona and jets may be present and
add to the total radiation emitted in other bands.
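Evaluating (7.10) for the two classes just mentioned (a sketch with illustrative masses):

```python
# Peak disk temperature, Eq. (7.10): T_peak = 2e7 (M/M_sun)^(-1/4) K,
# for Eddington-limited emission from the last stable orbit R = 3 R_S.
def t_peak(m_msun):
    return 2.0e7 * m_msun**(-0.25)

for m, label in ((10.0, "microquasar"), (1.0e8, "AGN")):
    print(f"{label:12s} M = {m:.0e} M_sun -> T_peak ~ {t_peak(m):.1e} K")
```

A 10 M⊙ accretor peaks near 10⁷ K (soft X-rays), while a 10⁸ M⊙ supermassive black hole peaks near 2 × 10⁵ K, in the ultraviolet.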
With the advances in the study of the fundamental Physics of accretion it
became clear that the viscosity can be attributed to the presence of magnetic fields.
Magnetized disks are subject to magnetohydrodynamic (MHD) instabilities that
may be responsible for the observed changes in accretion state, like those shown
in Fig. 7.8.
One important feature that is known to be present in compact objects such as
neutron stars is a high magnetic field. The magnetic field of the central object is
expected to affect the accretion itself, because at some point the magnetic pressure
would oppose the further fall of matter. Assuming a dipolar structure for the field, B ∝ (R/r)³ B₀, where B₀ is the intensity at the surface r = R, this magnetic pressure
Fig. 7.8 Left: Changes in the accretion disk observed on weekly time scales. The disk goes from
emitting as a black body to a state where the emission is a power law, with alteration to its structure.
Right: Calculated curves due to Pessah, Chan, and Psaltis, taking into account turbulent MHD
instabilities, compared with the Shakura–Syunyaev parametrization (red line). These differences
are surely what lies behind the sudden changes observed. Credit: Fig. 1 of [6]
reads

$$P_{\rm mag} = \frac{B^2}{8\pi} = \frac{B_0^2}{8\pi} \left(\frac{R}{r}\right)^{6} . \qquad (7.11)$$
At some point near the accreting object, the magnetic pressure will dominate and
the matter will obey the dynamics imposed by the spatial structure of the magnetic
field. This point is called the Alfvén radius, given by
$$r_A = \left(\frac{8\pi^2 R^{12} B_0^4}{G M \dot{M}^2}\right)^{1/7} . \qquad (7.13)$$
Since the central object is generally rotating, there is another relevant quantity for the accretion onto a magnetized compact star, namely the corotation radius, denoted by r_co: the radius at which matter corotating with the stellar angular velocity Ω moves at the local Keplerian speed,

$$r_{\rm co} = \left(\frac{GM}{\Omega^2}\right)^{1/3} . \qquad (7.14)$$
This has to be compared with the Alfvén radius to gauge the combined effects of the magnetic field and the rotation: if r_A < r_co the matter accretes onto the object, but if r_A > r_co, the matter will be ejected. This last situation is called a propeller, in which the mass
gain by the central object is zero. It should be noted that, in the case of accretion, the accreted matter will be funneled to the magnetic poles at distances between the Alfvén radius and the stellar surface R. Since the magnetic poles in rotating objects are in general not aligned with the rotation axis, we should observe a periodic modulation of the (bremsstrahlung) X-rays with the rotation period of the central object. This is exactly what happens in many X-ray pulsars.
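The comparison of (7.13) and (7.14) is easily automated; a sketch with illustrative X-ray-pulsar parameters (B₀, Ṁ, and the spin periods below are round numbers, not a fit to any particular source):

```python
# Accretion vs. propeller: compare the Alfven radius (7.13) with the
# corotation radius (7.14) for a magnetized neutron star (cgs units).
import math

G, MSUN = 6.674e-8, 1.989e33

def r_alfven(B0, R, M, mdot):
    return (8.0 * math.pi**2 * R**12 * B0**4 / (G * M * mdot**2))**(1.0 / 7.0)

def r_corot(M, P):
    omega = 2.0 * math.pi / P
    return (G * M / omega**2)**(1.0 / 3.0)

M, R, B0, mdot = 1.4 * MSUN, 1.0e6, 1.0e12, 1.0e17
for P in (1.0, 10.0):                      # spin period [s]
    rA, rco = r_alfven(B0, R, M, mdot), r_corot(M, P)
    regime = "accretion" if rA < rco else "propeller"
    print(f"P = {P:4.1f} s: r_A = {rA:.1e} cm, r_co = {rco:.1e} cm -> {regime}")
```

For these numbers a fast (1 s) rotator acts as a propeller, while a slower (10 s) one accretes and can shine as an X-ray pulsar.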
Our discussion above has been quite general and most of it can be applied to any
accreting binary. However, as we have a special interest in binaries where at least one
of the members (the primary) is a compact object, we can use the previous concepts
to classify and understand observations. We will now carry out a brief overview of
these systems.
This class of variables containing a white dwarf, the cataclysmic variables (CVs), has been known for several centuries, originally through the events known as novae: stars that greatly increase their usual brightness and then return to their previous state. These events consist of outbursts in which the brightness of the system can increase by 10 magnitudes or more. It was later found that this class contains several subgroups, all consisting of a semi-detached binary in which M1 is a white dwarf.
The mechanism by which the nova produces an optical outburst is now well
accepted: the accreted hydrogen accumulates on the surface and reaches an ignition
condition, producing the event. Different systems are characterized by different
secondaries, WD magnetization, etc. Note that, if the WD retains a fraction of the
accreted mass, it may become an SNIa progenitor if it approaches the Chandrasekhar
limit. This is in fact the proposed fate of V4444 Sgr, shown in Fig. 7.9.
The CV zoo contains several interesting species that can reveal details of the
accretion process and its consequences, e.g., the outbursts, but we will not discuss
them here. We recommend the article by R. Connon Smith [7] for an in-depth
discussion.
i.e., the kinetic energy of the accreted gas is converted into radiation and heat on or near the surface of the compact object. We are not considering here the possibility of optically thin accretion disk states, for which the estimate (7.15) is not valid. Note that the accumulation of gas, a necessary condition for thermonuclear bursts, is possible only if there is a surface, so black holes cannot have such bursts (Chap. 6). There are several striking differences between low-mass X-ray binaries (LMXB) and high-mass X-ray binaries (HMXB), as noted in Table 7.1 [5].
On general grounds we can say that the nature of the secondary star mainly determines the type of binary. In the HMXB, the intense magnetic field of the neutron star causes the gas to fall essentially through the poles, channeling the flow (and explaining the presence of X-ray pulsations resulting from stellar rotation). The weaker field of the LMXB does not favor pulsations (Fig. 7.10), but the accretion flow presents a series of quasi-periodic oscillations (QPO) in the spectrum that reveal important signatures of the inner regions of the disk, and possibly of the compact star itself, a topic of great interest to the community.
An extreme example of systems that started as a type of LMXB but have reached a very advanced stage of evolution is provided by the so-called black widows (Fig. 7.11). The discovery of these systems [8] immediately led to the construction of a model in which the observed eclipses were associated with the evaporation of the companion by the pulsar wind. This wind-stripped matter is easily visible in images, giving the evaporating star the aspect of a comet. In fact, this is exactly what happens with cometary nuclei when they approach the Sun, only on a much higher energy scale. The name "black widow" signals the fact that the very star that is being evaporated was responsible for re-energizing the pulsar in a previous phase. The current mass of the companion in this and similar systems is 10⁻²–10⁻³ M⊙, and it will possibly be completely evaporated in the future. Years after the original discovery, an Australian group found similar binary systems, but where the mass of the companion was of order 0.1 M⊙, and named them redbacks (after an Australian spider related to the black widow species). The evolutionary explanation for these systems appeals to X-ray irradiation of the companion in the redback phase and the evolution of the secondary until it later reaches the condition of degeneracy: degenerate matter expands as mass is ablated by the wind, so the object is subject to further evaporation. This makes some redbacks the progenitors of black widow systems, with a unique evolution combining irradiation and wind in systems that reach orbits with periods of a few hours [9]. These systems remain an area of intense research aimed at understanding and confirming their structure and evolution.
Fig. 7.12 The kind of spectral changes seen in some X-ray sources, e.g., GRO J1655-40. Left: A so-called high-soft state dominated by thermal emission from the disk. Right: A low-hard state with a non-thermal component extending typically above 100 keV. These changes recur frequently in the same source and are thought to reflect changes in the disk
We see here how important it is to know the state of the disk, both for its own sake and when we use it to diagnose the central object.
Figure 7.13 shows to scale the binary systems in which a black hole is considered to be present [12], usually because the mass estimated from Kepler's third law exceeds the Rhoades–Ruffini limit given in (6.37). Due to the different parameters of binary formation and the various evolutionary states of the companions, black holes that emit X-rays form a particularly interesting set when viewed side by side, as in Fig. 7.13, where the various systems have been classified according to the color of the companion star (obtained from the effective temperature), the size of the measured orbit, and the observed scale of the accretion disk.
Chapter 8
Active Galactic Nuclei (AGNs)
As we saw in Chap. 2, the launch of the first satellites carrying X-ray detectors into space in the 1960s was a game-changer in high-energy Astrophysics. Added to the prodigious development of radio telescopes, a flood of completely unexpected discoveries revealed the existence of large amounts of energy emitted by objects with a stellar appearance, but whose extragalactic origin was eventually established. These were thus called quasi-stellar objects (QSO) or quasars. The first of these objects, and still one of the most famous examples of a quasar, 3C 48, was discovered in the 1960s. Initially, it could not be classified on the basis of its observed spectrum: a series of rather broad lines in positions that did not correspond to known elements left astronomers puzzled. Soon after, similar objects were found. In fact, the original name "quasar" already shows how difficult it was to understand these objects, since they look like any ordinary star to an optical telescope.
However, an in-depth study of their spectral lines, produced by atomic absorption
and emission, which removes or adds photons in certain regions of the wavelength
range, showed that these lines were actually due to known transitions of ordinary
chemical elements. It was just that they were located in very different positions
from the usual ones, being uniformly shifted toward the red region of the spectrum
(Fig. 8.1). This uniform displacement was soon attributed to the expansion of the
Universe, which in turn implied that quasars were located at cosmological distances,
with the emitter gas corresponding to some type of galaxy. The work that allowed this
conclusion had begun 50 years earlier when V. Slipher measured the radial velocities
of several “nebulae” [1]. It was later shown that the “nebulae” were actually giant
stellar systems (galaxies), and that the expansion of the Universe would cause what
appeared to be a “Doppler effect” between them and the Earth due to their relative
velocity v. This was thus observed as a uniform displacement of the lines from the
laboratory position λ0 to the observed position λ according to [2]
$$z = \frac{\lambda - \lambda_0}{\lambda_0} = \frac{v}{c} . \qquad (8.1)$$
Fig. 8.2 Left: The quasar HE1239-2426 in the center of the host galaxy, revealed in high-resolution
images. Right: Example of extragalactic jets in the quasar 3C 175. Credit: J. Bahcall (IAS, Princeton),
M. Disney (Univ. Wales), NASA
The question that initially motivated these studies was: what makes QSOs visible?
In other words, what is their energy source? After a long scientific controversy,
working through all possible hypotheses, including those that made black holes
responsible for the energy emission, the cosmological origin of quasars was finally
established when a supernova was observed in one of them. Thus, it was proven that
each quasar resided in some kind of galaxy in which there were also normal stars such
as the progenitor of the supernova that exploded. Today we can see some of these
galaxies in deep images like those in Fig. 8.2. Naturally, the stars in these galaxies
were not visible individually because of the enormous distances involved. Relativistic
particle jets that emerge from these regions and propagate through many kpc were
also observed in AGNs as an additional product of mass accretion onto the black hole
(Fig. 8.2). They are formed by matter that escapes perpendicular to the plane of the
accretion disk in the neighborhoods of the central black hole, as discussed in the last
Chapter. Everything pointed to the presence of the largest accreting systems in the
entire Universe. It is the accretion of gas and stars by supermassive black holes that
makes us see a quasar, while most of the time the host galaxy is invisible because it
is very faint at typical distances [3].
To establish that the quasar energy source is not the annihilation of matter with
antimatter as was initially thought, but rather the accretion of gas onto a gigantic
black hole, it was important to estimate the mass of the black hole. Let us see how
this is done in the case of AGNs. The luminosity produced by matter falling at a rate
Ṁ is
$$L = \frac{G M \dot{M}}{2r} = 2\pi r^2 \sigma T^4 , \qquad (8.2)$$
where the factor of two arises from the conversion of energy according to the Virial
theorem (4.27), energy which is then radiated by both faces of the disk behaving as
a black body. Equation (8.2) can be inverted to obtain
Fig. 8.3 The galaxy Cygnus A in the optical band (left) and X-ray/radio (right) bands. Although
to conventional telescopes there seems to be nothing special about it, when investigated in other
bands both the X-ray emission (blue) and the emerging jets (red) place it firmly in the active galaxy
class. Credits: X-ray NASA/CXC/SAO; visible light NASA/STScI; radio NSF/NRAO/AUI/VLA
$$T \propto (M \dot{M})^{1/4}\, r^{-3/4} . \qquad (8.3)$$
Recalling that matter follows spiral paths until it suddenly falls when it reaches the
radius of the innermost stable circular orbit (ISCO), which is 3RS if we do not take
into account the spin of the black hole, we come to the conclusion that the hardest
radiation comes from the inner edge of the disk, while the optical and infrared
signals are produced in the regions farthest out, where the temperature has dropped
sufficiently. This applies to the disk radiation, but we have already seen that the radio
emission is almost certainly due to the synchrotron provided by the jet electrons
(although it is not totally clear where the magnetic field B that makes it possible
comes from). This radio emission is also closely related to the harder emission in
gamma rays, since the very energetic electrons from the jets collide with photons and
transfer their energy to them by the inverse Compton process discussed in Chap. 2.
But note that the exact location of these “soft” photons is still a subject of debate.
In any case, to reproduce the observed luminosities and assuming that the emission
occurs at Eddington’s maximum rate, (8.2) with Ṁ = ṀEddington indicates that the
central objects must have a mass M of several millions or even billions of solar
masses in some cases. This means that we are dealing with the black holes described
as supermassive, already located in the Carter diagram Fig. 6.26, and totally unrelated
to the Stellar Evolution process.
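As a rough numerical check (ours, not a calculation from the book), inverting the Eddington luminosity (7.9) gives the minimum mass able to radiate a given observed luminosity:

```python
# Minimum black-hole mass for luminosity L, assuming L <= L_Eddington:
# M >= L * sigma_T / (4 pi G m_p c).
import math

G, MSUN = 6.674e-8, 1.989e33
MP, C, SIGMA_T = 1.673e-24, 2.998e10, 6.652e-25

def m_min(L_erg_s):
    return L_erg_s * SIGMA_T / (4.0 * math.pi * G * MP * C) / MSUN

for L in (1.0e44, 1.0e46, 1.0e48):   # Seyfert-like to bright quasar
    print(f"L = {L:.0e} erg/s -> M >= {m_min(L):.1e} M_sun")
```

Luminosities of 10⁴⁶–10⁴⁸ erg s⁻¹ demand masses of 10⁸–10¹⁰ M⊙, which is how the supermassive scale is inferred.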
It is not always easy to identify a quasar with an active galaxy or AGN. There
are several types of active galaxy that appear to be normal in optical imaging, just
a galaxy like any other. But when examined in radio, X-rays, or gamma rays, some
of these emissions can be much larger than one would see from a normal galaxy.
That is, some kind of high-energy process is active in the galaxy, without necessarily
appearing in the optical bands. One of the best-studied examples is shown in Fig. 8.3.
With the appearance and classification of several types of AGN exhibiting different
emissions, the idea gradually emerged that they might actually have the same basic
structure, but observed from different angles due to their random orientations relative
to the line of sight. The construction of this unified model of AGNs took into account
the existence of a supermassive black hole, together with matter in an accretion disk,
and also abundant dust (detected) that would obscure the system in the equatorial
plane. Gas clouds in the inner and outer region were postulated to explain the widths
of the lines (like those in Fig. 8.1). Narrow lines should originate far from the central
region, where the clouds move at low speeds, while broad lines should be produced
in closer regions where the clouds move in short orbits and hence more quickly.
Depending on the inclination of the object’s “equatorial” plane with respect to the line
of sight, we will see different aspects of the active galaxy. In this way, the variety of
types found was unified by astronomers, and generically called AGNs, encompassing
quasars and other “cousins” of a similar nature (such as Seyfert galaxies, blazars,
etc.), but which would ultimately turn out to be the same type of object [4].
Figure 8.4 displays the various types of AGN that needed to be explained. The basic classification begins with the observation of radio emission, which can be weak or absent (the AGN is said to be radio-quiet) or significant (the AGN is said to be radio-loud). The presence or absence of broad or narrow lines also divides the AGNs into categories I and II, with a category 0 indicating the absence of both types (although there may be absorption lines). The Unified Model thus has to contain regions where narrow lines and broad lines are produced, and it has to explain why they appear or do not appear in the various groups.
The physical structures that were postulated to explain each observed characteris-
tic are visualised in Fig. 8.5. Basically, the black hole-accretion disk (fossil) system
occupies the central region or “engine” of the AGN, but is surrounded by a torus of
gas and dust that substantially obscures the system in the equatorial plane, and clouds
of gas in the perpendicular direction, in addition to jets that are mainly responsible for
the radio emission. (Note that the torus of gas and dust should not be confused with
the accretion disk itself, which is almost pointlike on the scale at which the figure
is drawn.) According to this Unified Model, the division between “loud” or “quiet”
then depends on the jets and the angle between the jet axis and the line of sight. A
representation of what is observed in each case for different angles of observation is
shown in Fig. 8.5.
Fig. 8.5 Different structures featuring in the Unified Model of AGNs, and the result of observing
them along different lines of sight [5, Fig. 1]
We should point out that, in addition to the often observed jets, quite direct evi-
dence of the presence of the disk has been obtained, for example, by the instruments
of the Hubble Space Telescope. The regions where the wide and narrow lines should
form are inferred only indirectly, although there is a consensus that this inference
is based on very reasonable assumptions. If the Unified Model is correct, we infer
that the AGNs in which the study of jets is particularly important are the blazars,
since we would be observing their axis directly. Besides being candidates to accel-
erate particles to extreme energies (Chap. 12), jets provide important clues about
the environment, such as the magnetic field, density, and other features, maintaining
collimation over 50 kpc or more.
While the classification of AGNs according to their radio output goes back to
A. Sandage in the 1960s, deeper studies have resulted in a still more refined proposal
[6], based not on observational appearance, but rather on the physical model. An
examination of the energy output throughout the whole electromagnetic spectrum,
its observational biases, and the model behind the emission led to the suggestion that
the actual difference between AGNs, based on their X-ray and γ -ray emission, is
the presence or absence of a jet [6, 7]. (This is quite analogous to the situation of
the supernovae, where the original classification did not say much about the physical
origin of the explosions and progenitors.) This suggestion is based on the fact that
the radio-quiet AGN class is the dominant population, composed of quasars, while
the radio-loud AGNs include a variety of objects emitting non-thermally across the
spectrum. Thus, they should be intrinsically different physically. This identification
is reinforced by the fact that radio-quiet quasars are actually radio-faint rather than radio-silent (that is, the ratio of their radio-to-optical emission is low), but they are genuinely γ-quiet. The idea is that
they lack the physical component that is responsible for the highest energy emission,
while the observed thermal component is attributed to the disk.
Padovani and colleagues suggest that the jet is the main difference between the
two types, and suggest that the classes should be named “jetted” and “non-jetted”
according to whether they have high-energy emission or not. A small set of features,
like the presence of a radio excess away from the known far IR–radio correlation,
may provide a way to differentiate without the old denomination. Nevertheless, the
basic physical picture shown in Fig. 8.5 stands, although the reasons for the presence
of a jet that would put the AGN into the “jetted”, high-energy emitting category are
yet to be clarified.
The presence of quasars in the primordial Universe—the oldest and most distant one
detected was formed when the Universe was about 10% of its present size, at a redshift
greater than 7.5—means we need to think about the nature of the phenomenon in
relation to galaxy formation. According to the most widely accepted model, structure
begins to form in the Universe as soon as the growth of density inhomogeneities
becomes possible around z ∼ 20, and as cosmic time proceeds, the number of quasars is observed to grow very rapidly, reaching a peak around z ≈ 2 and then falling off. Near z = 2, almost 10% of galaxies contain quasars, but that number is almost zero today
(Fig. 8.6).
As an example, our own galaxy hosts a supermassive black hole in the center
Sgr A∗ , but this does not mean that we are living inside an AGN. This BH is said
to be dormant or starved. There is no steady accretion to power it, and only from
time to time a star or cloud is captured, leading to a brief transient event. So, what
is the actual relationship between quasars and galaxies? We have seen that galaxies
that host them are often difficult to observe, and there are cases where the quasar
is not hosted by any galaxy. But if the association is real, it may be that the galaxy
formation process is somehow associated with quasars, as quasars would regulate
the energy released by the star formation that ultimately constitutes the host galaxy.
In fact, when it became possible to study the central regions of galaxies (bulges)
and to estimate the masses of black holes in their centers, an important discovery
was made: the mass of the central black hole is strongly correlated with the velocity
dispersion σ of the stars in the bulges. More precisely, MBH ∝ σ 4 (Fig. 8.7). The
general interpretation of this correlation is that there is a “symbiosis” between the
central black hole and the formation of the inner region, i.e., it looks like the black
hole and the host galaxy formed simultaneously [8].
A very simple idea that lends quantitative support to this MBH –σ correlation can be
obtained by considering the following scenario. Suppose that the radiation pressure
generated in the AGN pushes out on the gas in the center, and suppose that that gas
corresponds to a fraction f of the mass of the bulge, i.e., Mgas = f Mbulge . Using
(7.9) and balancing the pressure force with the gravitational attraction on the gas, we
can write
$$\frac{L_E}{c} = \frac{G M_{\rm bulge}\, f M_{\rm bulge}}{R^2} . \qquad (8.5)$$
One possibility now is to assume that the inner part of the bulge, of radius R, corresponds to an isothermal sphere, for which the mass and the velocity dispersion of the "particles" obey the relation M_bulge = 2Rσ²/G. Thus, M_bulge/R = 2σ²/G, and substituting this in (8.5), we obtain L_E/c = Gf(2σ²/G)² = 4fσ⁴/G. Using L_E = 4πGM_BH m_p c/σ_T from (7.9), we finally have

$$M_{\rm BH} = \frac{\sigma_T f\, \sigma^4}{\pi G^2 m_p} . \qquad (8.6)$$
Although it seems a plausible explanation, (8.6) has been criticized. For example, the inner bulge may not correspond to an isothermal sphere, and it is quite possible that factors other than the radiation pressure contribute to the observed relation MBH ∝ σ⁴ in Fig. 8.7.
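For orientation, a quick numerical sketch of (8.6), taking an illustrative gas fraction f ~ 0.1 (a hypothetical choice, not a fitted value):

```python
# M_BH-sigma scaling, Eq. (8.6): M_BH = sigma_T f sigma^4 / (pi G^2 m_p).
import math

G, MSUN = 6.674e-8, 1.989e33
MP, SIGMA_T = 1.673e-24, 6.652e-25

def m_bh(sigma_kms, f=0.1):
    s = sigma_kms * 1.0e5      # km/s -> cm/s
    return SIGMA_T * f * s**4 / (math.pi * G**2 * MP) / MSUN

for s in (100.0, 200.0, 300.0):
    print(f"sigma = {s:5.0f} km/s -> M_BH ~ {m_bh(s):.1e} M_sun")
```

With f ~ 0.1 the normalization lands near the observed relation (~10⁸ M⊙ at σ = 200 km s⁻¹), which is what makes the argument attractive despite the criticisms above.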
In summary, the largest accreting systems in the Universe, powered by supermas-
sive black holes, formed and evolved from rather high redshifts, greater than 5, and
there are even some with z > 7. This means that the Universe formed these monster
black holes in a relatively short period of time after the Big Bang, something that
until recently was considered impossible, but which already seems viable by direct
collapse of the gas under the primordial conditions, at least up to the scale of 10⁶ M⊙.
The black hole then feeds for billions of years from the ambient accretion, until Ṁ
declines sharply in the local Universe, and the supermassive black hole begins to
“starve”, like the black hole in Sgr A∗ at the center of our galaxy, becoming latent,
i.e., no longer offering strong signs of its presence. It is clear that the AGN will
“vanish” as soon as there is no longer any ambient gas, and this is associated with
the evolution of the host galaxy. Otherwise we would observe many nearby quasars
still “in operation”, and this is not the case. It is hard to obtain accurate statistics for
the AGN population, one reason being that the presence of dust makes observation
difficult, and may still be hiding a fraction of them [7].
References
1. V.M. Slipher, Radial velocity observations of spiral nebulae. The Observatory 40, 304 (1917)
2. E.P. Hubble, Extragalactic nebulae. Astrophys. J. 64, 321 (1926)
3. B. Peterson, An Introduction to Active Galactic Nuclei (Cambridge University Press, New
York, 1997)
4. V. Beckmann, C. Shrader, Active Galactic Nuclei (Wiley-VCH, New York, 2012)
5. C.M. Urry, P. Padovani, Unified schemes for radio-loud active galactic nuclei. ASP Conf. Proc.
107, 803 (1995)
6. P. Padovani, On the two main classes of active galactic nuclei. Nature Astronomy 1, 0194
(2017)
7. P. Padovani et al., Active galactic nuclei: what’s in a name? Astron. Astrophys. Rev. 25, 2
(2017)
8. H. Spinrad, Galaxy Formation and Evolution (Springer, Berlin, 2005)
9. P. Madau, F. Haardt, M.J. Rees, Radiative transfer in a clumpy Universe. III. The nature of
cosmological ionizing sources. Astrophys. J. 514, 648 (1999). https://doi.org/10.1086/306975
10. N.J. McConnell, C.-P. Ma, Revisiting the scaling relations of black hole masses and host galaxy
properties. Astrophys. J. 764, 184 (2013). https://doi.org/10.1088/0004-637X/764/2/184
Chapter 9
Neutrino Astrophysics
The construction of a viable model for the weak interactions and the subsequent
discovery of neutrinos postulated by W. Pauli was presented in Chap. 1. We saw that
conversions between neutrons and protons—beta decay and inverse beta decay—
involve the emission of neutrinos and antineutrinos. As these processes are common
in a variety of high energy situations, efforts were made to detect and use them for
studies in Astrophysics. Although typical neutrino cross-sections are 20 orders of
magnitude smaller than those of electromagnetic interactions, we will see that this
possibility, raised in the early 20th century, led to the current success of the nascent
neutrino Astrophysics [1].
Several decades after the original Pauli hypothesis, the existence of 3 types of
neutrinos (and their respective antineutrinos) had been confirmed in the laboratory,
with one for each generation of the Standard Model: νe , νμ , and ντ . This became
relevant for our view of the neutrino detection problem: since only protons and
neutrons are involved at low energies, we should expect to detect mainly electron
neutrinos and antineutrinos (νe and ν̄e ). This is no longer true if the temperatures and
densities are very high, in which case all three types may be emitted. We will see
that this is the case in the gravitational collapse of supernovae.
The fundamental quantity we need to know when trying to detect neutrinos is the cross-section, which is a measure of the effective area over which a neutrino interacts with matter. The phenomena of beta and inverse beta decay suggest that appropriate "targets" should be protons and neutrons, although electron scattering may also be relevant.
Let us suppose that a neutrino collides with a fixed target, for example, a proton. Within the black "dot" of Fig. 9.1, the weak interaction, which is actually mediated by one of the W± bosons of Chap. 1, contributes a factor of G_F². Then, since G_F ∝ (energy)⁻², and we have already seen in Chap. 1 that, when we throw a projectile against a target, the length probed satisfies (length) = (energy)⁻¹ as in (1.3), the only
Fig. 9.1 Neutron decay in Fermi’s phenomenological theory. The process of emitting a W− boson
with subsequent decay of the latter into e− + ν̄e is “summarized” by the black dot
where σ0 ≈ 10⁻⁴⁴ cm², almost 20 orders of magnitude smaller than the electromagnetic cross-section given in (2.6). The millibarn unit, ≡ 10⁻²⁷ cm², normally used in scattering problems, is in fact far too large for the purposes of neutrino Astrophysics. This justifies the name "weak interactions". For the design of experiments it is important to know beforehand that the cross-section, and with it the rate of events, increases with the energy of the incident neutrinos: the more energetic neutrinos are the ones with the greater probability of interacting.
One can identify two basic types of interaction that can lead to detection. The first is an interaction in which a neutrino provokes a reaction, and with it a change in some chemical element. An example of this type of reaction is

$$\nu_e + {}^{37}\mathrm{Cl} \rightarrow e^- + {}^{37}\mathrm{Ar} . \qquad (9.3)$$
Here, a weak interaction converts a neutron bound in the chlorine nucleus into a proton, so the "daughter" atom after the reaction is chemically different and can in principle be separated. The ³⁷Ar count then contains information about the number of neutrinos that have reacted. The second type would be a scattering reaction like

$$\nu_e + e^- \rightarrow \nu_e + e^- . \qquad (9.4)$$
Here it is the charged particle (electron) that can be detected, for example, by the
Čerenkov radiation it produces when travelling in water or another liquid. The
energy and other parameters of the neutrinos can be obtained by measuring this
radiation.
Figure 9.2 shows a realistic estimate of the neutrino cross-section expected from
various sources for which the energy E ν can be reasonably well estimated. From the
“relic” Big Bang neutrinos (analogous to the photons in cosmic background radiation)
to the end of the spectrum, including a possible resonance due to the finite mass of
the neutrino (see below), the predictions yield very low values, albeit increasing with
energy. In the figure we can observe the change in slope corresponding to the “low
energy” → “high energy” transition between the limits of (9.1) and (9.2) around
E₀ ≈ 10⁵ MeV.
From elementary kinetic theory we can calculate the rate of events in a situation like the incidence of neutrinos onto nucleons, which constitutes a typical fixed-target experiment. If we consider an incident neutrino beam with density n_ν covering an area A, the flux onto the targets will be Φ = n_ν A c. These neutrinos impinge on the target, where the density of nucleons is n_N and the thickness is l. The number of events will be proportional to the flux and the cross-section σ, so the total number of events expected per unit time is

$$N_t = \sigma\, n_\nu\, n_N\, A\, c\, l . \qquad (9.5)$$
Now, a source can produce a continuous flux (constant in time) or a pulsed flux
(limited duration). The Sun corresponds to the first type of source, where one can
integrate over long times to calculate fluxes and other parameters of the incident
neutrinos and the source. The collapse supernovae discussed in Chap. 5 are of the
second type: they produce a “burst” of neutrinos in a few seconds with an enormous
luminosity, after which the flux drops to undetectable levels. We will discuss these
two cases below, as they correspond to the two best known sources among those
detected and studied.
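As a toy illustration of (9.5) (our numbers, chosen for round orders of magnitude, not the parameters of any real detector):

```python
# Event rate from Eq. (9.5): N_t = sigma * n_nu * n_N * A * c * l,
# i.e. (incident flux n_nu * A * c) x (target column n_N * l) x sigma.
flux = 7.0e10        # total solar-neutrino flux n_nu * c [cm^-2 s^-1]
sigma = 1.0e-45      # illustrative low-energy cross-section [cm^2]
n_N = 1.0e23         # target atoms per cm^3
A, l = 1.0e4, 1.0e3  # target area [cm^2] and thickness [cm]

n_targets = n_N * A * l          # 10^30 atoms in this toy target
rate = sigma * flux * n_targets  # events per second
print(f"{rate:.1e} events/s ~ {rate * 86400.0:.1f} events/day")
```

Even with ~10³⁰ target atoms bathed in the full solar flux, the expected rate is only a few events per day, which conveys the scale of the experimental challenge (real detection thresholds cut most of this flux further).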
After World War II, detection techniques and related technologies had advanced to
the point where it became feasible to think about experiments to detect neutrinos
directly. Theoretical ideas had also been consolidated, and astrophysical estimates
had improved greatly, converging to more stable values with a consequent gain in
confidence on the part of experimental physicists who were contemplating building
detectors.
As discussed in Chap. 4, the basic nuclear reactions of the p–p cycle lead to the
production of neutrinos in several of their stages, either with fixed energies (“monoen-
ergetic”) or distributions that have a maximum of energy due to kinematic restric-
tions (“continuous”). Figure 9.3 shows the stages of the reactions where neutrinos
are emitted and escape from the Sun.
Given a complete model of the Sun, one can also calculate the fluxes Φ_i emitted by each reaction, in addition to the total flux [4]. Adding all the fluxes shown in Fig. 9.4, the total is approximately 7 × 10¹⁰ neutrinos cm⁻² s⁻¹. This number is gigantic: more
than 100 billion neutrinos pass through a human hand held out toward the Sun every
second! But the tiny cross-section is responsible for essentially all of them going
straight through, without interacting with the matter in the hand. Given these facts,
the obvious question is: how could one possibly detect these neutrinos in order to
measure the individual spectra of Fig. 9.4 and the total emission of the Sun?
Fig. 9.5 Davis's experiment at the Homestake mine (left) and the results obtained there up to 1994 (right) [7] © AAS. Reproduced with permission. The short vertical segment on the right is the average of these 24 years of operation, equal to about 1/3 of the theoretical expectation
One of the pioneers in this area was Ray Davis Jr., who was the first to consider
building an experiment that could measure solar neutrinos. Davis relied on the
increasingly sophisticated calculations of fluxes due to John Bahcall and others. An
example of the latest predictions is shown in Fig. 9.4. Note that the solar neutrino
unit (SNU) was defined for convenience: it is equal to the neutrino flux producing
$10^{-36}$ captures per target atom per second.
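As an illustration of the unit, the dominant $^8$B contribution to the chlorine experiment can be estimated in one line; the spectrum-averaged cross-section used below is a rough representative value, not an official number:

```python
# Converting a flux and cross-section into SNU
# (1 SNU = 1e-36 captures per target atom per second).

phi_8B   = 5.0e6    # 8B neutrino flux at Earth [cm^-2 s^-1], standard-solar-model scale
sigma_Cl = 1.1e-42  # assumed spectrum-averaged capture cross-section on 37Cl [cm^2]

rate = phi_8B * sigma_Cl          # captures per target atom per second
print(f"{rate:.2e} captures/atom/s = {rate / 1e-36:.1f} SNU")
# -> about 5.5 SNU, i.e., most of the ~8 SNU predicted for the chlorine experiment
```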
Figure 9.5 shows that, year after year, the experimental count fell short of the predictions
of solar models. Calculations indicated that some 8 SNU should be measured
due to neutrinos with energies above 0.814 MeV, the detection threshold
that made it possible to see the monoenergetic flux of $^7$Be neutrinos. However,
the average of the measurements taken over the 24 years of operation was $2.56 \pm 0.16$
(systematic) $\pm 0.16$ (statistical) SNU. Evidently, this contradiction between theory
and experiment cried out for an explanation.
A new series of experiments was planned and executed to clarify the situation.
The most important ones, sensitive to low-energy neutrinos, were the SAGE and
GALLEX collaborations, which used gallium tanks to measure neutrinos produced
in reactions such as
$$\nu_e + {}^{71}{\rm Ga} \to e^- + {}^{71}{\rm Ge} \, . \qquad (9.6)$$
Because of the characteristics of gallium in the target compound (GaCl$_3$–HCl), the
detection threshold ($E_\nu \approx 0.23$ MeV) is much lower than the one for $^{37}$Cl in (9.3).
Thus, the neutrinos of the first p–p reaction, i.e., $p + p \to d + e^+ + \nu_e$, whose
continuous spectrum extends up to $E_\nu \approx 0.42$ MeV, could be
detected. For the first time, the SAGE and GALLEX
experiments were able to see one of the main branches of the p–p chain responsible
for most of the energy production in the solar interior, instead of the single energy
neutrinos from the beryllium reaction (Fig. 9.4). In fact, the prediction of theoretical
models was 69.6 SNU, an order of magnitude greater than the prediction for beryllium
neutrinos.
Measurements were made for several years before it was finally announced that SAGE
observed a fraction $0.517^{+0.042}_{-0.044}$ (systematic) $^{+0.055}_{-0.053}$ (statistical) of the
prediction, while GALLEX communicated $0.601^{+0.059}_{-0.060}$ (total errors). Not only did
the two experiments have consistent results with slightly more than half of the predic-
tion, but as they had been “calibrated” by observing neutrinos from nuclear reactors,
whose flux was precisely known and where it was established that they could pick up
0.95 ± 0.05 of the total, it became clear that something was wrong with the neutrinos
arriving from the Sun [8].
Supplementing these experiments, other collaborations performed neutrino mea-
surements at higher energies. Although less numerous than neutrinos produced in
the p–p reaction, these provide a key element for assembling the puzzle. Before
resuming our description of the quest for the Sun’s total neutrino budget, it is impor-
tant to note that measurements of particular neutrino rates can tell us much about
the fusion processes going on inside the solar core. We may cite two outstanding
examples provided by the BOREXINO Collaboration. The first in 2014 reported
measurement of the p–p reaction [9], the fundamental process inside the Sun, giv-
ing excellent agreement with solar model predictions. And very recently, the same
group reported an accurate measurement of neutrinos produced in the CNO cycle
inside the Sun (Fig. 4.10), finding a rate corresponding to around 1% of the whole
energy budget [10]; the CNO contribution is, however, expected to grow rapidly and dominate
energy generation in the upper Main Sequence ($M \geq 2\,M_\odot$). It is important to note the degree of accuracy
and sensitivity of experiments like this, which allow us to “see” the interior of the
Sun and confirm to some extent the nuclear Physics inputs in our theories of stellar
structure and evolution.
The most relevant experiments are shown in Fig. 9.6. The first is the Sudbury
Neutrino Observatory (SNO), a huge tank that brought together (on loan) the totality
of Canada’s heavy water. Heavy water has deuterium rather than hydrogen in its
composition, and deuterium supports reactions allowing us to detect not only electron
neutrinos νe , but also neutrinos associated with the muon and the tau. The basic
detectable reactions in heavy water are
Fig. 9.6 The Sudbury Neutrino Observatory (SNO) [11] (left) and Super-Kamiokande [12] (right)
experiments. Note the size of these installations, as evidenced by the human operators and the
inflatable boat. Credits: MIT group at SNO and Kamioka Observatory, Institute for Cosmic Ray
Research, University of Tokyo, respectively
$$\nu_e + d \to p + p + e^- \, , \qquad (9.7)$$
$$\nu_x + d \to p + n + \nu_x \, , \qquad (9.8)$$
$$\nu_x + e^- \to \nu_x + e^- \, , \qquad (9.9)$$
where $\nu_x$ symbolizes either of the neutrinos $\nu_\mu$ and $\nu_\tau$. These reactions are mediated
by the charged W and neutral Z bosons of the weak interactions (Chap. 1), while
the third reaction (9.9) is of the elastic-scattering type exemplified by (9.3). The
advantage of heavy water is that it sees all neutrinos, whence the total flux and partial
contributions can be measured, although only at higher energies.
The experiment shown on the right in Fig. 9.6 is Super-Kamiokande, a cylindrical
tank containing 50 000 tons of pure water, surrounded by photomultipliers (visible
in the image). This configuration is sensitive only to electron neutrinos, but with two
important features: it can detect Čerenkov radiation and hence determine the energy
in the reactions, while the photomultipliers determine the direction of the incident
neutrino with reasonable accuracy.
These experiments could check the feasibility of the simplest (“astrophysical”)
solution to the solar neutrino problem: a Sun with a slightly cooler temperature at
the center. The flux of the $^7$Be neutrinos, which were the target of the experiment by
Davis that revealed the discrepancy, is proportional to a high power of the central
solar temperature, viz., $T_C^{22}$, so a reduction of only 4% in the central temperature
would have been enough to explain the reduced rate originally detected. However,
Super-Kamiokande measurements of the neutrino spectrum for the elastic scattering
reaction shown in Fig. 9.7 proved that most of the “missing” neutrinos were those
of lower energy, whereas a hypothetical decrease in temperature would lead to the
immediate disappearance of the higher energy neutrinos (those at the tail of the
distribution).

Fig. 9.7 Measurement of the neutrino spectrum [13] compared to the standard solar model (SSM)
prediction (upper discrete bars), showing the lack of lower energy neutrinos and the gradual convergence
between theory and experiment up to around 15 MeV. The so-called “astrophysical” solution
was discarded because lowering the temperature of the interior by 4% would produce the opposite
behavior: the missing neutrinos would be the ones with the highest energies

Thus, the astrophysical solution was discarded, and suspicions regarding
the behavior of the neutrinos themselves increased. As an additional contribution, the
SNO measurements showed that the sum of the fluxes of the three neutrino types coin-
cided with the total expected flux, while Super-Kamiokande measured 0.544 ± 0.037
(systematic) ±0.064 (statistical) for the electron neutrinos, i.e., almost half of the
expected flux in the latter was missing.
Following these announcements by Super-Kamiokande and SNO, the whole com-
munity agreed that the problem of the lack of neutrinos did not reside in the Sun,
but rather in their “disappearance” on their way to the detectors. These suspicions
had already been raised by theoretical physicists: while there are reasons for arguing
that the photon mass should be zero, there is no fundamental reason for the neutrino
mass to be zero. The experimental fact that it needs to be much smaller than the
mass of other known particles (electrons, quarks) does not necessarily indicate that
it should be zero, although it would be desirable to find some mechanism or reason
to explain this hierarchy. Thus, two decades before the first results were obtained for
the neutrino flux, B. Pontecorvo [14] suggested that oscillation might be possible
between neutrinos of different types, by analogy with the case of kaons, which also
oscillate between two states, as a mechanism that would change the emerging flux.
The concept of oscillation between two or more states is not difficult to understand.
In Quantum Mechanics, particles are characterized by conveniently labeled states.
Technically, a state is a vector, analogous to the well known position vectors in three-
dimensional space, although it belongs to a Hilbert space of infinite dimensions. In
the same way that any vector has components relative to a set of axes, for example
$\mathbf{A} = a\,\hat{\mathbf{x}} + b\,\hat{\mathbf{y}}$ in two dimensions, any state vector can be expressed in terms of a basis
of state vectors, called eigenstates. A real electron or muon neutrino (ignoring the
tau for the moment) will be a linear combination of the mass eigenstates $|\nu_1\rangle$ and $|\nu_2\rangle$
with masses $m_1$ and $m_2$:
$$|\nu_e\rangle = \cos\theta\,|\nu_1\rangle + \sin\theta\,|\nu_2\rangle \, , \qquad |\nu_\mu\rangle = -\sin\theta\,|\nu_1\rangle + \cos\theta\,|\nu_2\rangle \, ,$$
where $\theta$ is the mixing angle.
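The practical consequence of this mixing is an oscillating survival probability. In the standard two-flavor vacuum case, $P(\nu_e \to \nu_e) = 1 - \sin^2 2\theta\, \sin^2\!\left(1.27\, \Delta m^2[{\rm eV}^2]\, L[{\rm km}]/E[{\rm GeV}]\right)$, which the short sketch below evaluates for parameter values representative of the measurements discussed here (the exact numbers are illustrative):

```python
import math

def p_survival(L_km, E_GeV, dm2_eV2, sin2_2theta):
    """Two-flavor vacuum survival probability P(nu_e -> nu_e)."""
    phase = 1.27 * dm2_eV2 * L_km / E_GeV
    return 1.0 - sin2_2theta * math.sin(phase) ** 2

# Representative "solar" parameters (order of magnitude of the measured values)
dm2, s2t = 7.5e-5, 0.85
for L in (0, 30, 60, 90, 120, 180):                      # baseline in km
    print(f"L = {L:4d} km:  P_ee = {p_survival(L, 4e-3, dm2, s2t):.2f}")
# A reactor antineutrino of ~4 MeV oscillates appreciably over ~100 km baselines,
# which is how KamLAND (Fig. 9.8) measured the oscillation length.
```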
Fig. 9.8 Left: Detection of neutrino oscillations on Earth, with the associated determination of the
oscillation length [17]. Right: Measurements of the oscillation probability compared to predictions
(solid line) based on previously determined parameters [18]
Such experiments (Fig. 9.8) measure the oscillation parameters and thereby constrain the
masses of the participating neutrinos. The results show that these masses are of the
order of $10^{-2}$ eV, and that there are several possibilities for the mixing angle. But the
fact is that the Sun does indeed work in the way proposed by our theories of Stellar
Evolution, and that the origin of the lack of neutrinos stems from the properties of
the neutrinos themselves.
One last point is that it is not yet possible to state whether the mass hierarchy is
“normal” ($m_1 < m_2 < m_3$) or “inverted” ($m_3 < m_1 < m_2$). This will be important
if we are to build a viable model of massive neutrinos. The Hyper-Kamiokande
experiment (in progress [20]) and DUNE (see below) may collect enough data to
answer this question in a few years.
9.3 Neutrino Sources: Supernova 1987A

As we saw in Chap. 5, the sequence of events that leads to the implosion and subsequent
explosion of massive stars has as its fundamental protagonist the “Fe” core
produced during the previous stages. The shock developed by the fall of matter onto
the hardened region that exceeds the nuclear saturation density is formed at the sonic
point and advances later, but may not be the cause of the explosion, because of the
tremendous losses it suffers on its way to the edge of the core. However, throughout
this process, the degenerate core composed of nuclei and electrons is transformed
into what we call a neutron star. For this to happen, compaction from the original
configuration to the final one is mandatory, passing through an intermediate stage (the
proto-neutron star) where the energy difference that needs to be radiated is emitted
by the compact object (Fig. 9.9).
The origin and temporal sequence of the neutrinos in the collapse deserve to be
highlighted. Matter is neutronized in the implosion (left of Fig. 9.9), continuing
throughout the collapse, the electrons being “forced” to combine with protons in the
inverse β reaction
$$p^+ + e^- \to \nu_e + n \, . \qquad (9.13)$$

Fig. 9.9 The implosion of the “Fe” core of a massive star is fast and neutronizes all its contents,
dissolving the nuclear structure (left). A shock forms at the sonic point of the core and advances up
to 100–200 km before losses bring it to a halt (gray region in the center). The proto-neutron star has
a greater radius than it will have when the neutrinos have just left, possibly re-energizing the shock
on diffusion scales (1–2 s, right). These neutrinos, which carry the binding energy, are detectable
on Earth
This process happens first with protons bound in nuclei, which we can write
as $e^- + (A, Z) \to (A, Z-1) + \nu_e$. Electrons are captured and form neutrons, but the nuclei
also disappear when the compression is sufficient—which is basically what it means
to speak of the nuclear saturation density ρ0 . When the matter reaches this density,
the central region suddenly stiffens and the falling envelope “bounces” on it, although
we do not know the final sequence of events that leads to ejection of the envelope.
Regardless of the details, the mass of the original iron core, which had a radius of
about 1500 km, is compressed into a sphere of tiny radius, about 30–40 km, but with a
huge content of thermal energy, dissipated as a direct product of its implosion. Thus,
in order to reach a stable configuration, the neutron sphere must radiate (i.e., get rid
of) this excess thermal energy. However, instead of doing so by means of photons
from the surface, the associated temperatures are so high that it is much more effective
to emit neutrinos directly from the interior. These neutrinos are not the same as those
that came out in the “burst” of neutronization while implosion occurred, described
in (9.13), but are rather the product of particle–antiparticle annihilations of the type
e+ + e− → ν + ν̄ and similar processes such as neutron–neutron bremsstrahlung
n + n → n + n + ν + ν̄ later on. Due to their origin, the neutrinos to be radiated are
sometimes called thermal neutrinos [21].
As stated above, the neutrinos should take away most of the binding energy, that
is, the difference
$$\left| E_{\rm Fe} - E_{\rm NS} \right| \approx 10^{53}\ {\rm erg} \, , \qquad (9.14)$$
and they have to flow through the dense environment that interacts with them, putting them in the
diffusion regime. It is reasonable to assume that the associated luminosity is $L_\nu \propto R_\nu^2 T_\nu^4$,
that is, it behaves as a neutrino black body. The fact is that, while the neutrinos leak
out in a few seconds, the proto-neutron star adopts its final configuration which will
remain unchanged for many millions of years. Thus, purely theoretical considerations
point to a burst of neutrinos lasting just a few seconds as the signature of the
supernova “core”, from which no other signal except this could arrive (apart from
gravitational waves, if they are produced; see the next Chapter). In the supernova
core, the neutrinos themselves are degenerate, since although the cross-section is very
small, the density grows enormously and prevents them from escaping freely. Thus,
the neutrinos are expected to be distributed according to the Fermi–Dirac function
shown in Fig. 9.10.
In February 1987, a supernova in the Large Magellanic Cloud was identified in
an observation beginning at 24.06 UT by astronomer Ian Shelton (Las Campanas
Observatory, Chile). According to conventions, it received the name SN 1987A. The
progenitor star (identified with the catalog name Sanduleak −69° 202) was compared
to the predictions of the theory of Stellar Evolution. The star must have remained
in the Main Sequence for $10^7$ yr, exhausted the hydrogen in its core some 700 000
years ago, passed through a Cepheid stage, and established itself in the helium Main
Sequence 650 000 years ago, then leaving it 45 000 years ago and continuing along
its accelerated track: carbon ignition 10 000 years ago, neon in 1971, oxygen in 1983,
silicon on February 13, 1987, before finally exploding 10 days later. Its effective
temperature for most of its existence was over 30 000 K. This was the closest supernova
in almost 400 years, and it allowed a detailed study of the light curve and other characteristics.
However, the most important feature was that, for the first time in history,
neutrino detectors were operating and were able to contribute in a remarkable way
to our knowledge and understanding of the collapse.

Fig. 9.10 The Fermi–Dirac distribution. All possible energy states are filled up to the maximum (the
Fermi level $E_F$) for $T_1 = 0$, resulting in the abrupt step shown with dashed lines. For a finite $T_2$, there
is room in energy states of order $k_B T_2$ near the Fermi surface (solid line), and neutrinos should be
emitted with energies of this order, since reactions are only possible if the participants are not fully degenerate
Several detectors in operation at the time (Kamiokande, Baksan, IMB) obtained
clear evidence of the existence of a neutrino burst in their data, prior to the optical
detection time of the explosion—the neutrinos precede the shock breakout by sev-
eral hours, because the latter needs this long to reach the surface, while neutrinos
escape from the core immediately. However, the distance to the Large Magellanic
Cloud conspired against the production of an intense signal, and only 21 neutrinos
were definitely associated with the event (Fig. 9.11), these detections lying inside a
cone with an aperture of a few degrees around the reconstructed axis, which
demonstrated that the source of the neutrinos was SN 1987A. Thus an attempt
was made to “reconstruct” the physical characteristics of the supernova, since the
temperature, emitted energy, and other important quantities could be inferred from
these data.
Under the hypothesis of neutrino black-body emission, there is a simple relationship
between the average energy of the emitted neutrinos and the temperature at the
source, which can be written as
$$\langle \epsilon \rangle_{\rm source} = \frac{\int_0^\infty \epsilon^3 f \, d\epsilon}{\int_0^\infty \epsilon^2 f \, d\epsilon} = T \, \frac{F_3(0)}{F_2(0)} = 3.15\, T \, , \qquad (9.15)$$
where $f$ is the Fermi–Dirac distribution and $F_n(0)$ denotes the Fermi integral of order $n$ for zero degeneracy parameter.
In a detector with energy threshold $H$ and efficiency $W(\epsilon)$, the measured average energy involves instead truncated, efficiency-weighted integrals, so that
$$\langle \epsilon \rangle_{\rm det} = T \, \frac{G_5(H/T)}{G_4(H/T)} \, , \qquad (9.16)$$
where $G_4(H/T)$ and $G_5(H/T)$ are truncated Fermi functions (also called Fermi
functions of the second kind) which are easily calculable numerically. The detector
efficiencies of Kamiokande (Japan) and IMB (USA) can be modeled by the functions
$W_{\rm K} = 1 - 4.9 \exp(-\epsilon/3.6\ {\rm MeV})$ and $W_{\rm IMB} = 1 - 3 \exp(-\epsilon/16\ {\rm MeV})$, with
the minimum threshold values $H_{\rm K} = 7$ MeV and $H_{\rm IMB} = 20$ MeV, respectively, i.e.,
because of their construction it is not possible to detect neutrinos with energies lower
than $H_{\rm K}$ or $H_{\rm IMB}$.
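The numerical factor 3.15 in (9.15) can be verified directly. A minimal Python check (using scipy for the quadrature, a convenience choice not tied to anything in the text):

```python
# Numerical check of Eq. (9.15): <e> = T * F3(0)/F2(0) = 3.15 T for a Fermi-Dirac
# spectrum with zero degeneracy parameter.
from scipy.integrate import quad
import numpy as np

def fermi_integral(n):
    """F_n(0) = integral_0^inf x^n / (exp(x) + 1) dx (the tail beyond 100 is negligible)."""
    return quad(lambda x: x**n / (np.exp(x) + 1.0), 0.0, 100.0)[0]

F2, F3 = fermi_integral(2), fermi_integral(3)
print(f"F2(0) = {F2:.4f}, F3(0) = {F3:.4f}, ratio = {F3 / F2:.3f}")   # ratio ~ 3.151
```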
Now the simplest procedure is to calculate the average energy directly from the
data, $\langle \epsilon \rangle_{\rm det} \equiv \sum_i \epsilon_i / N_\nu^{\rm det}$, and then solve (9.16) to find the temperature $T$ numerically;
this is the temperature of the emitting neutrinosphere $T_\nu$. With this temperature, one
can then calculate the average energy at the source $\langle \epsilon \rangle_{\rm source}$ using (9.15), obtaining
the total emitted energy from the source as
$$E_{\rm source} = 0.77 \times 10^{53} \left(\frac{D}{50\ {\rm kpc}}\right)^{2} \frac{G_5\, F_3(0)}{G_4^2}\, \frac{N_\nu^{\rm det}}{\langle \epsilon \rangle_{\rm det}\, M_{\rm det}}\ {\rm erg} \, . \qquad (9.17)$$
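This inversion is easy to set up numerically. The sketch below assumes the detected average energy obeys (9.16) and uses the Kamiokande efficiency model quoted above; the "data" value of $\langle\epsilon\rangle_{\rm det}$ is a made-up illustrative number:

```python
# Sketch of the temperature inference: given the average detected energy, invert
# <e>_det = T * G5(H/T)/G4(H/T) (Eq. 9.16) for the neutrinosphere temperature T.
from scipy.integrate import quad
from scipy.optimize import brentq
import numpy as np

H_K = 7.0                                               # Kamiokande threshold [MeV]
W_K = lambda e: max(0.0, 1.0 - 4.9 * np.exp(-e / 3.6))  # efficiency model from the text

def G(n, T, H=H_K, W=W_K):
    """Truncated, efficiency-weighted Fermi integral in the variable x = e/T."""
    return quad(lambda x: x**n * W(x * T) / (np.exp(x) + 1.0), H / T, 100.0)[0]

def mean_detected_energy(T):
    return T * G(5, T) / G(4, T)

e_mean_data = 15.0                                      # illustrative <e>_det [MeV]
T = brentq(lambda T: mean_detected_energy(T) - e_mean_data, 1.0, 10.0)
print(f"T_nu = {T:.2f} MeV, so <e>_source = 3.15 T = {3.15 * T:.1f} MeV")
```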
Another quantity that is directly comparable with the data is the cumulative event
rate
$$\frac{dN_\nu^{\rm det}}{dt} = 5.2 \times 10^{-8}\, M_{\rm det}\, C \left(\frac{50\ {\rm kpc}}{D}\right)^{2} \left(\frac{T}{\rm MeV}\right)^{5} G_4 \, . \qquad (9.18)$$
The predictions of (9.18) for the Kamiokande [23] and IMB [22] detectors
came out to be quite reasonable when compared with the actual accumu-
lated event detections, although they depend on the existence of convection inside
the proto-neutron star model, which increases the rate by producing a higher effec-
tive temperature in the neutrinosphere. Although today the work of modeling events
points to a complexity that makes simple prediction difficult due to the effects of
instabilities discussed in Chap. 5, the simplest picture agrees to within an order of
magnitude with what is observed.
One of the most important parameters determined in this analysis was the
neutrinosphere temperature $T_\nu = 4.2^{+1.2}_{-0.8}$ MeV, in good agreement with the simplest
models. The temporal evolution of this temperature, and even the possibility of a
sequential emission with a “hiatus” of several seconds, cannot be discarded [24],
and it may provide evidence for more complex Physics [25]. Finally, the total radi-
ated energy at the source has been compared with the theoretical binding energy for
several equations of state proposed for neutron matter, finding consistency, but also
leaving undetermined the kind of composition that could have led to the detected
events. Note, however, that (9.17) only takes into account the detection of electron
antineutrinos, since the electron neutrinos and the muon and tau pairs have very small
cross-sections. Thus, the result of (9.17) is usually multiplied by 6 to estimate the
total energy carried equally among the 6 types of neutrino produced (e, μ, and τ
neutrinos and their respective antineutrinos).
The total radiated energy was $(2.5 \pm 1) \times 10^{53}$ erg [26], assuming the formation of
a neutron star of mass $1.4\,M_\odot$, which may be underestimated, as we saw in Chap. 6.
The range of measured masses of neutron stars is today much wider thanks to the
measurements of several systems, something that was not known in 1987 (Chap. 5).
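It is instructive to check that these numbers hang together. The sketch below propagates the quoted total energy and temperature into an expected event count for a Kamiokande-sized water detector; the inverse-beta-decay cross-section is the usual low-energy approximation, and the detector mass and perfect efficiency are rough assumptions:

```python
# Consistency check of the SN 1987A numbers quoted in the text.
import math

E_nuebar = 2.5e53 / 6.0           # energy in electron antineutrinos alone [erg]
T_nu     = 4.2                    # neutrinosphere temperature [MeV]
E_mean   = 3.15 * T_nu            # mean energy, Eq. (9.15) [MeV]
erg_per_MeV = 1.602e-6

N_nu    = E_nuebar / (E_mean * erg_per_MeV)       # number of anti-nu_e emitted
D       = 50.0 * 3.086e21                         # 50 kpc in cm
fluence = N_nu / (4.0 * math.pi * D**2)           # [cm^-2]

sigma = 9.5e-44 * E_mean**2                       # IBD: sigma ~ 9.5e-44 (E/MeV)^2 cm^2
M_det = 2.14e9                                    # ~2.14 kton of water, in grams
N_p   = (M_det / 18.0) * 6.02e23 * 2.0            # free (hydrogen) protons in the water

print(f"N_nu ~ {N_nu:.1e}, fluence ~ {fluence:.1e} /cm^2")
print(f"expected events ~ {fluence * sigma * N_p:.0f} (about a dozen were seen)")
```

The estimate lands within a factor of order unity of the actual counts, a reassuring check of the simple black-body picture.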
The explosion of SN1987A was a fundamental event for neutrino Astronomy,
since until then only solar neutrinos had been unequivocally detected. As we have
seen, the supernova produced important information about one of the most extreme
phenomena in the present Universe. Had a supernova happened inside our own galaxy
at a distance $D \sim 1$ kpc, it would have allowed the experiments to register some
$20 \times (50\ {\rm kpc}/D)^2$ neutrinos, something in the range of 1000–2000, with enormous
gains in understanding the mechanism of gravitational collapse. It should be noted
that, although it allowed an insight into many fundamental details of the theory under-
lying this kind of collapse and explosion, we were unable to identify the explosion
mechanism itself. And despite numerous subsequent observation attempts, the compact
object born in the SN 1987A explosion, i.e., the neutron star resulting from the
transformation of the iron core, which was the source of the observed neutrinos,
was never observed, adding a dose of mystery to this problem and in fact leading some
researchers to suggest the late formation of a black hole. However, there are recent
good indications that a “hot spot” lies in the middle of the SN 1987A remnant, as sug-
gested by ALMA, NuSTAR, and Chandra data. In particular, a non-thermal emission
in the range 10–20 keV has been reported by Greco et al. [27] and tentatively asso-
ciated with a pulsar-wind nebula powered by a young object. If confirmed, this will
be the first fully recorded birth of a neutron star, from the explosion itself to infancy.
Fig. 9.12 IceCube Čerenkov detector located in Antarctica, which has been operating since 2010.
The strings are buried right through the ice to the bedrock, almost 3000 m below. The array itself
is located at 1450 m and below. DeepCore's strings are much closer together, with high-efficiency
photomultipliers inserted in the central part [28]. Credit: Felipe Pedreros, IceCube/NSF
Finally, we remark that new facilities are being built and operated for the obser-
vation of astrophysical, cosmological, and geophysical neutrinos, the aim being to
investigate processes that end up producing neutrinos in the final state and also to
better understand the neutrino masses and the Physics of their interactions.
One good example of these advanced facilities is the IceCube Neutrino Obser-
vatory (Fig. 9.12), located near the South Pole. The pristine Antarctic ice generates
Čerenkov radiation when neutrinos pass through it, and strings of photomultipliers
detect this radiation as described in Chap. 2. This experimental array detected neu-
trinos with PeV energies and can seek point sources and temporal coincidences with
other phenomena (GRBs, for example).
Another important initiative devised to study the Physics of neutrinos is the Deep
Underground Neutrino Experiment (DUNE) in the USA. The experiment will use
a neutrino beam generated by the Fermilab proton accelerator facility and measure
particle interactions near the source, while a second detector located at a distance
of 1300 km in a South Dakota underground site (Sanford Underground Research
Facility) will measure the beam again to determine, among other things, the neutrino
masses and hierarchy (that is, the relation between the masses of the three known
flavors). It is clear that neutrino research will remain a key part of high energy
Astrophysics for years to come.
References
1. E. Waxman, Neutrino Astrophysics: A new tool for exploring the Universe. Science 315, 63
(2007)
2. T.D. Lee, Symmetries, Asymmetries, and the World of Particles (Jessie & John Danz Lectures)
(University of Washington Press, Seattle, 1987)
3. J. Bahcall, Neutrino Astrophysics (Cambridge University Press, Cambridge, UK, 1989)
4. J. Bahcall, A. Serenelli, S. Basu, New solar opacities, abundances, helioseismology, and neu-
trino fluxes. Astrophys. J. Lett. 621, L85 (2005)
5. N. Vinyoles et al., A new generation of standard solar models. Astrophys. J. 835, 202 (2017)
6. The Borexino Collaboration, Comprehensive measurement of pp-chain solar neutrinos. Nature
562, 505–510 (2018)
7. B.T. Cleveland et al., Measurement of the solar electron neutrino flux with the Homestake
chlorine detector. Astrophys. J. 496, 505 (1998). https://doi.org/10.1086/305343
8. M.F. Altmann, R. Mossbauer, L.J.N. Oberauer, Solar neutrinos. Rep. Prog. Phys. 64, 97 (2001)
9. BOREXINO Collaboration, Neutrinos from the primary proton–proton fusion process in the
Sun. Nature 512, 383 (2014)
10. BOREXINO Collaboration, Experimental evidence of neutrinos produced in the CNO fusion
cycle in the Sun. Nature 587, 577 (2020)
11. http://web.mit.edu/josephf/www/nudm/SNO.html
12. http://www-sk.icrr.u-tokyo.ac.jp/index-e.html
13. J. Hosaka et al., Three flavor neutrino oscillation analysis of atmospheric neutrinos in Super-
Kamiokande. Phys. Rev. D 74, 032002 (2006)
14. B. Pontecorvo, Reviews of topical problems: The neutrino and its role in Astrophysics. Sov.
Phys. Uspekhi 6, 1 (1963)
15. S.P. Mikheyev, A.Yu. Smirnov, Resonance enhancement of oscillations in matter and solar
neutrino spectroscopy. Yad. Fiz. 42, 1441 (1985)
16. L. Wolfenstein, Neutrino oscillations in matter. Phys. Rev. D 17, 2369 (1978)
17. K. Eguchi et al., (KamLAND Collaboration), First results from KamLAND: Evidence for
reactor antineutrino disappearance. Phys. Rev. Lett. 90, 021802 (2003)
18. S. Abe et al., (KamLAND Collaboration), Precision measurement of neutrino oscillation param-
eters with KamLAND. Phys. Rev. Lett. 100, 221803 (2008)
19. K. Ichimura, Recent results from KamLAND, in Proceedings of the 34th International Confer-
ence in High Energy Physics (ICHEP08), Philadelphia, 2008, eConf C080730 (2008). Avail-
able at https://arxiv.org/abs/0810.3448
20. http://www.hyper-k.org/en/physics/phys-hierarchy.html
21. A. Burrows, A brief history of the co-evolution of supernova theory with neutrino Physics, in
Proceedings of the Conference on the History of the Neutrino, eds. J. Dumarchez, M. Cribier,
and D. Vignaud. arXiv:1812.05612
22. R.M. Bionta et al., Observation of a neutrino burst in coincidence with supernova 1987A in
the Large Magellanic Cloud. Phys. Rev. Lett. 58, 1494 (1987)
23. K. Hirata et al., Observation of a neutrino burst from the supernova SN1987A. Phys. Rev. Lett.
58, 1490 (1987)
24. T. Loredo, D.Q. Lamb, Bayesian analysis of neutrinos observed from supernova SN 1987A.
Phys. Rev. D 65, 063002 (2002)
25. O.G. Benvenuto, J.E. Horvath, Evidence for strange matter in supernovae? Phys. Rev. Lett. 63,
716 (1989)
26. A. Burrows, J.M. Lattimer, Neutrinos from SN 1987A. Astrophys. J. Lett. 318, L63 (1987)
27. E. Greco et al., Indication of a pulsar wind nebula in the hard X-ray emission from SN 1987A.
Astrophys. J. Lett. 908, L45 (2021)
28. M. Ahlers, K. Helbing, C. Pérez de los Heros, Probing particle Physics with IceCube. Eur.
Phys. J. C 78, 924 (2018)
Chapter 10
Gravitational Waves
Physics has known about and dealt with wave phenomena for a long time. In fact,
Physics courses feature a variety of treatments of waves in fluids, electromagnetic
waves, and related subjects. On very general grounds we may define a “wave” as a
solution of a wave equation. This is not a mere tautology, but an accurate statement
that includes many possibilities related to the Physics of wave phenomena. More
specifically, a wave equation may contain a variety of terms, but two mandatory
ingredients are the second time derivative and the second spatial derivative of some
dynamical variable to be determined. Gravitational waves, a frontier opened only a
few years ago, are a true revolution in the understanding of compact objects and of
gravity itself. The wave equation treats temporal and spatial coordinates on the same
footing, and in its simplest form reads
$$\left( \frac{1}{v^2}\frac{\partial^2}{\partial t^2} - \nabla^2 \right) A = 0 \, , \qquad (10.1)$$
where the symbol $\nabla^2$, called the Laplace operator, represents the second-order spatial
derivatives, so, for example, $\nabla^2 = \partial^2/\partial x^2$ in Cartesian coordinates in one dimension
(Fig. 10.1). The wave of amplitude $A$ in (10.1) propagates with velocity $v$.
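For readers who prefer to see (10.1) in action, the following minimal Python sketch propagates a pulse with the standard second-order finite-difference (leapfrog) scheme; the grid, pulse shape, and boundary conditions are arbitrary choices for illustration only:

```python
# Minimal 1D finite-difference integration of the wave equation (10.1):
# u_tt = v^2 u_xx, discretized with the standard leapfrog stencil.
import numpy as np

nx, nt = 400, 150
v, dx  = 1.0, 1.0 / 400
dt     = 0.9 * dx / v                       # Courant condition v*dt/dx <= 1 for stability
r2     = (v * dt / dx) ** 2

x = np.linspace(0.0, 1.0, nx)
u_prev = np.exp(-((x - 0.5) / 0.05) ** 2)   # initial Gaussian pulse, initially at rest
u      = u_prev.copy()                      # simple first-order start (zero velocity)

for _ in range(nt):
    u_next = np.empty_like(u)
    u_next[1:-1] = (2 * u[1:-1] - u_prev[1:-1]
                    + r2 * (u[2:] - 2 * u[1:-1] + u[:-2]))
    u_next[0] = u_next[-1] = 0.0            # fixed ends
    u_prev, u = u, u_next

left  = np.argmax(u[: nx // 2])
right = nx // 2 + np.argmax(u[nx // 2:])
print(f"counter-propagating pulses near x = {x[left]:.2f} and x = {x[right]:.2f}")
```

The initial pulse splits into two halves travelling in opposite directions at speed $v$, exactly the behavior encoded in (10.1).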
A well-studied case is the electromagnetic wave. We know that the magnetic field
B and electric field E satisfy wave equations in vacuum and propagate with the speed
of light c, whenever they are produced by suitably accelerated electric charges. It
is always possible to express the resulting electromagnetic radiation in a series of
multipoles, the dipole being the lowest one that produces waves. The solution for a
wave far away from the charges that produced it is depicted in Fig. 10.2.
In the Newtonian theory of gravitation, the gravitational force does not lead to
anything like waves. In the Einstenian framework, where the space-time is a kind of
dynamical entity affected by the distribution of mass-energy as its source, the motion
of the latter produces something quite analogous to the electromagnetic case. The
mass dipole contribution, however, vanishes, because its first time derivative is the
total momentum of an isolated system, which is conserved. We conclude that there is
no dipole gravitational radiation, and the lowest mode must be quadrupole.

Fig. 10.2 A variable electric field induces a variable magnetic field with the same period, and
the two components satisfy a wave equation which propagates by “pushing” each other, with the
amplitude varying in the direction perpendicular to the propagation
The usual way of writing a wave equation for gravitational effects is to assume a
small (tensor) perturbation that describes the deformations of space-time in a fixed
gravitational background represented by the so-called Minkowski tensor ημν , which
is the solution for the gravitational field in the absence of matter-energy, i.e., we
focus on the radiation zone, far from the sources. This perturbation will be called
h μν . The full derivation consists in substituting ημν + h μν into the non-linear Einstein
equations and keeping the linear terms. The result is
$$\left( \frac{1}{v^2}\frac{\partial^2}{\partial t^2} - \nabla^2 \right) h_{\mu\nu} = 0 \, . \qquad (10.2)$$
This is a standard calculation and will not be repeated here. The reader may consult
[1], for example, for an appraisal of the derivation and its physical discussion, but
this is not crucial for our purposes. It is enough to observe that (10.2) has the form
of a wave equation for the quantity h μν , giving a deformation that propagates with
speed v transversely to the direction of the wave. The identification of v with the
speed of light c is almost immediate [2].
Adopting the x-axis as the propagation direction we may write, analogously to
the electromagnetic problem, a general factorized form for the amplitude:
$$h_{\mu\nu} = A_{\mu\nu}\, \exp(i k_\alpha x^\alpha) \, , \qquad (10.3)$$
where we have written $k_\alpha x^\alpha$ to indicate the scalar product of the 4-wave vector with
the position 4-vector. In four dimensions, this product has the well-known form $(\omega t - \mathbf{k}\cdot\mathbf{x})$.
The deformation of space-time $h_{\mu\nu}$, being orthogonal to the propagation, can be
further decomposed into two independent modes, conventionally denoted by + and
× (plus and cross). The associated amplitudes $h_+$ and $h_\times$ can be combined to yield
$A_{\mu\nu}$ according to
$$A_{\mu\nu} = h_+\, e^{+}_{\mu\nu} + h_\times\, e^{\times}_{\mu\nu} \, . \qquad (10.4)$$
Fig. 10.3 Polarization modes $h_+$ and $h_\times$. The direction of propagation (x-axis) is perpendicular to
the plane of the page. The (+) mode, on the left, “stretches” matter in the vertical direction while
it compresses it horizontally, and the (×) mode does the same, but at an angle of 45°, when the
wave passes. Any deformation $A_{\mu\nu}$ can be fully decomposed into these two modes, provided that
the amplitudes $h_+$ and $h_\times$ are determined
However, Einstein himself doubted the reality of this result for years (Fig. 10.1). There was a
lively discussion about whether the perturbations could be made to vanish with an
appropriate coordinate change, i.e., about whether the perturbative wave solutions
might actually be spurious, stemming from a choice of coordinate system. However,
around 1950, the reality of the waves had finally been demonstrated, even though
the prospects for detecting them were considered null or at best remote.
To appreciate the actual magnitude of this physical phenomenon, we can assume
that the motion of masses at the source occurs on a characteristic timescale τ , roughly
a measure of the time a mass takes to relocate within the system. On the other hand,
the quadrupole moment is approximately the product of the mass $M$ times the square
$R^2$ of the dimension of the system. Thus, we immediately find the estimate
$$\dddot{Q} \approx \frac{M R^2}{\tau^3} \approx \frac{M v^2}{\tau} \approx \frac{E_{\rm NS}}{\tau} \, , \qquad (10.6)$$
where $v$ is the speed of the masses in motion and $E_{\rm NS}$ is the energy associated with
the non-spherical part of the system. Moreover, for a self-gravitating system, we also
have $\tau \approx \sqrt{R^3/G M}$, and with it a characteristic angular frequency $\omega = 2\pi/\tau = 2\pi f$. We
can then estimate the resulting luminosity as
$$L_{\rm GW} \approx \frac{G^4}{c^5}\left(\frac{M}{R}\right)^{5} \approx \frac{G}{c^5}\,\frac{M^2 v^6}{R^2} \approx \frac{c^5}{G}\left(\frac{R_S}{R}\right)^{2} \left(\frac{v}{c}\right)^{6} \, , \qquad (10.7)$$
where $R_S = 2GM/c^2$ is the Schwarzschild radius of the source. More relevant for
experimental purposes is the dimensionless amplitude $h$ of the space-time deformation.
With the same degree of approximation as before, its value is
$$h = \frac{G\, E_{\rm NS}}{c^4\, r} \approx \frac{G\, \epsilon E_k}{c^4\, r} \, , \qquad (10.8)$$
where $\epsilon$ is the fraction of the kinetic energy $E_k$ actually emitted in waves. Numerically,
we have
$$h = 10^{-22} \left(\frac{E_{\rm GW}}{10^{-4}\, M_\odot c^2}\right)^{1/2} \left(\frac{1\ {\rm kHz}}{f_{\rm GW}}\right) \left(\frac{\tau}{1\ {\rm ms}}\right)^{-1/2} \left(\frac{15\ {\rm Mpc}}{r}\right) \, . \qquad (10.9)$$
For a binary system with total mass $M = M_1 + M_2$ and reduced mass $\mu = M_1 M_2 / M$,
moving on a circular orbit of radius $a$, the relevant quadrupole component oscillates as
$$Q_{xy} = Q_{yx} = \frac{1}{2}\,\mu a^2 \sin(2\Omega t) \, , \qquad (10.10)$$
where the orbital frequency $\Omega$ can be obtained from Kepler's third law $\Omega^2 = G M/a^3$,
noting that the frequency of the quadrupole is twice the orbital frequency $\Omega$. Therefore,
$$L_{\rm GW} = -\frac{dE}{dt} = \frac{G}{5 c^5}\,\mu^2 a^4 \Omega^6 \left[\, 2^5 \sin^2(2\Omega t) + 2^5 \cos^2(2\Omega t) \,\right] = \frac{32}{5}\,\frac{G}{c^5}\,\mu^2 a^4 \Omega^6 = \frac{32}{5}\,\frac{G^4 \mu^2 M^3}{c^5 a^5} \, . \qquad (10.11)$$
We see from this that the emission grows enormously as the semi-axis $a$ shrinks, that
is, in the advanced stages before the final merger. The total energy of the system is
$$E = \frac{1}{2}\left(M_1 a_1^2 + M_2 a_2^2\right)\Omega^2 - \frac{G M_1 M_2}{a} = -\frac{1}{2}\,\frac{G \mu M}{a} \, . \qquad (10.12)$$
As time goes by, gravitational radiation makes the orbit shrink at a rate obtained from
$$\frac{dE}{dt} = \frac{G \mu M}{2 a^2}\,\frac{da}{dt} \quad \Longrightarrow \quad \frac{da}{dt} = -\frac{64}{5}\,\frac{G^3 \mu M^2}{c^5 a^3} \, , \qquad (10.13)$$
where we have once again used Kepler's third law. Since the orbital frequency
changes at the relative rate $\dot{\Omega}/\Omega = -(3/2)\,\dot{a}/a$, we can integrate (10.13) to find the time for the binary
system to merge, assuming it starts from an initial semi-axis $a_0$, with the result
$$\tau = \frac{5}{256}\,\frac{c^5 a_0^4}{G^3 \mu M^2} \, . \qquad (10.14)$$
Finally, using the above expressions, we can estimate the dimensionless amplitude
of the resulting gravitational waves as
$$h = 5 \times 10^{-22} \left(\frac{M}{2.8\, M_\odot}\right)^{2/3} \left(\frac{\mu}{0.7\, M_\odot}\right) \left(\frac{f_{\rm GW}}{100\ {\rm Hz}}\right)^{2/3} \left(\frac{15\ {\rm Mpc}}{r}\right) \, , \qquad (10.15)$$
where the fiducial numbers correspond to a symmetric binary system in which each component
has mass $1.4\, M_\odot$, as is commonly assumed for neutron star binary systems
(Chap. 6). We find that a coalescing binary located in one of the galaxies of the Virgo
cluster would produce a signal comparable to the burst in (10.9). However, since many
closer binaries are known, the probability of detection, at least indirectly, was imme-
diately considered. This possibility prompted a study by Hulse and Taylor which led
to the 1993 Nobel Prize in Physics and which will be described in the following.
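A quick numerical illustration of (10.14) and (10.15): for a symmetric $1.4 + 1.4\,M_\odot$ binary, the sketch below gives the time to merge from an arbitrarily chosen separation of 100 km, and the strain at an assumed distance of 40 Mpc:

```python
# Numerical sketch of Eqs. (10.14) and (10.15) for a symmetric 1.4 + 1.4 Msun neutron
# star binary; the initial separation a0 and the source distance are illustrative choices.
G, c, Msun = 6.674e-11, 2.998e8, 1.989e30

M1 = M2 = 1.4 * Msun
M, mu = M1 + M2, M1 * M2 / (M1 + M2)                     # total and reduced mass

a0  = 1.0e5                                              # initial semi-axis: 100 km [m]
tau = 5.0 * c**5 * a0**4 / (256.0 * G**3 * mu * M**2)    # Eq. (10.14)
print(f"time to merge from a0 = 100 km: tau = {tau:.2f} s")

f_gw, r_Mpc = 100.0, 40.0                                # wave frequency [Hz], distance [Mpc]
h = 5e-22 * (M / (2.8 * Msun))**(2/3) * (mu / (0.7 * Msun)) \
          * (f_gw / 100.0)**(2/3) * (15.0 / r_Mpc)       # Eq. (10.15)
print(f"strain at {r_Mpc:.0f} Mpc and {f_gw:.0f} Hz: h ~ {h:.1e}")
```

The final plunge from 100 km lasts a fraction of a second, while the strain remains of order $10^{-22}$, numbers that set the scale for the detectors discussed below.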
The year 1974 brought a major stimulus to relativistic Astrophysics, when
J. Taylor and R. Hulse discovered a very particular binary system: while one of
the components is a pulsar, the other is also a compact object with similar mass.
Even though pulsations from the companion are not detected, it was finally identi-
fied as another neutron star. This identification became possible because the binary
was observed with high accuracy, first to determine the mass function $f(M_1, M_2, i)$
defined in (6.38), with $P_{\rm orb} = 2.79 \times 10^4$ s and $G M_\odot / c^3 = 4.925\,490\,947\ \mu{\rm s}$, a constant
with dimensions of time. All the quantities on the right-hand side of (6.38)
have been very accurately measured, and the pulses are seen as advanced or delayed
depending on whether the pulsar is approaching or receding from the observer, so
the radial velocity is also known via the Doppler effect. The whole system can be
completely determined by modeling the gravitational field of the companion. The
orbit is quite eccentric and inclined at about 45◦ with respect to the line of sight—
this is the angle i in (6.38). In summary, we have a very precise “clock” (the pulsar)
orbiting a companion with a strong gravitational field, and thus a variety of general
relativistic effects allowing an accurate knowledge of the system. A representation
of PSR 1913+16 and its companion is shown in Fig. 10.5 [4].
Now, with the help of the expression (10.15), we can immediately estimate the
gravitational radiation emitted, with frequency $f_{\rm GW} = 1.1 \times 10^{-5}$ Hz and amplitude
$h \sim 10^{-23}$. These values happen to be very low and their direct detection would be
impossible at the present time. But there is a way around this: since every aspect of the
system has been very accurately measured, (10.13) can be used to calculate the rate
of change of the semi-axis, and with it the change resulting from the passage of the
neutron star through the periastron, which results in a decrease of the orbital period by
a value much greater than the accuracy attained in the measurements, including errors
and uncertainties of various origins. Therefore, by monitoring the binary pulsar, the
accumulated change can be determined and the result compared with the prediction
(10.16), as shown in Fig. 10.6.

Fig. 10.5 Representation of PSR 1913+16 and its companion, another neutron star. The inferred
orbital parameters and masses are indicated [5, Fig. 2]
As mentioned above, Hulse and Taylor were awarded the Nobel Prize in Physics
in 1993 for their discovery of the binary pulsar and the work that demonstrated
the agreement of the decay of the orbit with the predictions of General Relativity,
indirectly confirming that gravitational radiation is being emitted from the system.
Deviations from General Relativity smaller than 0.1% are still possible, showing that
there is little room for alternative theories of gravitation, at least with regard to the
gravitational radiation emission produced in this kind of binary system. These results
were improved by an analysis of the system PSR J0737-3039A/B announced in 2003
[6], in which two pulsars move with an orbital period of 2.4 hours. The decay of the
orbit for that system was measured to be in agreement with General Relativity with
a precision of around 0.01%, and a few additional relativistic effects were measured
for the first time.
10.3 Gravitational Wave Detectors

Fig. 10.7 Direct detection of gravitational waves. Two masses $M_1$ and $M_2$ joined by a spring react
to a gravitational wave travelling perpendicular to the spring with a hypothetical + polarization,
like the one on the left in Fig. 10.3

Now that we have discussed the main sources of gravitational waves and we understand
the central idea of space-time deformation a little better, we can present the
principles of the detectors built since the beginning of the second half of the 20th
century, which have turned the observation of gravitational waves into a reality in
contemporary science. Basically, there are two ways to detect the passage of these
waves directly—recalling that the decay of the orbit of the system PSR 1913+16 is
an indirect approach. The first is to measure the deposition of energy transported
by the waves in a mass, through a measurement of the mechanical oscillations they
induce. The second is to monitor masses that are perturbed by the passage of the
waves, but where no deposition of energy is involved. These detectors are called
resonant masses and interferometers, respectively.
Consider the situation in Fig. 10.7, showing the simplest possibility for a resonant
detector. If the incident wave has proper frequency $\omega$ and the two-mass oscillator a
natural frequency $\omega_0$, the equation of motion for the amplitude $\xi$ is
$$\ddot{\xi} + \frac{\dot{\xi}}{\chi} + \omega_0^2\,\xi = -\frac{1}{2}\,\omega^2 L\, h_+\, e^{i\omega t} \, , \qquad (10.17)$$
where χ represents the friction, L is the distance between the masses, and the force
on the right-hand side stems from the incident wave, which “shakes” the oscillator.
The solution of this equation is well known and reads
$$\xi(t) = -\frac{\omega^2 L h_+}{2\left(\omega_0^2 - \omega^2 + i\omega/\chi\right)}\, e^{i\omega t} \, , \qquad (10.18)$$
that is, there is a resonance when $\omega \approx \omega_0$, where the maximum amplitude is $\xi_{\rm max} =
\frac{1}{2}\,\omega_0 \chi L h_+$. If we are to construct an actual detector, it helps to maximize the
product $\omega_0 \chi L$. In this way, besides choosing a material with a high $Q$ factor ($Q = \omega_0 \chi$) enabling
a clear measurement of the induced oscillations, the detector length $L$ must be as large as
possible; moreover, the greater the mass, the more energy will be absorbed.
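The response (10.18) is easy to explore numerically. In the sketch below, the bar length, wave amplitude, and quality factor $Q = \omega_0\chi$ are illustrative assumptions, not the parameters of any particular bar:

```python
# Response of a resonant-bar detector, Eq. (10.18): |xi| as a function of the wave
# frequency, peaking at xi_max = (1/2) omega0 * chi * L * h_plus.
import numpy as np

omega0 = 2 * np.pi * 1650.0        # bar proper (angular) frequency [rad/s]
Q      = 1.0e5                     # assumed quality factor
chi    = Q / omega0                # damping time [s]
L, h   = 1.5, 1.0e-21              # bar length [m] and assumed wave amplitude

def xi_amp(omega):
    return np.abs(-omega**2 * L * h /
                  (2.0 * (omega0**2 - omega**2 + 1j * omega / chi)))

for ratio in (0.90, 0.99, 1.00, 1.01, 1.10):
    w = ratio * omega0
    print(f"omega/omega0 = {ratio:.2f}:  |xi| = {xi_amp(w):.2e} m")
# At resonance |xi| = Q*L*h/2 ~ 7.5e-17 m, still a formidable measurement
# given the thermal noise of the bar.
```

The narrowness of the peak is the essential limitation of resonant masses: only waves very close to $\omega_0$ produce a measurable amplitude.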
Bearing this in mind, the first attempts to construct a resonant detector in the 1960s
used aluminum with a high $Q$ and had a typical proper frequency $\omega_0 \approx 1650$ Hz,
reaching a sensitivity of $h \sim 10^{-15}$. This last figure is surprisingly good, equivalent
to detecting an amplitude of the order of the proton radius with a bar of length 1.5 m,
but it is still insufficient to observe any naturally occurring events. The detectors
operated at room temperature, and the thermal noise was very difficult to control.
Fig. 10.8 Left: Joe Weber working with one of his resonant bars around 1965. Credit: Special
Collections and University Archives, University of Maryland Libraries. Right: Image of the detec-
tor Nautilus, in Italy, one of the bars operated 30 years later, with improvements in the electro-
mechanical transduction, the suspension, and the suppression of thermal noise due to cryogenic
operation at T 1 K, reaching sensitivities up to h ∼ 10−19 at the center of the (narrow) operating
band. Credit: Frascati Laboratories, INFN
A key feature of the interferometric detectors, which is in fact what permitted the actual detections
(see below), was the presence of two Fabry–Perot cavities, one on each arm of the interferometer. This way, the laser
light is made to travel back and forth many times, increasing the effective length
of the arm well beyond the actual, physical size of about 4 km, achieving effective
lengths greater than 1000 km. This is similar to what is done in the mirror houses of
amusement parks, in which images are multiplied using a configuration of parallel
mirrors [8].
This interferometric technique can detect extremely tiny oscillations. In fact, the
amplitude at the center of the band is $\Delta l = h\, l \sim 10^{-17}$ cm (!) for an arm length
of 4 km. This is equivalent to about $10^{-4}$ of a proton diameter. As stated earlier, inter-
ferometers are sensitive to all arriving waveforms, but in each range of frequencies
the dominant noise that complicates the detection has different origins. Figure 10.10
shows the sensitivity curves as measured and desired for the successive data runs of
the experiment known as the Laser Interferometer Gravitational Wave Observatory
(LIGO). One can see that the curve improved in each run, allowing better sensitivi-
ties, and the collaboration hopes to reach the lower curve with an advanced design.
The best frequency, where the interferometer is most sensitive, is around 100 Hz,
where a compromise between noise sources is attained. Above that frequency, the
curve rises due to the uncertainty in the emission of photons by the laser (called
shot noise) and the sensitivity gets progressively worse at the highest end. On the
other hand, below about 30 Hz, the continual seismic vibrations of the Earth’s crust
become important, and although they can be attenuated using sophisticated mechan-
ical systems, they nevertheless limit detection at the lowest end. Therefore, the most
reasonable strategy is to try to improve performance in measuring events around the
center of the band, leaving the high and low frequencies as they are. This has indeed
been the general decision made by the LIGO collaboration during the development
of the present setup.
Bearing in mind our previous discussion of binaries, we can see from (10.11) that
the emitted luminosity in gravitational waves is proportional to the reciprocal of the
semi-axis a to the fifth power. This means that the majority of inspiraling binary
systems will in general be far from merging and will emit low frequencies (recall
Kepler's third law $\Omega^2 = G M/a^3$). A successful detection of these systems using an
interferometer would need extremely long arms, since the frequency to which the
instrument responds is inversely proportional to the arm length, as well as very low
noise perturbations. For these reasons, the best solution seems to be the construction
of a space interferometer.

Fig. 10.10 Sensitivity curves of the LIGO experiment [9]. The data run S6, achieving a sensitivity
of $h < 10^{-22}$ around 100 Hz, has now been improved thanks to an upgrade of several components.
Various “peaks” in the curve due to several mechanical resonances of the system (not fully identified
even using the best available numerical simulations) are not shown in this figure. The collaboration
operates the interferometers near the “design” goal in black. This has been sufficient to detect several
mergers (see text). Credit: B.P. Abbott et al., Living Reviews in Relativity 19, 1 (2016)

Fig. 10.11 Orbit planned for the eLISA space interferometer, designed to minimize solar perturbations
and obtain a wide coverage of the sky. Credit: Cardiff University, Physics and Astronomy
Outreach [11]
Figure 10.11 shows the main features of the project known as the evolved Laser
Interferometer Space Antenna (eLISA) under study by the European Space Agency.
One interferometer with a “mother” satellite and two “daughters” set on an equilateral
triangle of side about 1 million km orbiting 50 million km from the Earth will be
capable of detecting binaries with emission centered around 10−2 Hz by monitoring
the relative motions of the three spacecraft (Fig. 10.12). This is provided that a range
of perturbations like the solar wind can be kept under control. The accuracy required
for these measurements involves measuring the position of each satellite with an error
of the order of $10^{-14}$ cm, which seems reasonable with the existing technology. Note
that the projected space interferometer, of the Michelson type, will not carry mirrors
in the “daughter” satellites, because the laser beam would be too weak to be reflected.
The idea is to retransmit the signal from each satellite in active form. At the moment,
tests are being conducted for a time horizon beyond 2030 [10]. Other missions and
experiments designed to probe the lowest frequencies are shown in Fig. 10.12.
The initial operation of the two twin LIGO interferometers and later the French–
Italian project Virgo (Fig. 10.13) raised expectations because the initial sensitivity
curves already shown in Fig. 10.10 were encouraging for the detection of events, pro-
vided that they occurred during data runs with enough intensity. The long-awaited day
finally arrived on September 14, 2015, when the two LIGO interferometers detected
a simultaneous signal, the first identified as such, and were able to discard a noise
origin. Hence began the era of gravitational wave Astrophysics [13]. Figure 10.14
shows the data. The probability of a mere coincidence (random fluctuation) rather
than a real event is extremely low, with a confidence level of more than 99.99994%,
i.e., a statistical significance greater than 5σ .
Even though there was no doubt that the event originated from a binary merger, a
comparison with theoretically simulated templates of the waveform had to be carried
out to extract the parameters characterizing the individual members and the event in
general. The accepted interpretation is that the event was produced by a merger of
two black holes of stellar mass, giving rise to a more massive object and radiating the
excess energy away mainly in the form of gravitational waves. This is the conclusion
from the analysis of the gravitational waveform in the approaching stage, yielding
the quantity known as the chirp mass $\mathcal{M}$, which is a combination of the individual
masses.

Fig. 10.13 Aerial views of the LIGO interferometers in Livingston (Louisiana) and Hanford (Washington),
top left and top right, respectively. Lower: the Virgo interferometer in Cascina (Italy). Credits:
Caltech/MIT/LIGO Lab and Virgo Collaboration

From the data for this first event GW150914, $\mathcal{M} \approx 30\, M_\odot$. On the other hand, the
sum of the masses must satisfy $M_1 + M_2 > 70\, M_\odot$ in order to accurately match
the post-merger stage. The individual masses came to $M_1 = 36^{+5}_{-4}\, M_\odot$ and $M_2 =
(29 \pm 4)\, M_\odot$. Only two black holes could have produced this collision.
a few similar events were detected (for example, GW151226 in Fig. 10.14), and
it is currently possible to monitor a population of binary black holes merging at
cosmological distances.
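For two point masses, the chirp mass is the standard combination $\mathcal{M} = (m_1 m_2)^{3/5}/(m_1+m_2)^{1/5}$ (quoted here as a reminder; the text does not display it explicitly). A quick check with the masses just mentioned:

```python
# Chirp mass from the standard combination M_chirp = (m1*m2)**0.6 / (m1+m2)**0.2,
# evaluated for the component masses quoted in the text.
def chirp_mass(m1, m2):
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2

print(f"GW150914: M_chirp = {chirp_mass(36.0, 29.0):.1f} Msun")   # ~28 Msun, i.e., ~30
print(f"GW170817: M_chirp = {chirp_mass(1.36, 1.36):.2f} Msun")   # ~1.18 Msun
```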
From the same analysis of the waveform it is inferred that, after the merger of
GW150914, the product had a mass of $M_f = (62 \pm 4)\, M_\odot$, and therefore $(3 \pm
0.5)\, M_\odot c^2$ was radiated in gravitational waves. Finally, it was concluded that the spin
of the resulting black hole reached about 2/3 of the maximum possible value, a likely
consequence of the partial transfer of the orbital angular momentum of the progenitor
black holes. A Schwarzschild black hole without spin would not match the observed
waveform. The spatial velocity of the colliding black holes near the time of the
merger was greater than $0.5c$. Finally, from the observed luminosity it was possible
to calculate the luminosity distance to the event, with the result $440^{+160}_{-180}$ Mpc, that is,
a cosmological distance scale. Even if this was the first event of this type registered
in the history of modern Astrophysics, the results showed that the visibility of this
class of mergers reaches a considerable fraction of the whole observable Universe.
Given the continuity of the observations, the detection of events similar to
GW150914 was not surprising. Figure 10.15 depicts the set of such events observed
by the end of 2020. Black hole mergers were not expected to be frequent, and in
fact the strongest bet of the community was first to detect the merger of two neu-
tron stars. However, it is also worth noting that the individual masses of the black
holes in merging systems are much higher than those inferred for these objects in
nearby binary systems (see Fig. 6.29). In fact, the individual colliding masses are on
average greater than 20M . This fact suggests that, in the early Universe, the stars
that originated this population should have high masses. This is consistent with the
predictions of the theory of Stellar Evolution, which indicate the formation of stars
with $M \geq 100\, M_\odot$ whenever the metallicity is very low. This observation shows how
important it was to open this new window on the Universe, and everything started
with the detection of GW150914 discussed here.
After the confirmation of the events discussed above, it became legitimate to ask
where the “most likely” events expected by the community were, that is, the mergers
occurring in neutron star–neutron star binaries. Prior to the first science runs, the
vast majority of astrophysicists strongly believed that their detection was imminent,
but this is not what happened. However, almost two years after the detection of
GW150914, an event showing the features of a NS–NS merger was finally recorded
[14]. Judged from the point of view of the far-reaching consequences of simultaneous
multiple observations, it was worth waiting for, as we shall see.

Fig. 10.15 Map of the events detected by the LIGO/Virgo collaboration by the end of 2018. The
first announced, GW150914, is the first in blue on the left. Note the difference from the typical masses
of black holes in binary systems in our galaxy (purple dots). Credit: LIGO-Virgo/Frank Elavsky,
Aaron Geller/Northwestern
On August 17, 2017, all three LIGO/Virgo interferometers simultaneously detected
a strong signal. In addition, the gamma-ray satellites FERMI and INTEGRAL saw
a short gamma-ray burst, lasting around 2 s and located at the periphery of the galaxy
NGC 4993. From the chirp mass waveform measurements, it was inferred that two
compact objects with total mass $M_1 + M_2 = 2.74^{+0.04}_{-0.01}\, M_\odot$ had collided and merged,
although their individual masses could not be determined. However, using the same
waveform and numerical modeling, the components were found to lie in the intervals
$M_1 \in [1.36, 1.60]\, M_\odot$ and $M_2 \in [1.17, 1.36]\, M_\odot$, ignoring the effects of spin
(the waveform does not indicate any evidence for non-zero spin). Note that, even
though it is not mandatory, it seems likely that the two individual masses are actually
equal, with a value of $1.36\, M_\odot$ (although this is not guaranteed and has been disputed
[15, 16]). The same type of analysis showed that the event is consistent with a merger
of two neutron stars, and not with another type of event, such as a NS–BH merger.
Fig. 10.16 Left: Detected signal from the event GW170817 (inset) and the localization of the same
in gamma-rays (center), accompanied by detections in the UV, infrared, and radio (lower) achieved
by more than 60 instruments. Credit: Robert Hurt (Caltech/IPAC), Mansi Kasliwal (Caltech), Gregg
Hallinan (Caltech), Phil Evans (NASA) and the GROWTH collaboration. Right: Optical images
obtained by the Hubble Space Telescope on August 22, 26, and 28, 2017, showing the fading optical
magnitude of the associated transient. Credit: NASA and ESA: A. Levan (U. Warwick), N. Tanvir
(U. Leicester), and A. Fruchter and O. Fox (STScI)
The luminosity distance inferred from the gravitational-wave signal, about 40 Mpc, agrees
with the distance to NGC 4993 derived from its redshift of 0.0099. This therefore proved that any event occurring in the Virgo
galaxy cluster, residing at half this distance, will be detectable in the future. Since
events of this type should also occur at greater distances, it will be possible to give
an independent determination of the Hubble constant H0 with an uncertainty less
than 2% within 5–10 years, since these mergers are “standard sirens” that can be
calibrated. Other tests of gravitation are also possible, including those that aim to
detect the effects of alternative theories of gravitation in the waves. For example, from
the relative delay between the gravitational signal and the electromagnetic burst, it
was inferred that gravitational effects and photons travel with the same velocity c,
with a maximal admissible difference of order $10^{-15}$. Finally, the study of the merger
itself can bring new information about this and future events, since the observation
of a non-zero signal almost 2 s after the merger (Fig. 10.16) showed that the black hole
did not form immediately. The existence of an intermediate transient state, possibly
a supermassive neutron star temporarily supported by rapid rotation and viscosity, brings an
opportunity to study the behavior of matter in this extreme regime. In short, we are
witnesses of a revolutionary epoch in Astrophysics with very encouraging prospects
in this area in the near future.
References
1. A. Pais, Subtle Is the Lord: The Science and the Life of Albert Einstein (Oxford University
Press, Oxford, 2005)
2. R. Matzner, Introduction to Gravitational Waves (Springer, Dordrecht, Netherlands, 2010)
3. K. Thorne, Gravitational-wave research: Current status and future prospects. Rev. Mod. Phys.
52, 285 (1980)
4. J.M. Weisberg and J.H. Taylor, The relativistic binary pulsar B1913+16: Thirty years of obser-
vations and analysis, in Binary Radio Pulsars, ASP Conference Series 328, eds. F.A. Rasio
and I.H. Stairs (2005), p. 25
5. J.A. Batlle, R. Lopez, Revisiting the border between Newtonian mechanics and General Rela-
tivity: The periastron advance. Contrib. Sci. 10, 65–72 (2014)
6. M. Burgay et al., An increased estimate of the merger rate of double neutron stars from obser-
vations of a highly relativistic system. Nature 426, 531 (2003)
7. J. Weber, Evidence for discovery of gravitational radiation. Phys. Rev. Lett. 22, 1320 (1969)
8. P.R. Saulson, Interferometric gravitational wave detectors. Int. J. Mod. Phys. D 27, 1840001
(2018)
9. B.P. Abbott et al., Prospects for observing and localizing gravitational-wave transients with
Advanced LIGO and Advanced Virgo. Living Rev. Relativ. 19, 1 (2016)
10. A. Blaut, Parameter estimation accuracies of Galactic binaries with eLISA. Astropart. Phys.
101, 17 (2018)
11. https://blogs.cardiff.ac.uk/physicsoutreach/gravitational-physics-tutorial/evolved-laser-
interferometer-space-antenna-elisa/
12. C.J. Moore, R.H. Cole, C.P.L. Berry, Gravitational-wave sensitivity curves. Class. Quant. Grav.
32, 015014 (2015)
13. B.P. Abbott et al., Observation of gravitational waves from a binary black hole merger. Phys.
Rev. Lett. 116, 061102 (2016)
14. B.P. Abbott et al., GW170817: Observation of gravitational waves from a binary neutron star
inspiral. Phys. Rev. Lett. 119, 161101 (2017)
15. J.E. Horvath, The binaries of the NS–NS merging events, in Proceedings of the Xiamen-
CUSTIPEN Workshop on the EOS of Dense Neutron-Rich Matter in the Era of Gravitational
Wave Astronomy, AIP Conf. Proc. 2127 (2019), p. 020015
16. R.D. Ferdman et al., Asymmetric mass ratios for bright double neutron-star mergers. Nature
583, 211 (2020)
17. B.P. Abbott et al., Gravitational waves and gamma-rays from a binary neutron star merger:
GW170817 and GRB 170817A. Astrophys. J. Lett. 848, L13 (2017)
18. S. Covino et al., The unpolarized macronova associated with the gravitational wave event GW
170817. Nature Astron. 1, 791 (2017)
Chapter 11
Gamma-Ray Bursts
As a result of the mutual distrust between the United States and the Soviet Union
during the Cold War years, the former began launching a series of satellites carry-
ing instruments to detect clandestine nuclear tests made by their socialist rival. This
series, called the Vela satellites, operated for several years and managed to detect
several short-lived bursts of gamma-rays. But the surprise was that, instead of
originating on Earth, they appeared to be arriving from space, and they did not have
a solar origin. As they were part of a secret military project, these data were only made
known to the public in 1973, when the project was declassified. Klebesadel, Strong, and
Olson [1] then announced the discovery and discussed the origin of these bursts in the
Astrophysical Journal Letters (Fig. 11.1). The theory and observations of GRBs are
summarized in this chapter.
The accumulation of data showed that the arrival directions of the photons and
the light curves were unpredictable, with regard to both the duration of the event and
its variability (Fig. 11.2). The spectra, however, showed a certain regularity, a fact
interpreted as evidence for a generic emission mechanism.
An in-depth study collecting together thousands of events was necessary to answer
these questions and to identify the origin of the bursts. The Compton Observatory
mission was launched in 1991 and continued until 2000 with four complementary on-
board instruments: OSSE (a directional scintillator, capable of measuring between
50 keV and 10 MeV), COMPTEL (a two-layer Compton telescope, somewhat sim-
ilar to an optical camera with sensitivity between 750 keV and 30 MeV), EGRET
(a spark chamber to detect the highest energies, between 30 MeV and 30 GeV), and
BATSE (Burst and Transient Source Experiment, an arrangement of 8 modules
capable of locating a burst and measuring its spectrum between 20 keV and
8 MeV). All the instruments performed very well, achieving sensitivities more
than sufficient to measure weak bursts and to detect the expected flattening of the
burst distribution toward the galactic plane. But this was not what happened: to
Fig. 11.1 One of the gamma-ray bursts detected by the Vela 6A satellite and revealed in 1973 [1].
The background count went from around 20 to 600 photons/s and decreased again in less than 10 s,
as indicated by the arrow. © AAS. Reproduced with permission
Fig. 11.2 Light curves of bursts with varying complexity and duration, compiled from data in [2].
The presence of very short time scales (τ_min ∼ 0.01–0.1 ms) in many events suggested that neutron
stars might be candidates for producing the bursts: as we saw in Chap. 3, such short time scales imply
an emitting region of size R ≤ cτ_min ≈ 3–30 km. The high-energy processes around neutron
stars seemed a likely source. The measured fluences indicated released energies of up to 10⁴² erg in
gamma rays if the emission was assumed to be isotropic (the fluence is the flux integrated over the
duration of the event). But in the data it was not possible to distinguish any spatial bias in the burst
distribution, as would be expected for a distribution of neutron stars associated with the galactic
disk—or with a more extended region, since we saw that high proper motions are measured for many of
these objects, suggesting that they could partially populate the halo in a few million years.
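As a quick numerical check of the causality argument in the caption (a minimal Python sketch, not from the original text), the size limit R ≤ cτ_min can be evaluated for the quoted variability time scales:

# Causality limit on the size of the emitting region, R <= c * tau_min.
c = 3.0e10                      # speed of light [cm/s]
for tau_ms in (0.01, 0.1, 1.0): # variability time scales [ms]
    R_km = c * (tau_ms * 1e-3) / 1e5
    print(f"tau_min = {tau_ms:5.2f} ms  ->  R <= {R_km:6.0f} km")
# 0.01 ms -> 3 km; 0.1 ms -> 30 km; 1 ms -> 300 km

Millisecond and sub-millisecond variability thus points to regions no larger than a few hundred km, the scale of compact objects.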
the surprise of the whole Astrophysics community, the spatial distribution of the
bursts never showed any anisotropy: the full catalog published by the
collaboration [2] is perfectly consistent with an isotropic distribution, with no
flattening toward the galactic plane at all (Fig. 11.3).
When the correlation of duration with hardness (Fig. 11.4) became clear, it was
natural to assume that the two classes originated in different phenomena, since
hardness reflects the way the energy is released. However, another key element,
the spectra, showed no clear difference. This was interpreted as due to a quasi-
universality of the explosive phenomenon which depended weakly on the specific
source. In fact, in Fig. 11.5 we can see a typical spectrum and the so-called Band
parametrization (see [3] and references therein). The latter is not based on any theory,
but well reproduces the vast majority of spectra of short and long bursts. Thus, any
successful theoretical model should produce a spectrum compatible with the Band
parametrization.
Fig. 11.3 Data compiled from the final BATSE catalog [2] showing a statistically isotropic spatial
distribution, a fact that contradicts local neutron stars as sources. Starting with the announcement
of this result in 1993, and with each extension of the catalog with new bursts, it became increasingly
difficult to imagine a galactic origin, even for objects located in an extended halo—note that, if so,
there should be an excess of events in the direction of the Andromeda galaxy, but this was never
observed. Of course, there is one distribution of sources that would be "naturally" isotropic around
us: the cosmological one. But this in turn would require a gigantic luminosity, greater than 10⁵⁰ erg
in gamma rays alone, and it was not obvious how to reach this scale, nor how the gammas could escape
the source without being degraded to lower frequencies (see below). The other important
discovery was that the duration of the bursts separated them into (at least) two classes: the "short
bursts" of up to about 2 s, and the "long bursts", with durations of 10 s or more (see next figure).
Credit: Michael Briggs, NASA [4]
The Band parametrization is a smoothly broken power law, N(E) ∝ E^α exp(−E/E₀) below
the break energy (α − β)E₀ and N(E) ∝ E^β above it, where α ≈ −1, β ∈ [−3, −2], and
E₀ ≈ 150 keV. The value of the exponent β is the most variable, and can reach −3 in
cases where the spectrum falls off very quickly with energy (Fig. 11.5).
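A minimal numerical sketch of this parametrization (in Python; the normalization and the 100 keV pivot energy are illustrative assumptions):

import numpy as np

def band(E_keV, alpha=-1.0, beta=-2.5, E0=150.0, A=1.0):
    """Band parametrization N(E): a low-energy power law with an
    exponential cutoff, joined continuously at E_b = (alpha-beta)*E0
    to a steeper high-energy power law."""
    E = np.asarray(E_keV, dtype=float)
    Eb = (alpha - beta) * E0
    low = A * (E / 100.0) ** alpha * np.exp(-E / E0)
    high = (A * (Eb / 100.0) ** (alpha - beta) * np.exp(beta - alpha)
            * (E / 100.0) ** beta)
    return np.where(E < Eb, low, high)

for E in (10.0, 100.0, 225.0, 1000.0, 10000.0):   # energies [keV]
    print(f"E = {E:8.0f} keV:  N(E) = {float(band(E)):.3e}")

For α = −1 and β = −2.5 the break sits at (α − β)E₀ ≈ 225 keV, separating the two power-law regimes.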
All the ingredients were now available for the elaboration of theoretical models
that would explain the nature of the bursts and reproduce the observations of light
curves and spectra. Even before the distance scale was confirmed, and given the
widespread belief that the bursts were produced by nearby neutron stars, more than
100 theoretical models were published, including comets falling onto neutron stars,
magnetic instabilities, and many other ideas [6]. But with only gamma-ray data and
no other information, it was impossible to make any real progress in this matter.
Accurate localization of the bursts was essential in order to establish the distance
scale, and with it the energy. Gamma-ray data take time to process, and the error
circle was typically at least 1°, insufficient for quick follow-up by optical telescopes.
Despite the delay of almost a decade, the launch of the Italian–Dutch BeppoSAX
mission, which carried a wide field camera operating between 2 and 30 keV, was
fundamental to establish the location of the bursts. In 1997 BeppoSAX detected a
gamma burst (GRB 970508) and located it with an accuracy of a few arcmin [9]
thanks to the X-ray detector’s ability to collimate the incoming radiation and refine
its spatial origin (Chap. 3). Some 5 hr after the burst, an optical transient of magnitude
around 20 was detected, which rapidly faded. The host galaxy was thus identified
and the extragalactic origin of the bursts finally demonstrated (Fig. 11.6).
It was thus clear that we were dealing with a gigantic energy scale, and that nearby
neutron star models were inadequate by several orders of magnitude. The magnitude
of the inconsistency can be quantified by stating the problem of compactness. The
detected spectra were clearly non-thermal (Band-type). Thus, the medium had to be
transparent to gamma photons, τ̂ < 1, otherwise something like a black body would be
observed. But from the observed variability, it was also clear that the sources were
highly compact, and if they were extragalactic, energies above 10⁵⁰ erg in such
compact dimensions would imply opacities τ̂ ≫ 1. At first analysis, the spectra and
the compactness were thus incompatible.
Although these problems seem insoluble at first sight, some studies quickly
found a "kinematic" way out of the problem, one that remains generally accepted
today: the ejected material that emits the gammas moves with ultra-relativistic
speed. If this is the case, the material is transparent in the proper reference frame,
since the optical depth τ̂ is reduced by kinematics to
τ_eff = τ̂ (ε_max/m_e c²)^(α−1) γ^(−2(α−1)) ,   (11.3)

where ε_max is the maximum photon energy and the α index varies for each burst, but
generally oscillates between 2 and 3. Equation (11.3) shows that τeff can be made less
than unity if the Lorentz factor is γ > 100, which compensates for the high value
of τ̂ . For comparison, the AGN jets discussed in Chap. 8 have Lorentz factors 5–10
[10].
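To get a feel for the numbers (a Python sketch of Eq. (11.3); the value ε_max = 100 MeV is an assumed illustration, not from the text):

# Kinematic reduction of the opacity, Eq. (11.3):
# tau_eff/tau_hat = (eps_max/(m_e c^2))**(alpha-1) * gamma**(-2*(alpha-1))
me_c2 = 0.511      # electron rest energy [MeV]
eps_max = 100.0    # maximum photon energy [MeV] (assumed)
alpha = 2.0        # spectral index (text: between 2 and 3)

for gamma in (10, 100, 1000):
    reduction = (eps_max / me_c2) ** (alpha - 1) * gamma ** (-2 * (alpha - 1))
    print(f"gamma = {gamma:5d}:  tau_eff / tau_hat = {reduction:.1e}")
# gamma = 100 suppresses the opacity by a factor ~2e-2,
# and gamma = 1000 by ~2e-4, for these parameters

so a flow with γ ≳ 100 can turn a naively opaque source into a transparent one.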
Kinematically an ultra-relativistic flow is observed in the laboratory frame with a
kinematic beaming factor dependent on γ (already shown in Fig. 2.13), sometimes
also called the emission cone. This should not be confused with a geometric colli-
mation (the latter may also happen in the proper frame), but has an important effect
on the evaluation of the emitted energy: if the emission cone is small, since its open-
ing angle is θ ∼ γ⁻¹, the actual emitted energy is much less than its isotropically
estimated value. Frail et al. [11] found that the GRBs of the BATSE sample cluster
around a common energy of about 10⁵⁰ erg, and concluded that the emission
cone angles are less than 5°. This is why we often speak of the equivalent isotropic
energy: the value that would be appropriate if there were no beaming and the emission
were genuinely isotropic. These equivalent isotropic energies can reach 10⁵⁴ erg. But
there is yet another consequence that "alleviates" the energy problem: since most of
the bursts do not emit in our direction, their actual number must be of the order of
100 times what is observed (!).
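The arithmetic of the beaming correction is simple (a Python sketch; the isotropic-equivalent energy used is an assumed round number):

import math

# Beaming correction: for a double-sided jet of half-opening angle theta,
# E_true = E_iso * (1 - cos(theta)), and the same factor gives the
# fraction of randomly oriented bursts that point at us.
theta = math.radians(5.0)        # cone half-angle quoted in the text
f_beam = 1.0 - math.cos(theta)   # ~3.8e-3

E_iso = 1.0e53                   # isotropic-equivalent energy [erg] (assumed)
print(f"beaming fraction : {f_beam:.1e}")
print(f"true energy      : {E_iso * f_beam:.1e} erg")   # ~4e50 erg
print(f"bursts missed per burst seen : {1.0 / f_beam:.0f}")

This reproduces the order-of-magnitude statements above: true energies come down to ∼10⁵⁰ erg, and the actual event rate is some two orders of magnitude above the observed one.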
To explain these features within a theoretical model, the so-called fireball model
was put together, with contributions from several scientists over a decade. Essentially,
the fireball model requires an event that injects almost pure “radiation” jets (i.e.,
with little baryonic content, to allow a large Lorentz factor, as required to lower the
opacity) in an episodic manner. Episodic ejection ensures that each ejected bubble
has its own Lorentz factor, and when the fastest ones reach the slowest ones ahead,
internal shocks are produced. These shocks produce the observed gamma-rays, since
the opacity is reduced according to (11.3). When this jet collides with the ISM, it
decelerates, and when the non-relativistic regime is reached, it opens geometrically
letting out softer radiation, from X-rays to radio, which constitutes the so-called
afterglow of the bursts. This scenario is shown in Fig. 11.7.
The most convincing proof of the kinematic effect of the opening leading to
the afterglow is the so-called spectral break observed in many events. Figure 11.8
illustrates this observation and its interpretation. We must point out, however, that
some bursts do not present any spectral break. It is not clear whether these are truly
isotropic, but if so, they should contain energies up to 1000 times greater than the
average. A wide variety of light curve behaviors is still present in the sample, possibly
reflecting variations on the basic fireball scenario.
Fig. 11.7 Fireball model. Two types of event produce a “central engine”, namely, a merger (short
GRB) and a collapsar (long GRB). When the ultra-relativistic jets are ejected, the internal and exter-
nal collisions produce the gamma-ray burst and the afterglow, respectively. Credit: NASA/Goddard
Space Flight Center/ICRAR
Fig. 11.8 The spectral break is the moment when the light curve changes slope in time. Left: The
break in the GRB 140903A event, which is achromatic, contrary to most of the cases [12]. Right:
The break time is assigned to the moment when the jet becomes non-relativistic and suddenly opens
wide, since θ ∼ γ −1 . This explains the more isotropic character of the afterglows when compared
with the gamma emission
Finally, there is the question of the “event” (engine) itself, which needs to pro-
duce an ultra-relativistic flow with “short” or “long” duration as observed. After
much discussion of this issue, the basic consensus is that there are two prominent
candidates: the merger of two neutron stars and the collapse of a very massive star to form a black
hole. In both cases, the intermediate configuration is practically the same, namely,
a black hole with a transient accretion disk, as shown in Fig. 11.7. In the case of a
neutron star merger, numerical simulations show that a duration up to 2 s is reason-
able for this event (although the efficiency of the energy conversion to gamma-rays
obtained in the jet remains somewhat uncertain). In the case of collapse (also called
a hypernova), it seems that each time a black hole is formed there will be an associ-
ated “long” burst, produced by the injection of a jet perpendicular to the disk plane.
These models need the jet to “punch” the envelope of the collapsing star, and it is not
entirely clear how this happens. But the important thing is that these two events are
consistent with the temporal bimodality of the bursts, produce an analogous inter-
mediate state (black hole plus disk, but with different duration), and launch a fireball
that emits the gamma-rays and produces afterglows in most cases. As we will see
below, certain observations suggest that this identification is feasible and that at least
these two basic models can explain the events, although new scenarios may emerge
to give more specific features in some subset of the events [13].
Like any theoretical construction, the fireball model produced by the merging of
neutron stars or hypernova events requires factual confirmation, and it is this evidence
that we will discuss here. As we saw in the previous chapter, the event GW 170817
proved that, in addition to the gravitational signal, a short-lived gamma burst is
produced [14]. Other previously known events already pointed in this direction:
GRB 130603B was a short burst with the emergence of infrared radiation one week
after the gamma event. This was interpreted as evidence that the ejected material
had produced lanthanide nuclei of high opacity, a fact that corresponds very well
to expectations when two neutron stars merge, as simulated theoretically. Thus, we
can speak of a validation of the merger model, and future events of this type should
confirm this idea.
On the other hand, there is also evidence that the hypernova model works in
practice for “long” bursts. This conclusion stems from the existence of events where
a long burst is first detected, while the spectrum is observed to “transform” after
several hours into a corresponding supernova (of type Ib or Ic, i.e., a collapse where
the progenitor has lost its envelope, discussed in Chap. 5). The first known case was
that of GRB 980425, followed by the associated SN 1998bw, which had ejection
speeds of about 30 000 km/s. Today, some forty cases of this type are known, and
the conclusion is that the “hypernova” model really works for the production of a
burst, although it is also clear that a subset of the “long” bursts does not have any
associated supernova (and this would be the case if they could also be produced by mergers).
Fig. 11.9 GRB 090423 (left) and its location in the expansion diagram of the Universe (right),
preceding most star formation (inside the dark era, before the massive formation of stars and galaxies,
as indicated). Credit: Swift X-ray Telescope NASA and NASA/WMAP
It is not yet clear to what extent they can be used for these purposes, although this
has not prevented work from being done along these lines.
One lesson from the history of Astronomy is that novel observational techniques
and strategies have great potential for discovery, and often lead to completely new
and unexpected phenomena. This was certainly the case for GRBs and many other
objects, and also for the more recent case of fast radio bursts (FRB).
The first confirmed report of such an event was a short, isolated pulse at
radio frequencies (see Fig. 11.10, upper panel), of millisecond duration, observed by
Lorimer and collaborators [16], although other events may have been seen
earlier by other groups. After more events were collected, it turned out that the typical
emission frequencies are around 1 GHz, possibly extending down to a few hundred MHz.
The frequency structure of the pulse across the emission band is also important, since it
allows an estimate of the distance to the source, as we shall see below. Because of their
short duration, the rate of these events can be very high and still go unnoticed, as already
pointed out in [16]. Careful examination thus revealed a hidden phenomenon.
The key quantity is the dispersion measure, DM = ∫₀^d n_e dl, i.e., the integrated
electron density along the line of sight up to the source at a distance d. The pulse is
dispersed by the intervening electron clouds, and the resulting delay between the limiting
frequencies of the band can be measured directly, allowing a determination of the
DM. With this measurement, it became clear that most of the events are extragalactic.
In some cases, DM ≥ 2000 pc cm⁻³ has been inferred, corresponding to cosmological
distances. Therefore, the energy scale and the spatial distribution should correspond
to a cosmological population (Fig. 11.10).
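The delay across the band follows the standard cold-plasma dispersion law, t(ν) = 4.15 ms × DM × (ν/GHz)⁻², with DM in pc cm⁻³. A minimal Python sketch (the band edges are chosen for illustration):

# Dispersion delay between the bottom and top of the observing band.
K = 4.15  # dispersion constant [ms GHz^2 per pc cm^-3]

def delay_ms(dm, nu_lo_ghz, nu_hi_ghz):
    """Extra arrival delay of the low-frequency edge relative to the high one."""
    return K * dm * (nu_lo_ghz ** -2 - nu_hi_ghz ** -2)

for dm in (100, 500, 2000):   # [pc cm^-3]
    print(f"DM = {dm:5d}:  delay(1.2 -> 1.5 GHz) = {delay_ms(dm, 1.2, 1.5):7.1f} ms")

For the largest inferred DMs the sweep across a typical 1.4 GHz receiver band lasts about two seconds, which is how the millisecond pulses are recognized and their DM measured.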
It is clear from this evidence that the FRBs are extremely luminous, although
the total energy released is not that huge because of their short duration. A comparison
with other known transient and steady sources may be attempted in terms of
a spectral luminosity vs. duration diagram (Fig. 11.11), in which the vertical axis
is just the luminosity divided by the bandwidth of the emission. FRBs occupy the
upper left region, with a spectral luminosity comparable to the AGNs of Chap. 8,
and typically higher than GRBs and supernovae. Because of this feature and the
associated brightness temperature, it is agreed that they are an outstanding example
of coherent emission, separated from the incoherent sources filling the grey sector
in Fig. 11.11. This will be important for any attempt to find viable physical models
for these bursts.
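The brightness temperature mentioned above can be estimated directly (a Python sketch with assumed illustrative values for flux density, duration, and distance):

import math

# Order-of-magnitude brightness temperature of an FRB, taking the source
# size from causality, R ~ c * tau. All input values are assumed.
k_B = 1.38e-23    # Boltzmann constant [J/K]
c = 3.0e8         # speed of light [m/s]
S = 1.0e-26       # flux density, 1 Jy [W m^-2 Hz^-1]
nu = 1.4e9        # observing frequency [Hz]
tau = 1.0e-3      # pulse duration [s]
D = 3.1e25        # distance, ~1 Gpc [m]

omega = math.pi * (c * tau / D) ** 2            # source solid angle
T_b = S * c ** 2 / (2 * k_B * nu ** 2 * omega)  # Rayleigh-Jeans T_b
print(f"T_b ~ {T_b:.1e} K")                     # ~1e35 K

Values of order 10³⁵ K exceed any incoherent (synchrotron-like) limit by many orders of magnitude, which is why coherent emission is required.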
Important news for the field has been presented recently, with the detection of
repeating sources (of unknown origin) and also the identification of a galactic mag-
netar (Chap. 6) as the source of nearby FRBs [19]. Figure 11.12 is a simple diagram
to illustrate the recurrent behavior of FRBs, with the magnetar-associated events (ST
200428A marked in Fig. 11.11) included in the “repeaters” set. The direction was
coincident with the magnetar SGR 1935+2154 and the small DM indicated a nearby
galactic origin.
Physical models of FRBs are presently “in the works”. While it is tempting to
associate the extragalactic/cosmological sources with some kind of catastrophe, it
must be taken into account that no high-energy emission coincident with the radio
pulse has yet been detected. Therefore, it is unlikely that FRBs are the “tail” of
some outburst like the giant flare of a magnetar in Fig. 6.22. However, there are
some promising ideas, such as the Falcke–Rezzolla scenario [20], in which the
collapse of a short-lived supramassive neutron star produces a radio burst when the
magnetosphere "snaps" before disappearing behind the event horizon.
On the other hand, the association of a handful of bursts with a nearby magnetar
revived models in which a crustquake causes a sudden energy release.
Shaking of the field lines can launch Alfvén waves, which eventually radiate
high up in the magnetosphere through the curvature mechanism (similar to the
synchrotron expressions of Chap. 2 [21]). The detection of X-rays almost simultaneous
with the FRB pulses [22–24] holds an important clue to the generation mechanism
of the bursts, one that remains to be understood. In this sense, the FRBs are probably
related to the GRBs: the majority are cosmological, compact objects are involved
in their generation, and their historical development parallels that of the GRBs. They
will remain a hot topic for years to come.
References
1. R.W. Klebesadel, I.B. Strong, R.A. Olson, Observations of gamma-ray bursts of cosmic origin.
Astrophys. J. 182, L85 (1973). https://doi.org/10.1086/181225
2. W.S. Paciesas et al., The fourth BATSE gamma-ray burst catalog (revised). Astrophys. J. Supp.
122, 465 (1999)
3. C. Guidorzi, http://www.fe.infn.it/~guidorzi/doktorthese/node5.html (2003)
4. https://gammaray.nsstc.nasa.gov/batse/grb/skymap/
5. S.D. Barthelmy, Swift-BAT results on the prompt emission of short bursts. Phil. Trans. Roy.
Soc. A 365, 1281–1291 (2007)
6. R.J. Nemiroff, A century of gamma ray burst models, Comm. Astrophys. 17, 189 (1994).
Available at arXiv:astro-ph/9402012
7. R. Willingale, P. Mészáros, Gamma-ray bursts and fast transients. Space Sci. Rev. 207, 63–86
(2017). https://doi.org/10.1007/s11214-017-0366-4
8. A. Shahmoradi, R.J. Nemiroff, Short versus long gamma-ray bursts: a comprehensive study of
energetics and prompt gamma-ray correlations. MNRAS 451, 126 (2015)
9. L. Piro et al., Evidence for a late-time outburst of the X-ray afterglow of GB970508 from
BeppoSAX. Astron. Astrophys. 331, L41 (1998)
10. P. Mészáros, The fireball model of gamma-ray bursts. Prog. Theor. Phys. Supp. 143, 33 (2001)
11. D. Frail et al., Beaming in gamma-ray bursts: Evidence for a standard energy reservoir. Astro-
phys. J. Lett. 562, L55 (2001)
12. E. Troja et al., An achromatic break in the afterglow of the short GRB 140903A: evidence for
a narrow jet. Astrophys. J. 827, 102 (2016)
13. I. Horváth, A further study of the BATSE gamma-ray burst duration distribution. Astron.
Astrophys. 392, 791 (2002)
14. E. Waxman et al., Constraints on the ejecta of the GW170817 neutron star merger from its
electromagnetic emission. MNRAS 481, 3423 (2018)
15. R. Abbott et al., Properties and astrophysical implications of the 150 M⊙ binary black hole
merger GW190521. Astrophys. J. Lett. 900, L13 (2020)
16. D. Lorimer et al., A bright millisecond radio burst of extragalactic origin. Science 318, 777
(2007)
17. E. Petroff, J.W.T. Hessels, D. Lorimer, Fast radio bursts. Astron. Astrophys. Rev. 27, 4 (2019)
18. W. Farah et al., FRB microstructure revealed by the real-time detection of FRB170827. MNRAS
478, 1209 (2018)
19. C.D. Bochenek et al., A fast radio burst associated with a galactic magnetar. Nature 587, 59
(2020)
20. H. Falcke, L. Rezzolla, Fast radio bursts: the last sign of supramassive neutron stars. Astron.
Astrophys. 562, A137 (2014)
21. M. Longair, High-Energy Astrophysics (Cambridge University Press, Cambridge, 2011)
22. S. Mereghetti et al., INTEGRAL discovery of a burst with associated radio emission from the
magnetar SGR 1935+2154. Astrophys. J. Lett. 898, L29 (2020)
23. C.K. Li et al., HXMT identification of a non-thermal X-ray burst from SGR J1935+2154 and
with FRB 200428. Nature Astronomy 5, 378 (2021)
24. M. Tavani et al., An X-ray burst from a magnetar enlightening the mechanism of fast radio
bursts. Nature Astronomy 5, 401 (2021)
Chapter 12
Cosmic Rays
The history of the discovery of cosmic rays is fascinating in itself and corresponds
to a pioneering phase of Astrophysics in the early 20th century, when Relativity
and Quantum Mechanics changed our perspective of the physical world. In this
context the Austrian physicist Victor Hess was studying the radioactivity of elements
(then only recently discovered) and related problems, and decided to attempt direct
measurement of the degree of ionization in the high atmosphere. This problem was
intriguing because there was evidence in favor of an increase in ionization with height.
Hess refined an electroscope to measure ionization and took it in person to more than 3 km
altitude with the help of balloons (Fig. 12.1) between 1911 and 1912. His data confirmed
that ionization does increase with height, and showed that the effect depended little on
whether the measurement was made during the day or at night. The data obtained during
a solar eclipse were particularly important for attempts to characterize the phenomenon.
The origin and propagation of cosmic rays are addressed in this chapter, in particular
the so-called ultra-high energy range with all its associated puzzles and questions.
When this result was confirmed, it became evident that some source of ionization
was coming from outer space. Although the energy required to explain the measured
effect was low, the possibility of high-energy ionizing particles, even outside the
accessible range of the Hess electroscope, remained open. The bewilderment of
physicists at the time is evident if we consider the name they chose for them: cosmic
rays. Only later was it clarified that these “rays” were actually electrons, protons,
and nuclei, which justifies having spoken of “particles” in the previous sentences.
With the passing of time and considerable effort by many groups in various places
on the planet, the general form of the spectrum of these cosmic rays was eventually
detected and established. This spectrum is shown in Fig. 12.2. There are several
important features in it, some of which have been discussed and studied for over a
century [1].
Fig. 12.1 Victor Hess and collaborators in one of the balloon flights that led to the confirmation
that ionization increases with height in the atmosphere. Credit: V.F. Hess Society
Fig. 12.2 Spectrum of cosmic rays, from the lowest energies up to 10²⁰ eV, the highest energies
observed. The experiments that contributed to this diagram are identified with their error bars. The
number of particles per energy range dN/dE on the vertical axis is usually multiplied by E² to
improve visualization. Around 3 × 10¹⁵ eV there is a change in the slope known as the "knee", while
at 10¹⁸ eV a second change in the slope became known as the "ankle". Above this last energy the
flux is very small, around one particle/km² yr, and gigantic experiments are needed to accumulate
enough statistics. The center-of-mass energies of the Fermilab, CERN, and other experiments are
indicated by arrows, orders of magnitude lower than those produced by accelerators in nature.
Figure adapted from [2]. For an update, see [3]
One of the striking features of cosmic rays at the lowest energies is that the flux
is modulated by the solar wind, which can sweep away particles of cosmic origin,
thereby "protecting" Earth's biosphere when its intensity is greater. We could also say
that Earth's magnetic field is our protector: without it, life on the planet's surface
would be barely possible, since bombardment by the particles discovered by Hess would
be lethal to living beings if they struck directly. At the other extreme, the highest energy
regime is a real challenge. The detection of primaries with energies above 10²⁰ eV is
equivalent to putting the energy of a bullet into an elementary particle (!). So nature
has some mechanism for accelerating particles even up to these extreme energies.
The natural question is: where do these particles come from? An examination of
the spectrum shows a change in the spectral index α at the knee and ankle, which
suggests that the origin is different above and below the energies where the changes
occur. This idea has important support from the point of view of propagation, as we
will see below.
Let us consider the relativistic momentum p = γmv of a charged particle. As we
saw in Chap. 2, the gyroradius or Larmor radius is

r_L = p⊥/(|q|B) = γmv⊥/(ZeB) ,   (12.1)

where q = Ze is the electric charge and p⊥ the transverse component of the momentum.
Numerically, for the case of a proton (more than 90% of cosmic rays are protons),
we have

r_L = (pc/eV)/(300 (B/G)) cm ,   (12.2)

with the momentum expressed in eV and the field in gauss.
Using (12.2), we can calculate the maximum energies at which the Larmor radius
becomes equal to the scale of interest L. The particles can follow confined paths as long
as r_L < L. Starting with the heliosphere, with L = 100 AU and B ∼ 10 μG, we
have E_max ∼ 4 × 10¹² eV. This shows that the solar modulation happens to the left
of Fig. 12.2, at low energies, as claimed.
For the typical scale of the ISM, we have instead L = 100 pc and B ∼ 5 μG, and
therefore E_max ∼ 4 × 10¹⁷ eV. Particles with energies above this do not suffer substantial
deflection in the galaxy. At intermediate energies, we know that the "knee" occurs at
around 10¹⁵ eV, and below around 10¹⁴ eV observations show that the cosmic rays are
totally isotropic. The interpretation is that they are confined within the galaxy. Since
confinement is not possible above ∼10¹⁵ eV, it seems likely that above these primary
energies the particles come from outside the galaxy. The observed change in the α
index from 2.7 to 3 at the "knee" would be a consequence of this. It is also possible
that at 10¹⁴ eV it is the acceleration mechanism itself that changes, and that the
extragalactic component corresponds to cosmic rays arriving with E > 10¹⁸ eV.
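These confinement estimates are easy to reproduce (a Python sketch based on Eq. (12.2)):

# Maximum confined energy: r_L = L when E[eV] = 300 * Z * B[G] * L[cm]
# (ultra-relativistic particles, E = pc).
AU = 1.5e13    # astronomical unit [cm]
pc = 3.086e18  # parsec [cm]

def e_max_eV(B_gauss, L_cm, Z=1):
    """Energy at which the Larmor radius equals the system size L."""
    return 300.0 * Z * B_gauss * L_cm

print(f"heliosphere (100 AU, 10 uG): {e_max_eV(1e-5, 100 * AU):.1e} eV")
print(f"galactic ISM (100 pc, 5 uG): {e_max_eV(5e-6, 100 * pc):.1e} eV")

This returns ∼4 × 10¹² eV and ∼5 × 10¹⁷ eV respectively, in line with the values quoted above.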
We can quantify this change in the spectral slope a little more closely if we
assume that the primaries are all protons and that they diffuse from place to place in
the galaxy. In a diffusion process of this type, there is a diffusion time—similar to the
one defined for the photons in (4.25)—given by the expression t_esc = R²/D, where
D is the diffusion coefficient, which depends on the energy of the particles. A simple
but reasonably accurate parametrization is D ∝ (E/1 GeV)^(1/2) (the coefficient must
grow with energy, so that more energetic particles escape faster). Assuming a
distance scale R ≈ 1 kpc, suitable for keeping protons within the galactic disc,
leads immediately to

t_esc = 4 × 10⁷ (E/1 GeV)^(−1/2) yr .   (12.3)
Thus, if we consider that the sources inject a spectrum dṄ/d ln E|_inj, the observed
spectrum in the diffusive regime will be different, given approximately by

dN/d ln E|_obs ≈ t_esc dṄ/d ln E|_inj ,   (12.4)
which results in a prediction that the spectrum injected (accelerated) by the sources has
dṄ/d ln E|_inj ∝ E^(−2.2) for protons of energy around 1 GeV. We will soon see that
acceleration by the Fermi mechanism gives a very similar value for this exponent.
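The slope arithmetic is worth making explicit (a one-line check in Python):

# Observed spectrum: dN/dE ~ E**-2.7 ; escape time: t_esc ~ E**-0.5.
# Since (spectrum)_obs ~ t_esc * (spectrum)_inj, the injected slope is
obs_index = 2.7                  # observed differential spectral index
esc_index = 0.5                  # energy dependence of the escape time
print(f"injected index: {obs_index - esc_index}")   # -> 2.2

in agreement with the E^(−2.2) injection spectrum quoted above.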
A more detailed comparison between the abundances measured in the solar vicin-
ity and those of the cosmic rays can be appreciated in Fig. 12.3. We see that some
elements in the cosmic rays (open circles, full line) are much more abundant than in
the Solar System (filled circles, dashed line). In particular, lithium, beryllium, boron,
scandium, and vanadium have abundances several orders of magnitude greater in
cosmic rays. The consensus explanation for this difference is that there is a production
process, known as spallation, in which the primary protons collide with very abundant
nuclei (such as carbon and oxygen), fragmenting the latter and producing the overabundant
elements. Thus, lithium, beryllium, and boron are carbon and oxygen "chips",
and the heavier ones like scandium and vanadium are “chips” of iron, which is also
abundant in the solar neighborhood.
Consider now the problem of the acceleration of primaries (see, for example,
[4]). Certainly, the simplest way to accelerate a charged particle is to submit it
to electromagnetic forces. We know in a very general way that the equation of
motion of a charged particle in an electromagnetic field is (2.22), viz., d(γ mv)/dt =
qE + q(v × B). The first term on the right can lead to direct acceleration as long as
E ≠ 0, but this is difficult to arrange in Astrophysics, since a separation of charges
producing an electric field never lasts long: it is quickly neutralized by electrons and
positrons from the environment. The exception
to this is the process of magnetic reconnection, where force lines change their config-
uration and generate very strong local electric fields. The other way of accelerating
particles is to achieve cumulative gains of energy by means of random collisions in
some medium that has the capacity to support this energy transfer to the particles.
In 1949 Enrico Fermi worked on the problem of energy gains from a parti-
cle that collides with low velocity clouds, as happens in the interstellar environ-
ment (Fig. 12.4). Fermi showed that, after a long time, a particle that bounces off
the discontinuities shown in Fig. 12.4 gains an average energy
ΔE/E = (8/3)(v/c)² per collision. Because the gain is proportional to the small quantity
(v/c)², this result became known as the second order Fermi mechanism. Besides the
low efficiency, the energy losses were not taken into account. Thus, Fermi considered
the possibility of some process where the particle gained energy in each collision,
and preferably more efficiently.
The idea of this process is that a particle gains energy by going back and forth
through a shock discontinuity, as illustrated in Fig. 12.5. In a very general way,
if Ē = βE₀ is the average energy after each passage, and P the probability that
the particle remains in the region of acceleration, then after k passages there will be
N = N₀P^k particles with energy E = E₀β^k. It is therefore possible to eliminate k
from these relationships by the manipulation

ln(N/N₀)/ln(E/E₀) = ln P/ln β  ⟹  N/N₀ = (E/E₀)^(ln P/ln β) ,   (12.5)

which in differential form reads

N(E) dE ∝ E^(−1 + ln P/ln β) dE .   (12.6)
Fig. 12.4 Left: A particle with the continuous path shown collides successively with clouds that
have random speeds. Right: Particles are reflected in the mid-cloud interface and gain energy when
the collision happens in situation (a) (forward), but lose in situation (b) (backward), depending on
the relative direction between the particle’s velocity and that of the cloud
Fig. 12.5 A shock with velocity U seen in (a) the observer’s system, (b) the system moving with
the shock, (c) the upstream reference system (where matter has not yet been hit by it), and (d) the
downstream system. The particle goes back and forth, and each time sees the plasma come down
on it with velocity v = 3U/4
For each process we must calculate the energy gain per cycle to obtain P and β, and
thereby determine the shape of the injection spectrum of this process.
In the Fermi process where the particle passes back and forth through the shock, in
the downstream reference frame the plasma approaches with velocity v, and the energy
of the particle is

E_down = γ(E + p_x v) .   (12.7)

Writing p_x = (E/c) cos θ for a relativistic particle crossing at an angle θ, this becomes,
to first order in v/c,

E_down ≈ E [1 + (v/c) cos θ] ,   (12.8)

and therefore

ΔE/E = (v/c) cos θ .   (12.9)
Now, as can be seen in Fig. 12.6, the probability that a particle strikes at an angle
between θ and θ + dθ is proportional to sin θ dθ . Its approach speed is vx = c cos θ .
The probability distribution of the angles is then p(θ) = 2 sin θ cos θ dθ. Using this,
we can calculate the angular mean of the gain in (12.9), so as to take into account a very
large set of particles, as we would expect in real acceleration events. This average is

⟨ΔE/E⟩ = (v/c) ∫₀^(π/2) 2 cos²θ sin θ dθ = (2/3)(v/c) ,   (12.10)
interpreted as the average gain of any particle passing through the shock once. As it
does not matter whether the particle went forward or backward, this gain is valid for
both cases, and a round-trip cycle changes the initial energy by

β ≡ Ē/E₀ = 1 + 4v/(3c) .   (12.11)
We see that the gain is linear in the speed v, whence it was referred to as the first
order Fermi mechanism, a much more efficient process than the second order process
mentioned above.
We are now in a position to connect these results with the expected power law for
this mechanism given in (12.6). For this purpose, we have immediately that

ln P = ln(1 − U/c) ≈ −U/c ,   (12.12)

ln β = ln(1 + 4v/(3c)) ≈ 4v/(3c) = U/c ,   (12.13)

using v = 3U/4 from Fig. 12.5, so that ln P/ln β = −1 and (12.6) gives

N(E) dE ∝ E⁻² dE .   (12.14)
This calculation is still rather heuristic. To obtain the “correct” spectrum we would
need to include the losses, the magnetic field of the environment, and so on, for the
primaries represented by the variable N(E). The complete diffusion equation for this
problem would be something like [1]

dN/dt = D∇²N + ∂[b(E)N]/∂E − N/t_esc + Q(E) ,   (12.15)

with D∇²N the diffusive term (previously ignored), b(E) = −dE/dt the energy
losses (also ignored here), t_esc the escape time of (12.3), and Q(E) the function that
describes the sources of the primaries. There are interesting and illustrative solutions
of (12.15) in the literature, but they are outside the scope of this introduction (although
they delight mathematical Physics enthusiasts).
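Before adding those complications, the heuristic result (12.14) can be checked with a minimal Monte Carlo (a Python sketch; the shock speed and particle number are arbitrary choices):

import numpy as np

# First-order Fermi toy model: per round trip, the energy grows by
# beta = 1 + U/c (since 4v/3c = U/c) and the particle stays in the
# acceleration region with probability P = 1 - U/c.
rng = np.random.default_rng(1)
U_c = 0.01                       # shock speed in units of c
beta = 1.0 + U_c                 # energy gain per round trip
P = 1.0 - U_c                    # survival probability per round trip

# completed cycles per particle: geometric law, Prob(k) = P**k * (1 - P)
k = rng.geometric(1.0 - P, size=200_000) - 1
E = beta ** k                    # final energies in units of E0

# integral spectrum N(>E) ~ E**(1 - s) recovers the differential index s
E_grid = np.logspace(0.1, 1.5, 10)
N_gt = np.array([(E > x).sum() for x in E_grid])
slope = np.polyfit(np.log(E_grid), np.log(N_gt), 1)[0]
print(f"differential index s = {1 - slope:.2f}  (expected ~2)")

The recovered index is ≈ 2, as predicted by (12.14).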
Analysis of the first order Fermi mechanism suggests that sufficiently strong and
extensive shocks (such that the Larmor radius of the particles does not exceed the
shock radius, thus ensuring a high number of passages that energize the particles)
might be candidates to accelerate the primaries. In addition, shocks should be com-
mon to explain the observed energy density of primaries, viz., around 1 eV/cm³.
Supernova remnants meet all the conditions, and the way to check that they are
places where particles may be accelerated is very interesting: the acceleration of
electrons is easier (although subject to losses by synchrotron radiation, etc.), but
accelerated protons would produce gamma-rays by means of the reaction

p_cr + p_amb → X + π⁰ ,  π⁰ → γ + γ ,   (12.16)

where p_cr is a proton accelerated in the remnant and p_amb a "fixed" proton in the
ambient medium, X any hadronic species in which we have no interest, and the neutral
pion π⁰ inevitably decays into gammas. Thus, the acceleration signature would be
gamma emission at around 100 TeV. Sometimes this expected emission is referred to
as the pion bump. The study of supernova remnants to identify non-thermal emission
that would indicate proton acceleration and the reaction in (12.16) was an exer-
cise that took many years, but recently it was unequivocally confirmed that IC 443,
W28, Kepler, and other remnants do indeed have regions where this emission exists
(Fig. 12.7) [5]. The maximum confirmed energy is of the order of 10¹⁵ eV [6], compat-
ible with the prediction that the remnants are mainly responsible for the primaries
with energies lower than those of the "knee". There is also evidence that the shock
advancing through the ISM spends energy precisely in accelerating particles,
that is, the energy budget of the remnant must include these accelerated particles.
The composition of the primaries has been mentioned briefly before. We have
already stated that almost 90% of those with energies below the knee are protons, 10%
Fig. 12.8 Measurements of cosmic ray events in intermediate energies up to about 30 MeV over a
period of more than a year. The increase in the number of events is clear. Credit: H. Takai
are α-particles (helium nuclei), and the rest are decreasing proportions of heavier
nuclei. Above the “knee” things are more complicated, since events are much rarer,
there are large fluctuations from event to event, and even hadronic interactions are
uncertain, especially in the higher energies above the “ankle”. We will return to this
subject in due time.
Let us finish this discussion with two interesting subjects that still have no defini-
tive answers. The first is the relationship between the cosmic ray primaries and the
biosphere. Although sunspots were regularly seen at the solar surface in the past,
there was a period until recently (2016) when the Sun was not very active, possibly
close to a minimum of the activity cycle that repeats every 11 yr or so. Solar flares
frequently "sweep" the primaries away, but close to the minimum they are rarer, so
the number of primaries increases. One of the measurements that shows the increase
in the number of events, made using stratospheric balloons in California (USA), is
shown in Fig. 12.8.
The other possibility, rather more frightening, is that we are witnessing the inver-
sion of polarity of the Earth’s magnetic field, a much rarer process that happens every
(2–3) × 105 yr. The poles of the Earth exchange their positions and, in the middle
of this process, the intensity of the magnetic field goes through a minimum. Thus,
the flow of cosmic ray primaries increases, because our “protection” against these
particles decreases. It should be pointed out that this has happened countless times
in the Earth’s geological history and, as we can see, the intensity of the magnetic
field has never vanished half way through. But it is worth studying this possibility
which, if happens, is irreversible and will mark a new situation for the biosphere we
inhabit and for many millennia.
Although we have spoken of the primaries as ordinary particles, there are some
very interesting events that suggest that there are some “exotic” ones in their midst
that would be worth knowing about. Some of these very different events, which are
therefore candidates for having been initiated by exotic primaries, were detected by
a Brazil–Japan collaboration in a series of experiments conducted on Mount
Chacaltaya, Bolivia, 5000 m above sea level [7]. At these altitudes the rarefaction of
Fig. 12.9 Left: The Chacaltaya laboratory in Bolivia. During the winter, the snow made it impossible
to collect data in the emulsion chambers. Credit: Francesco Zaratti, Atmospheric Physics Laboratory,
La Paz, Bolivia. Right: The experiment and a Centauro event, so called because it looks like one thing
in the upper chamber and another in the lower. The most remarkable thing is that no photons were
detected, so a “normal” electromagnetic cascade had to be discarded. Figure adapted from [7, Fig.38]
the atmosphere—expressed by the low column density, or mass per unit area along
the line of sight—allows the study of primaries, mainly hadrons, that interact in the
upper atmosphere. The experiment is shown schematically in Fig. 12.9 along with a
photo of the setting.
The experiment was composed of two emulsion chambers 158 cm apart. The
emulsions were examined with a microscope to determine what kind of particle
had passed through them, with the so-called hadronic cascade as the main result. In
some events, the expected production seems very different in the upper and lower
chambers, so these were called Centauro events. In the lower chamber, the primary
of the Centauro events did not show the expected range of particle production, and
in addition the detected hadrons displayed a huge transverse momentum. The detection
threshold of the primaries was of order 1000 TeV. As there is plenty of evidence
that an ordinary hadron initiates an electromagnetic cascade (with abundant gamma
photons) and a much more “concentrated” hadronic cascade, the consensus is that
Centauro primaries are exotic, i.e., objects that do not correspond to any laboratory
observations, or rather that are produced at very high energies in certain collisions.
It should be stressed that the last possibility has never been confirmed: no Centauro
event has ever been observed in any collision experiment carried out in the laboratory.
The most plausible possibility, yet to be confirmed, is that the primary is a small fragment
of quarks and gluons in the deconfined state, in which case there should be active
astrophysical sources of these primaries. For the time being, the nature of these events
must be considered unknown.
By general convention, all events above the "ankle" of Fig. 12.2 are referred to as
ultra-high energy cosmic rays (UHECR). The question of the highest possible energy
for primaries is certainly one of the most important, since it brings with it a number
of implications for the acceleration mechanisms and also for the propagation of the
particles, as we will see below. Overall, particles that arrive with energies above
10¹⁸ eV pose a problem of the greatest complexity and the utmost importance.
Primaries at these energies are characterized by a short interaction length in the
atmosphere, that is, their interaction with atmospheric nuclei takes place right away
in the upper atmosphere, giving rise to two different components. The first is an
electromagnetic cascade (also called an air shower), where the primaries produce a
series of secondary particles, which in turn produce a third generation, and so on,
until the energy in each particle is no longer sufficient to continue this process. The
other component is a hadronic cascade, in a similar process but where the “daughter”
particles are hadrons produced by strong interactions. The geometric and energetic
development of these cascades can be studied to reconstruct the nature of the primary,
its direction of arrival, and its energy, and can be understood using a simple model
that we will present below.
The model, due to Heitler [1], assumes a simple type of decay in pairs for each
level and provides a rather illustrative analytical treatment. The basic diagrams are
presented in Fig. 12.11. For the electromagnetic cascade, each time the particles
travel a column length X_EM ≈ 37.6 g cm⁻², a pair of particles is produced by
bremsstrahlung or pair production. This bifurcation continues until the daughter
particles reach a minimum energy E_min ≈ 86 MeV, after which they only lose energy
without producing new pairs. Thus, after n = X/X_EM bifurcations, the number of
particles in the shower is N = 2ⁿ, since it is a geometric series by hypothesis. At the
position X_max, all particles reach the minimum energy E_min, and the energy E₀ brought
by the primary is distributed among its descendants, satisfying N_max = E₀/E_min. Thus,
X_max ≈ X_EM ln(E₀/E_min)/ln 2 .   (12.17)
Fig. 12.10 Left: An ultra-high energy primary interacts with a nucleus in the upper atmosphere,
producing air showers. Center: Spatial development of the shower, which reaches maximum particle
production for a certain value X max . Right: Measurements of events (points) and comparison with a
set of detailed simulations (much more accurate than the Heitler model, in the blue band), where the
primary is supposed to be an Fe nucleus, showing the disagreement for this specific case. Figures
from [8]
Fig. 12.11 Heitler cascade. At each level of interaction, a particle splits into two, yielding a pair
produced by electromagnetic decay (left) or two hadronic particles (right) [9, Fig. 2] CC BY 4.0
This expression can be used to obtain E 0 by measuring X max . As it is clear that the
primary must be some kind of hadron, we see that the electromagnetic shower still
dominates the energy balance by far, since the incident hadron produces many π0 that
give rise to gamma photons. Around 90% of the energy goes to the electromagnetic
cascade and 10% to the hadronic cascade, where the same reasoning can be applied
with the result
X_max ≈ X₀ + X_EM ln[E₀/n(E)] .   (12.18)
Here, n(E) is the average number of secondary particles and each of them carries
a fraction E/n(E) of the energy. An example of the comparison between (12.18)
and an actual event has already been shown in Fig. 12.10 (right panel). Of course, the
simulations actually used are much more complex and include many important effects
and corrections, but the Heitler model serves to show the essence of this procedure.
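The Heitler estimates are easy to evaluate (a Python sketch using the numbers quoted above):

import math

# Heitler toy model of an electromagnetic cascade, Eq. (12.17):
# one splitting every X_EM until particles reach the critical energy E_min.
X_EM = 37.6       # splitting length [g cm^-2]
E_min = 86.0e6    # critical energy [eV]

def heitler(E0_eV):
    """Return (N_max, X_max) for a primary of energy E0_eV."""
    N_max = E0_eV / E_min
    X_max = X_EM * math.log(E0_eV / E_min) / math.log(2.0)
    return N_max, X_max

for E0 in (1e15, 1e18, 1e20):     # primary energies [eV]
    N, X = heitler(E0)
    print(f"E0 = {E0:.0e} eV:  N_max ~ {N:.1e},  X_max ~ {X:.0f} g/cm^2")

A 10²⁰ eV primary yields ∼10¹² particles at shower maximum, which is what makes ground arrays with huge collecting areas a viable detection strategy.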
Having a better appreciation of the showers produced by the ultra-high energy
primaries, we can move on to the question of their actual detection. We have already
seen that their flux is extremely low—around 1 particle/km² yr—so from this fact
alone we know that experiments must have an enormous effective detection area, of
the order of hundreds of km². To measure the electromagnetic and hadronic showers,
we must be able to determine the trajectories of secondary particles accurately. The
contemporary paradigm for this type of experiment is the Pierre Auger Observatory
near the Andes in Argentina (Fig. 12.12). Another important facility is the Telescope
Array Project in Utah (USA), an observatory that uses over 500 distributed plastic
scintillators and fluorescence detectors with the same purpose of investigating ultra-
high energy primaries [10]. We will discuss the specific case of the Pierre Auger
Observatory in what remains of this Chapter.
The Auger laboratory operates a very large number of surface detectors and fluo-
rescence detectors spread over an area of about 50 × 50 km². The surface detectors
are plastic tanks, each containing around 12 ton of pure water, together with the
relevant electronics and communications antenna to time the events. The tanks are
used for detection of Čerenkov radiation (Chap. 2) from the passage of muons in
the electromagnetic cascade. These are called the “penetrating component” because
Fig. 12.12 Left: Area covered by the Pierre Auger Observatory, near the city of Malargüe in
Argentina. Right: One of the plastic water tanks that constitute the array of surface detectors.
Some guests in the image will later contribute to the fame of the Argentine barbecue [8]. Credit:
Pierre Auger Observatory
of the low cross-section of the muons, which allows them to reach ground level.
Detection operates all the time, with a 100% duty cycle. Fluorescence detectors, on
the other hand, observe the tenuous radiation produced by the deposition of energy
by showers in the air, with the subsequent de-excitation and production of photons.
The device is shown schematically in the form of an optical collector in Fig. 12.13.
As it needs good conditions to operate, its duty cycle is just 13% of the total time,
corresponding to moonless nights.
These two types of detectors are complementary and their joint measurements
provide precise information about events. While surface detectors measure lateral
development—i.e., the data set of an event in several detectors allows us to see
the cascade developing—relative arrivals in each tank are used to reconstruct the
direction of arrival, and finally the energy is determined using a calibrated relationship
between X max and the signal at around 1000 m from the extrapolated center of arrival
of the primary (there is no substantial error in disregarding greater distances because
the energy carried there is too small). However, the fluorescence detectors, active only
Fig. 12.14 Signal detected by the surface detectors (left), which register the shower extension
(upper right) and the fluorescence detectors (right), showing the development that reaches the
maximum multiplicity X max . The actual signal is shown in the upper right corner [8]. Credit:
Pierre Auger Observatory
when there is no Moon and the sky is dark enough for the faint radiation produced far from
them to be seen, record the sequential development of the shower and allow an alternative
energy measurement by integrating over the shower profile, i.e., a calorimetric measurement. The recorded
signal is shown in Fig. 12.14 for both cases.
Construction of the Pierre Auger Observatory was completed in 2008, and it has
been accumulating events with energies E > 1 EeV = 10¹⁸ eV, up to the highest
observed values of around 75 EeV or more. In fact, one of
the biggest enigmas in this area was precisely the existence of this maximum energy.
Empirically, the data exhibit a sharp drop in the number of events for E > 60 EeV,
visible in Fig. 12.15 (see, for example, [11]). This observation is in line with the
Greisen–Zatsepin–Kuzmin (GZK) prediction that protons above a certain threshold
energy lose energy by photo-producing pions on the cosmic microwave background,
through the reactions
Fig. 12.15 Spectrum of primaries observed at the highest energies. The sudden drop in the number
of events above around 6 × 10¹⁹ eV, interpreted as the GZK cutoff, is clearly visible (the edge is
marked with a blue star) [11]. Credit: K.-H. Kampert
p + γ_CMB → Δ⁺ → p + π⁰ ,   (12.19)
p + γ_CMB → Δ⁺ → n + π⁺ .   (12.20)
Thus, the maximum distance from which a proton can arrive with an energy of 60 EeV is
less than 50 Mpc, regardless of its injection energy. This simple estimate does not even
take into account the multiple production of pions. The radius R_GZK = 50 Mpc—known
as the GZK sphere—signals the maximum distance from which such primaries can be
injected. The suppression detected at the GZK cutoff (Fig. 12.15) has a statistical
significance of 20σ. However, an alternative explanation is also possible: the cutoff
could be unrelated to the GZK sphere and instead be due to the acceleration mechanism
reaching its maximum energy. There is as yet no definitive exploration of this last
hypothesis (see [12] and references therein).
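The energy scale of the cutoff follows from simple relativistic kinematics (a Python sketch; the CMB photon energy is a typical value, ≈ 2.7 k_B T at T = 2.73 K):

# Threshold for p + gamma_CMB -> Delta+ (head-on collision):
# E_th = [(m_p + m_pi)**2 - m_p**2] * c^4 / (4 * eps_gamma)
m_p = 938.3e6     # proton rest energy [eV]
m_pi = 135.0e6    # neutral pion rest energy [eV]
eps = 6.3e-4      # typical CMB photon energy [eV]

E_th = ((m_p + m_pi) ** 2 - m_p ** 2) / (4.0 * eps)
print(f"E_th ~ {E_th:.1e} eV")   # ~1e20 eV

Protons above ∼10²⁰ eV therefore see the CMB as a sea of targets for photo-pion production, and the effective spectrum is cut off slightly below this threshold.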
The existence of a small number of events with energies above the GZK cutoff also
appears robust. These events should either (a) originate at smaller distances, or (b)
be produced by primaries that are not hadrons, so that they can come from beyond the
GZK sphere and avoid the photo-production of pions. Under either hypothesis, we
still have the problem of identifying the sources of these events, both below and
beyond the observed cutoff.
The source of acceleration for these extreme energies is not at all obvious. We
have already seen that a sufficient electric field would be difficult to obtain, due
to the tendency of the charges to neutralize potential differences—the direct
mechanism is called unipolar induction in the jargon. But the alternative, a first-order
Fermi mechanism, also needs an important condition to be satisfied in order to work
in this regime: the source must have a size L larger than the Larmor radius r_L, in
such a way that the primaries do not escape easily and can reach energies of order
10²⁰ eV. As the particles are ultra-relativistic, E_max = p_max c, and this condition
L ≥ r_L = p_max/(ZeB) can be written as

E_max = Zβ (ec) BL ,   (12.21)

where the product ec is equal to unity in natural units and β < 1 is the acceleration
efficiency when taking into account losses through synchrotron radiation, and so
on. Using this expression, one can ask which real systems can produce protons or
nuclei of a given energy. The result is known as the Hillas diagram and is shown in
Fig. 12.16.
From the Hillas diagram, we see that the only viable sources in our galaxy, and yet
still subject to the efficiency problem, are neutron stars possessing extreme magnetic
fields (the magnetars of Chap. 6). It is tempting to conclude that the ultra-high energy
regime is due to extragalactic primaries, as anticipated and discussed for nearly a
century.
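Equation (12.21) is easily turned into numbers (a Python sketch; the (B, L) pairs are rough, assumed orders of magnitude of the kind that appear in the Hillas diagram):

# Hillas criterion: E_max[eV] ~ 300 * Z * beta * B[G] * L[cm].
def hillas_eV(B_gauss, L_cm, Z=1, beta=1.0):
    """Maximum energy for confinement-limited acceleration
    (beta = 1 is the optimistic limit with no losses)."""
    return 300.0 * Z * beta * B_gauss * L_cm

Mpc = 3.086e24   # [cm]
candidates = {
    "magnetar (1e15 G, 10 km)":          (1e15, 1e6),
    "AGN radio lobes (1e-6 G, 0.1 Mpc)": (1e-6, 0.1 * Mpc),
    "galaxy cluster (1e-6 G, 1 Mpc)":    (1e-6, 1.0 * Mpc),
}
for name, (B, L) in candidates.items():
    print(f"{name}: E_max ~ {hillas_eV(B, L):.1e} eV")

Only sources lying above the E = 10²⁰ eV line of the diagram remain candidates once realistic efficiencies β < 1 are applied.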
The strongest candidates for the accelerating sources are the AGNs of
Chap. 8, and there is a very close one, namely the radiogalaxy Cen A
at a distance of about 4 Mpc. However, it has not been possible to confirm that the arrival
directions of the higher-energy primaries point statistically to the AGNs (the Auger
collaboration announced such a correlation, but this result was not subsequently
confirmed). The Cen A lobes are about 60 kpc in size (see Fig. 8.3) and it would not
be surprising if it were able to contribute a good fraction of the detected primaries
(see below). More recently, the Auger collaboration published a paper [14] in which
they claim to detect an anisotropy in arrival directions consistent with extragalactic
gamma sources known as starburst galaxies (many of them in collision, with their
position indicated in Fig. 12.16).
Regardless of these considerations, one should question the analysis used to reconstruct
events, which includes numerous theoretical ingredients, among them proton–nucleus
and nucleus–nucleus interactions at energies much higher than any ever
reached in laboratories. Figure 12.17 shows that, with these interactions, the number
of muons in cascades is greatly underestimated by the predictions, a result
confirmed by the simulation of the more inclined showers, which are totally dominated
by muons. The actual number produced is more than 20% higher than that
12.1 Messengers from the Greatest Accelerators in the Universe: Cosmic Rays 253
Fig. 12.17 Comparison between simulated proton-initiated (red dots) and iron-initiated (blue dots)
showers for low (left) and near-horizontal (right) inclinations. The events show an evident lack of
agreement. From [16]
indicated by the predictions, and suggests the need for a (difficult) revision of the
interactions in the extreme regime.
The Auger data also allow a direct investigation of the question of composition
through X_max and associated quantities. The basic question is: are the primaries
protons or nuclei, and is it possible to determine their nature? Due to the existence
of large fluctuations from event to event, it is most suitable to do statistics with
a large set of events for each energy bin, and then compare with the simulations
(subject to the caveats about the interactions that affect the results, as we have
just pointed out). Figure 12.18 is the current answer to this question: although
close to 1 EeV the composition is compatible with protons, at higher energies the
primaries seem to be heavier, viz., α particles and, at the highest energies, nuclei
close to Fe. This is rather surprising and needs to be clarified if we are to progress
in this area, since, as we have seen from the GZK sphere argument, nuclear primaries
must be "brought" from closer distances, even though the candidate sources are more
numerous according to the Hillas diagram.

Fig. 12.18 Transition from a "light" composition (protons) to a "heavy" composition (nuclei) at
higher energies, consistent with the simulated data in both the X_max variable (left) and the lateral
development (right). From [16]
The last major issue we will address here, which has consequences for the
identification of sources, is the question of making "images" using cosmic rays,
in the sense of knowing whether the primaries point back to the sources that
accelerated them. This is what has allowed the development of optical astronomy,
where the neutral photons suffer no deviation by the magnetic fields in the
intergalactic environment and in the halos of the galaxies. However, we know that
the primaries need to be electrically charged (otherwise they could not be
accelerated, at least by the mechanisms discussed here). These magnetic fields are
uncertain, but polarization data of extragalactic objects show their existence. The
intensity is of the order of 10^−9 G for the intergalactic medium, and up to 10^−6 G
for galactic halos (including our own). The geometry, however, is much more difficult
to evaluate. It is precisely for this reason that the arrival of ultra-high energy
primaries is an interesting tool, since it has the potential to reveal deviations in
the trajectories that "distort the images" of the sky and the sources.
Figure 12.19 shows the paths of the primaries for various energy values, propagating
in an intergalactic environment with field intensity B_IGM = 10^−9 G and a correlation
length of 1 Mpc. We see that the deviations decrease greatly over the range
10–100 EeV, where the paths become almost straight. Thus, at the highest energies,
the primaries would point back to the sources. An earlier study had already shown
this effect [15], especially if the halo field of our galaxy is ignored. But under
any hypothesis, a deviation of 1–2° will remain in the "images" of the sources, which
are therefore imperfect and in fact slightly distorted.
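A commonly quoted random-walk estimate of this deflection (the ≈0.8° normalization is an approximate figure adopted here for illustration, not a number from this book) is θ ≈ 0.8° Z (E/10^20 eV)^−1 (B/10^−9 G) (D/10 Mpc)^{1/2} (λ_c/1 Mpc)^{1/2}. A minimal sketch reproducing the trend of Fig. 12.19:

```python
def deflection_deg(E_eV, D_Mpc, B_nG=1.0, coherence_Mpc=1.0, Z=1):
    """Random-walk deflection of a charged primary in a turbulent field.
    Approximate ~0.8 deg normalization assumed for illustration."""
    return (0.8 * Z * (1e20 / E_eV) * B_nG
            * (D_Mpc / 10.0) ** 0.5 * coherence_Mpc ** 0.5)

for E in (1e19, 6e19, 1e20):
    print(f"E = {E:.0e} eV, D = 50 Mpc: theta ~ {deflection_deg(E, 50):.1f} deg")
```

At 10 EeV the deflection is tens of degrees, while near 10^20 eV it falls to the 1–2° level quoted above.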
With the operation of the Auger Observatory and the resulting accumulation of data,
it is now a consensus that the observed dipolar anisotropy establishes an
extragalactic origin for the particles in the energy range above 8 × 10^18 eV [16],
since the dipole direction deviates by 125° from the direction of the galactic
center. Thus we have another important piece of the general puzzle.

Fig. 12.19 Left: trajectories of ultra-high-energy primaries for increasing energies and a fixed
intergalactic magnetic field of 10^−9 G. Right: angular deviation for the same field, adding
10^−6 G in the halo (upper curve) and a demagnetized halo (lower curve), for an extragalactic source
at D = 50 Mpc [15]
We have not discussed here other interesting issues, such as the possibility that
the primaries are exotic neutral particles of cosmological origin. But it is clear that
Nature has produced a "natural laboratory" that no terrestrial accelerator experiment
could ever rival, thus providing a unique opportunity to access the world of ultra-
high energies. Indeed, it has already given results and promises to be of the utmost
importance in helping us to understand the functioning of the Universe.
References
1. T.K. Gaisser, Cosmic Rays and Particle Physics (Cambridge University Press, Cambridge, UK,
1991)
2. https://masterclass.icecube.wisc.edu/en/analyses/cosmic-ray-energy-spectrum
3. F. Schröder, News from Cosmic Ray Air Showers, ICRC 2019—Cosmic Ray Indirect Rapport.
https://arxiv.org/abs/1910.03721
4. M. Bustamante et al., CERN Latin-American School on High Energy Physics (2009). http://cern.ch/PhysicSchool/LatAmSchool/2009/Presentations/pDG1.pdf
5. N. Tsuji et al., Systematic study of acceleration efficiency in young supernova remnants with
nonthermal X-ray observations. Astrophys. J. 907, 117 (2021)
6. M. Longair, High-Energy Astrophysics (Cambridge University Press, Cambridge, 2011)
7. C.M. Lattes, Y. Fujimoto, S. Hasegawa, Hadronic interactions of high energy cosmic-ray observed by emulsion chambers. Phys. Rep. 65, 151 (1980)
8. Home page of the Pierre Auger Observatory. https://www.auger.org/
9. P. Abreu, S. Andringa, F. Diogo, and M.C. Espírito Santo, Questions and Answers in Extreme
Energy Cosmic Rays – a guide to explore the data set of the Pierre Auger Observatory, Nuclear
and Particle Physics Proceedings 273–275, 1271–1275 (2016)
10. Home page of the Telescope Array collaboration. www.telescopearray.org/
11. K.-H. Kampert, Proceedings of the 7th International Workshop on Very High Energy Particle Astronomy in 2014 (VHEPA2014), JPS Conf. Proc. 15, 011004 (2017). https://journals.jps.jp/doi/pdf/10.7566/JPSCP.15.011004
12. D. Harari, Ultra-high energy cosmic rays. Phys. Dark Univ. 4, 23 (2014)
13. P. Bhattacharjee, G. Sigl, Origin and propagation of extremely high energy cosmic rays. Phys. Rep. 327, 109 (2000)
14. A. Aab et al., (Pierre Auger Collaboration), An indication of anisotropy in arrival directions of
ultra-high-energy cosmic rays through comparison to the flux pattern of extragalactic gamma-
ray sources. Astrophys. J. Lett. 853, L29 (2018)
15. G.A. Medina-Tanco, E.M. de Gouveia Dal Pino, and J.E. Horvath, Non-diffusive propagation
of ultra high energy cosmic rays. Astropart. Phys. 6, 337 (1997)
16. D. Gora, The Pierre Auger Observatory: Review of latest results and perspectives. Universe 4,
128 (2018)
Problems
Selected Problems
The solution of problems is an integral part of the study of any modern science
topic. The problems below are a minimal set to encourage understanding and allow
the student to gain confidence in the many subjects described in the book. The answers
are not given, because slightly different answers will be possible at different levels of
approximation and depending on the methodological approach, an intrinsic feature
of scientific work that the student should learn to live with. However, in some cases
a hint is provided.
1) Taking k_B = c = ℏ = 1, convert the following quantities to the natural unit system
in powers of MeV:
a) T = 8000 K
b) ρ = 2.7 × 1014 g cm−3
c) m = 10 kg
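As a worked illustration of the conversion machinery (a sketch only; each item reduces to multiplying or dividing by the appropriate powers of k_B, ℏc, and c² until only powers of MeV remain):

```python
# Natural-unit conversions (k_B = hbar = c = 1): express quantities in MeV powers.
K_B   = 8.617e-11    # MeV / K
HBARC = 197.327e-13  # MeV * cm  (hbar * c)
G_MEV = 5.61e26      # MeV per gram (from E = m c^2)

# a) temperature: T -> k_B T, in MeV
T = 8000.0  # K
print(f"a) T = {T} K -> {K_B * T:.2e} MeV")

# b) mass density: rho [g/cm^3] -> MeV^4, via g -> MeV and cm^-3 -> MeV^3
rho = 2.7e14  # g/cm^3
print(f"b) rho -> {rho * G_MEV * HBARC**3:.2e} MeV^4")

# c) mass: m [g] -> MeV
m = 10e3  # 10 kg in grams
print(f"c) m -> {m * G_MEV:.2e} MeV")
```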
2) Which conservation law(s) are violated by the reaction n → p + e− ?
a) energy
b) linear momentum
c) angular momentum
d) lepton number
e) electric charge
f) baryon number
Given that the relevant masses are m_n = 939.6 MeV/c², m_p = 938.3 MeV/c², and
m_e = 0.511 MeV/c², write down the correct neutron decay reaction.
3) Explain in some detail the differences between a baryon, a meson, and a lepton.
Are they all truly elementary? In what sense?
4) Calculate the Compton wavelength of Schroedinger’s cat in units of the Compton
wavelength of the electron, assuming its mass is 3 kg.
5) The lifetime of an excited state of a nucleus is of order 10^−12 s. What is the
uncertainty in the energy of a photon from its decay?
14) Search the literature for the most intense synchrotron source in the sky today.
How is this emission supposed to arise in this specific case?
15) If we want an angular resolution of 1/10 of the object size, estimate in how many
years we will be able to study the remnant of SN1987A in the LMC with optical
telescopes that reach 0.1″.
16) Assuming that the Sedov expansion phase precedes the snowplow phase, find the
radius R at which this regime changes for a supernova remnant. What is the speed of
the ejected material at that radius?
17) Given a power-law distribution of energies of injected particles with index p
between E_min and E_max, in a region with a field intensity B, calculate the
synchrotron emissivity and decay time for each energy.
18) A hydrogen plasma from a companion star accretes onto a white dwarf with radius
8000 km and mass 0.5 M_⊙. The accretion rate is 10^−9 M_⊙/yr. The plasma is guided
by the magnetic field onto the poles of the white dwarf, falling onto 1% of the
surface. The energy of the material that fell from infinity goes to zero as the
material is stopped by a shock just above the surface, so that the material settles
onto the surface just below. The region at 1 m depth, just below the shock,
effectively absorbs the energy of the material. This region therefore contains a hot
plasma, is optically thin, and has a density of 10^5 g cm^−3.
a) What is the number density of ions n_i in the shock region?
b) Calculate the potential energy lost per second (in erg/s) by the material coming
from “infinity”.
c) This energy is converted into thermal energy in the post-shock region. What is
the power deposited per cm³ of this region?
d) The power radiated by this region is equal to the accreted power in the stationary
state. Assuming that all the emission is thermal, calculate the temperature and the
emission peak.
e) Assume now that all the radiated power is bremsstrahlung with Gaunt factor
g_eff = 1. Calculate the equilibrium temperature T of the plasma and the highest
emitting band.
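A short numerical sketch for parts (a) and (b), using only the values given in the statement (cgs units); the remaining parts follow from these two numbers:

```python
G, M_SUN, YR = 6.674e-8, 1.989e33, 3.156e7  # cgs constants, seconds per year
M_P = 1.67e-24  # g, proton mass (hydrogen plasma)

M, R = 0.5 * M_SUN, 8.0e8      # WD mass (g) and radius (cm)
mdot = 1e-9 * M_SUN / YR       # accretion rate, g/s
rho = 1e5                      # g/cm^3, post-shock region

n_i = rho / M_P                # a) ion number density, cm^-3
L_acc = G * M * mdot / R       # b) potential energy released per second, erg/s

print(f"a) n_i   ~ {n_i:.1e} cm^-3")
print(f"b) L_acc ~ {L_acc:.1e} erg/s")
```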
19) Suppose that the distribution of protons in the center of the Sun suddenly becomes
monoenergetic, with the form f(E) = A δ(E − E*). Evaluate the rate of nuclear
reactions and calculate the value of the effective exponent β for this new situation.
20) Build a qualitative description of the differences between gravitational collapse
supernovae and thermonuclear supernovae. Include in your description comments on the
origin of each of these phenomena, the spectral differences, the chemical products
that result from them, and so on.
21) Assume that the Sun has a surface temperature of 5700 K, a radius of
1.4 × 10^11 cm, and a mass of 2 × 10^33 g.
a) Using the Stefan–Boltzmann law, find the rest mass lost per second in the form
of radiation.
b) What mass fraction is emitted as radiation every year? How many years can
the Sun survive such a loss?
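A minimal numerical sketch for this problem, applying the Stefan–Boltzmann law with the stated values taken at face value:

```python
import math

SIGMA_SB = 5.670e-5  # erg cm^-2 s^-1 K^-4
C = 2.998e10         # cm/s

T, R, M = 5700.0, 1.4e11, 2e33  # values from the statement (cgs)

L = 4 * math.pi * R**2 * SIGMA_SB * T**4  # luminosity, erg/s
mdot = L / C**2                           # a) rest mass radiated per second, g/s
frac_per_yr = mdot * 3.156e7 / M          # b) mass fraction lost per year

print(f"L ~ {L:.2e} erg/s, dm/dt ~ {mdot:.2e} g/s")
print(f"fraction per year ~ {frac_per_yr:.1e} -> ~{1/frac_per_yr:.1e} yr to exhaust")
```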
22) An X-ray source moves in the sky with very high, but still undetermined proper
motion. The Monomono Satellite observes a line at 3.8 keV that could be assigned
to Fe IV (4.1 keV). Discuss what kind of observation you would need to make to
determine whether the line is affected by:
a) Doppler effect,
b) gravitational redshift.
23) Find the velocity of a massive object that approaches along the line of sight in
such a way that the Doppler effect compensates for the gravitational redshift due to
the ratio M/R.
24) The equation of state for a semi-degenerate gas can be written in the form

$$P = K\rho^{5/3}\left(1 + \eta + \frac{\eta^2}{1+\eta}\right)\ \mathrm{dyn/cm^2},$$

while the Henyey track, the final stage before the MS, turning left once the
proto-star is radiative and in quasi-equilibrium, takes the form

Using the fact that all stars are "black bodies" to a good approximation, obtain the
Hayashi track as a function of M and R. Using the fact that, in the lower MS,
R ∝ M^{3/7}, and normalizing with respect to the Sun, calculate the constant in the
second equation, imposing the condition that the Henyey track ends (obviously) at the
ZAMS. Plot (approximately) the two tracks in the HR (L vs. T) diagram.
28) The figure below shows the HR diagram of a 1M star at the ZAMS. Indicate
the following stages on the diagram and complete the statements where necessary:
a) The moment when the core reaches the Schoenberg–Chandrasekhar limit.
b) The Hertzsprung gap region.
c) The path on the giant branch, physically due to ….
d) The position of the helium flash, if present. This phenomenon consists of
…because of the …condition.
e) The region where stationary burning of helium occurs in the stellar core, known
in the literature as ….
29) To a first approximation, a white dwarf can be treated as a stellar object with
constant temperature that does not generate luminosity. Integrate the hydrostatic
balance and mass continuity equations for a given density profile ρ = ρ₀(1 − r/R) and
construct a family of solutions as a function of the central density ρ0 . Plot the density
and determine the radius and the mass of the WD for ρ0 = 108 g cm−3 . Compare
with the empirical values published in the literature.
30) The following questions concern the important timescales for the life and death
of a star:
• τnuc
a) What does τnuc represent?
b) Calculate the mass of hydrogen available for fusion throughout the whole life
of the Sun. Assuming that 70% of its composition is H, will the whole of that
amount be consumed?
c) What is the physical origin of the Schoenberg–Chandrasekhar limit?
d) What is the total nuclear energy available, assuming that the mass of the proton
is m_p = 1.67 × 10^−24 g and the mass of the α-particle is m_α = 6.644 × 10^−24 g?
e) Using M_⊙ and L_⊙ to denote the mass and luminosity of the Sun, what is the
value of τ_nuc for the Sun? Does it coincide with its lifetime on the main sequence?
If not, what is the reason for the difference?
• τKH
a) What does τKH represent?
b) The mathematical expression for this timescale is

$$\tau_{\rm KH} = \frac{E_{\rm int}}{L} \approx \frac{|E_{\rm grav}|}{2L} \approx \frac{G M^2}{2RL}. \tag{P.3}$$
What approximations have been made to arrive at the last expression?
• τdyn
a) What does τdyn represent?
b) What events occur in stars on this timescale?
c) This quantity may be estimated by calculating the time for the stellar envelope
to fall onto the center. Carry out such an estimate for the Sun. Hint: recall the
Newtonian expression for the gravitational force for the envelope shell.
Comparing the timescales:
a) What is the hierarchy of these timescales in a stable star?
b) When the Sun turns into a red giant, its radius will increase to about 200 R_⊙
and its luminosity to about 3000 L_⊙. Estimate τ_KH and τ_dyn for this stage.
c) Indicate the most relevant timescale for each event in the right-hand column of
the following table:
Event Timescale
Contraction pre-MS
Supernova
H burning in the core
He burning in the core
31) Using the virial theorem and the constant-density hypothesis, show that the
stellar temperature follows approximately the dependency

$$T \propto \frac{M}{R} \propto M^{2/3}\rho^{1/3}.$$

If the ignition of any nuclear cycle happens at a fixed temperature, what can we say
about the density at which each cycle begins for increasing stellar mass?
32) Given a spherical Newtonian star of mass M and radius R, suppose that the density
is given by ρ = ρ₀(1 − r/R), that the ideal-gas state equation P = (γ − 1)c_V ρT is
valid everywhere, and that there is a uniform magnetic field of intensity B throughout
the star's volume. Calculate the internal energy and show that the virial theorem is
satisfied.
33) Determine the equilibrium radius of a white dwarf by minimizing the energy (as
shown in Chap. 6) and imposing E_F = p_F²/2m ∝ 1/R² for the Fermi energy. Is this
radius compatible with observations?
34) Show that the maximum number of electrons that results in a stable structure
for a white dwarf (associated with the Chandrasekhar mass) depends only on the
fundamental constants G, ℏ, and c.
35) Using the approximate numerical value ξ₁ = 3 for the zero of the Lane–Emden
function, determine the radius of a polytropic model of a white dwarf, given that its
index is n = 1. Compare with the Earth's radius.
36) The density of a toy "neutron star", ρ₀ = 3M_*/(4πR_*³), with M_* = 2.8 × 10^33 g
and R_* = 10^6 cm, can be treated as independent of the radius to a good
approximation. Using this observation as a hypothesis, integrate the Newtonian
structure equations (hydrostatic balance and mass contained in a sphere of radius r)
to obtain the central pressure P_C. Compare the result obtained to the value for more
sophisticated stellar models (realistic state equations, general relativity, etc.),
which give on average P_C = 6 × 10^34 dyn/cm². How do you view the result?
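For orientation, the constant-density integration can be sketched in closed form (only the final number uses the values given above):

$$m(r) = \frac{4\pi}{3}\rho_0 r^3, \qquad \frac{dP}{dr} = -\frac{G\, m(r)\,\rho_0}{r^2} = -\frac{4\pi}{3}G\rho_0^2\, r,$$

$$P_C = \int_0^{R_*} \frac{4\pi}{3} G \rho_0^2\, r\, dr = \frac{2\pi}{3} G \rho_0^2 R_*^2 = \frac{3 G M_*^2}{8\pi R_*^4} \approx 6 \times 10^{34}\ \mathrm{dyn/cm^2}.$$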
37) Use the “cold” Newtonian structure equations to obtain a differential equation
for the density as a function of radius for a linear state equation P = Aρ + B. Solve
it, plot the solutions, and compare with the polytropic solutions already discussed
for n = 5/3.
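A minimal sketch of the numerical integration requested here, marching the structure equations outward from the center; the equation-of-state constants A and B and the central density below are hypothetical values chosen only so that the sketch runs:

```python
import math

G = 6.674e-8  # cgs

def integrate_star(rho_c, A, B, dr=1.0e3):
    """March hydrostatic balance + mass continuity outward for P = A*rho + B.
    Since dP/dr = A drho/dr, we evolve rho directly. Returns (R, M) in cgs."""
    r, m, rho = dr, 0.0, rho_c
    rho_surface = -B / A                          # density at which P vanishes
    while rho > rho_surface:
        m   += 4.0 * math.pi * r**2 * rho * dr    # dm/dr   = 4 pi r^2 rho
        rho += -G * m * rho / (A * r**2) * dr     # drho/dr = -G m rho / (A r^2)
        r   += dr
    return r, m

# Hypothetical, illustrative constants (a stiff 'bag-like' linear state equation):
A, B = 3.0e20, -1.2e35        # chosen so that P = 0 at rho = 4e14 g/cm^3
R, M = integrate_star(rho_c=2.0e15, A=A, B=B)
print(f"R ~ {R/1e5:.1f} km, M ~ {M/1.989e33:.2f} M_sun")
```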
38) Briefly explain the concept of the Tolman–Oppenheimer–Volkoff mass. What is
its astrophysical application? Hint: Think of the observed masses in Fig. 6.18.
39) The equation of motion of a pulsar that loses energy solely by electromagnetic
dipole emission is

$$\frac{d(I\Omega)}{dt} = -K B^2 \Omega^3, \tag{P.5}$$

where I is the moment of inertia, Ω the rotation frequency, B the magnetic field, and
K a constant. Use this equation to estimate the magnetic field of the pulsar marked
with the red circle in the figure below (adapted from [1, Fig. 2]). Hint: If the
period P is expressed in seconds and its derivative Ṗ in s s⁻¹, and remembering that
P = 2π/Ω, we have $\sqrt{I/K} = 10^{15}$ G.
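A minimal algebraic sketch of how the hint is meant to be used (assuming a constant moment of inertia and writing Ω = 2π/P):

$$I\dot{\Omega} = -K B^2 \Omega^3 \;\Longrightarrow\; B^2 = \frac{I}{K}\,\frac{P\dot{P}}{4\pi^2} \;\Longrightarrow\; B = \sqrt{\frac{I}{K}}\,\frac{\sqrt{P\dot{P}}}{2\pi},$$

so that reading P and Ṗ for the circled pulsar off the diagram gives B directly.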
40) Verify that the event horizon area of a black hole is 4πR_S². Hint: Remember that
the radial coordinate r is not the distance to the center. Use the Schwarzschild metric
as starting point.
41) The equation

$$v = r\omega = \sqrt{\frac{GM}{r}} \tag{P.6}$$

describes the speed of a massive particle orbiting a black hole without rotation.
However, it can be shown that the orbit is not stable unless r ≥ 3R_S. Any disturbance
will cause a particle in a smaller orbit to spiral down to the event horizon.
a) Find the speed of a particle in the smallest stable orbit around a 10M black
hole.
b) Find the orbital period for the same orbit. Compare with the “year” of the planet
Mercury, which lasts 88 days.
42) A neutron star with period P = 1 s has mass M = 1.4 M_⊙, constant density, and
radius R = 10 km. The neutron star is accreting mass from a binary companion through
an accretion disk at a rate Ṁ = 10^−9 M_⊙ yr⁻¹. Suppose the material is in a circular
Keplerian orbit around the neutron star until the moment it reaches the surface, and
that at this moment all the angular momentum of the material is transferred to the
neutron star.
a) Write down a differential equation for Ṗ , the rate at which the period of the
neutron star decreases.
b) Solve the equation to find how long it takes to reach P = 1 ms, which is about
the maximum rotation rate of a neutron star.
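A rough numerical sketch of part (b) under the stated assumptions (uniform-density moment of inertia, Keplerian specific angular momentum at the surface, and I treated as constant; all simplifications of this sketch):

```python
import math

G, M_SUN, YR = 6.674e-8, 1.989e33, 3.156e7  # cgs constants, seconds per year

M, R = 1.4 * M_SUN, 1.0e6     # neutron star mass (g) and radius (cm)
mdot = 1e-9 * M_SUN / YR      # accretion rate, g/s
I = 0.4 * M * R**2            # uniform-density sphere

# Accreted matter adds Keplerian angular momentum at the surface:
# I dOmega/dt = mdot * sqrt(G M R)
domega_dt = mdot * math.sqrt(G * M * R) / I

omega_i, omega_f = 2 * math.pi / 1.0, 2 * math.pi / 1e-3   # P: 1 s -> 1 ms
t = (omega_f - omega_i) / domega_dt
print(f"spin-up time ~ {t:.1e} s ~ {t/YR:.1e} yr")
```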
43) An accreting compact object of mass M is radiating at the Eddington luminosity
corresponding to that mass. An astronaut wearing a white space suit is at rest at an
arbitrary distance from the compact object. Assuming that the projected area of the
astronaut’s body is A = 1.5 m2 , find the maximum mass of the astronaut such that
the radiation pressure prevents his/her fall onto the compact object.
44) Suppose the Sun collapsed to the size of a neutron star (R = 10 km).
a) Assuming that no mass is lost in the collapse, find the period of rotation of this
neutron star.
b) Find the intensity of this neutron star’s magnetic field.
Although our Sun will not end its life as a neutron star, this shows that the conservation
of angular momentum and magnetic flux can easily produce magnetic fields and
pulsar-type rotation speeds, at least in principle.
45) Combining gravitation (G), thermodynamics (k_B), and quantum mechanics (ℏ),
Stephen Hawking calculated the temperature T_H of a black hole without rotation to
be given by

$$k_B T_H = \frac{\hbar c^3}{8\pi G M} = \frac{\hbar c}{4\pi R_S}. \tag{P.7}$$
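A minimal numerical sketch evaluating (P.7), e.g., for a solar-mass black hole:

```python
import math

HBAR, C, G, K_B, M_SUN = 1.055e-27, 2.998e10, 6.674e-8, 1.381e-16, 1.989e33  # cgs

def hawking_T(M_g):
    """Hawking temperature (K) of a non-rotating black hole of mass M in grams."""
    return HBAR * C**3 / (8 * math.pi * G * M_g * K_B)

print(f"T_H(1 M_sun) ~ {hawking_T(M_SUN):.1e} K")  # ~6e-8 K, far below the CMB
```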
55) Imagine a gas composed exclusively of electrons. By what process would this
gas radiate? What would happen if an equal number of positrons were present? How
would you distinguish the two cases in equilibrium at the same temperature?
56) An ultra-relativistic electron is injected into a region with a uniform magnetic
field. How does its energy change with time? What happens when the electron reaches
the Newtonian (non-relativistic) limit?
57) A source shows a spectrum that fits ν^{5/2} at low frequencies, shifts to ν² as it
reaches a peak, and drops as ν^{−0.75} after that. Is this spectrum consistent with
synchrotron emission by relativistic electrons? How do you interpret each behavior?
58) Consider a black hole accreting mass with a fixed efficiency η and always radi-
ating at the Eddington limit. Show that the growth would be exponential and that the
timescale is determined by atomic constants alone. (This monstrous behavior is one
of the “fears” mentioned in Chap. 8.)
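The key step, sketched here under the stated assumptions (and neglecting the radiated mass fraction): if the hole radiates L = ηṀc² pinned at L_Edd = 4πGMm_p c/σ_T, then

$$\dot{M} = \frac{L_{\rm Edd}}{\eta c^2} = \frac{4\pi G m_p}{\eta\,\sigma_T c}\,M \;\Longrightarrow\; M(t) = M_0\,e^{t/\tau}, \qquad \tau = \frac{\eta\,\sigma_T c}{4\pi G m_p} \approx \eta \times 4.5\times 10^{8}\ \mathrm{yr},$$

so the e-folding time indeed contains only the "atomic" constants σ_T and m_p (besides G and c).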
59) Show that N point masses of mass m in the gravitational field of a massive object
of mass M achieve a (local) minimum of the total energy when the masses are all in
the same circular orbit. Hint: Think of using Lagrange multipliers.
60) The massive multiple star η Carinae may be radiating at its Eddington limit.
Estimate the mass needed to explain the observed luminosity of 5 × 10^6 L_⊙.
a) In 1837, a "great eruption" happened and η Carinae reached m_V ∼ 0. Assuming
an interstellar extinction of 1.7 mag, without any bolometric correction, estimate
the luminosity during this eruption.
b) Calculate the total energy released in photons to sustain the great eruption for
around 20 yr, as observed.
c) Read the appraisal by Hirai et al., arXiv:2011.12434 (2020), for some tentative
conclusions regarding the physical origin of this eruption and the η Carinae system.
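An order-of-magnitude check of the first estimate, assuming pure electron-scattering opacity so that L_Edd ≈ 1.26 × 10^38 (M/M_⊙) erg/s:

```python
L_SUN = 3.828e33           # erg/s
L_EDD_PER_MSUN = 1.26e38   # erg/s per solar mass (electron-scattering opacity)

L = 5e6 * L_SUN                   # observed luminosity from the statement
M_min = L / L_EDD_PER_MSUN        # minimum mass if radiating at Eddington
print(f"M >= {M_min:.0f} M_sun")  # ~150 M_sun
```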
61) The fluence of neutrinos from the supernova SN1987A was measured to be about
1.3 × 10^10 cm^−2.
a) If the average energy per neutrino was 4 MeV, use these numbers to calculate
the total energy released in neutrinos in this event.
b) Estimate how many human beings acted as a “detector”, capturing 1 neutrino
from the SN1987A. Hint: Consider the cross-sections of Chap. 9 and approximate
an average person by a mass of pure water.
62) Equate the pressure of an ideal electron gas to the degeneracy pressure to find
the conditions for the onset of electron degeneracy. Repeat for an ultra-relativistic
gas with v = c and compare.
63) Find the size of the Moon if it suddenly converts into a white dwarf.
64) Determine the shortest period of a pulsar in the Newtonian approximation,
keeping the fluid spherical and setting R = 10 km (shaky approximations indeed).
Compare with the period of the fastest known pulsar, PSR J1748−2446ad, for which
f = 716 Hz.
65) Describe what would happen to the orbits of the planets if the Sun suddenly
collapsed to a black hole.
66) a) Could the Sun actually be a black hole radiating Hawking emission? What
would its mass have to be for the Hawking emission peak to coincide with the Sun’s
emission peak?
b) If this were the case, what would the “gravitational constant” G have to be for
the Earth’s orbit to remain the same?
c) Estimate your own weight on the Earth’s surface with this new value of G.
67) In some binary systems, the conditions are such that the angular momentum
can be considered to be conserved. Use this approximation to show that the rate of
change of the orbital period is

$$\frac{1}{P}\frac{dP}{dt} = 3\dot{M}_1\,\frac{M_1 - M_2}{M_1 M_2}. \tag{P.9}$$
68) Consider a photosphere that is being carried away by a shell travelling with
velocity v.
a) Show that the density of the photosphere at a distance r is ρ = Ṁ_ej/(4πr²v).
b) Within the approximation of a constant mean opacity κ̄ inside the expanding
shell with outer radius R at t = 0, show that, if the radius of the photosphere (with
optical depth τ = 2/3) was R₀, then

$$\frac{1}{R} = \frac{1}{R_0} - \frac{1}{R_\infty}, \tag{P.10}$$

$$\frac{1}{R + vt} = \frac{1}{R_{\rm ph}(t)} - \frac{1}{R_\infty}, \tag{P.11}$$

$$R_{\rm ph}(t) = R_0 + \frac{vt\,(1 - R_0/R_\infty)^2}{1 + (vt/R_\infty)(1 - R_0/R_\infty)}, \tag{P.12}$$

$$R_{\rm ph}(t) \approx \frac{vt}{1 + vt/R_\infty}. \tag{P.13}$$

f) Apply this expression to understand the nova explosion shown in Fig. 7.9.
69) Find the photosphere temperature of a nova explosion assuming the Eddington
luminosity limit for the event.
70) The Alfvén radius is defined as the location at which the magnetic energy density
equals the kinetic energy density of the falling material, stopping the fall of matter
onto a magnetized object (Chap. 7). Show that, for a dipole field of surface intensity
B on a star of radius R accreting at rate Ṁ, the Alfvén radius can be written as

$$r_A = \left(\frac{B^4 R^{12}}{2\,G M \dot{M}^2}\right)^{1/7}. \tag{P.14}$$

71) Assume that the average luminosity of quasars evolves with redshift as

$$L = L_0 (1 + z)^{\alpha}, \tag{P.15}$$

with L_0 the average luminosity today, i.e., L_0 ≡ L(z = 0). Assuming also that
α ≈ 2, find how luminous a quasar at z = 2.2 was (Fig. 8.6).
72) A blazar is an AGN which is thought to point a jet towards the Earth. If the
redshift of a target blazar is z_B and the redshift of the ejecta is z_ej, show that
the speed of the ejecta relative to the blazar itself is given by

$$\frac{v}{c} = \frac{(1 + z_B)^2 - (1 + z_{\rm ej})^2}{(1 + z_B)^2 + (1 + z_{\rm ej})^2}. \tag{P.16}$$
73) Show that when a "normal" object (star, asteroid, etc.) approaches a black hole,
the Roche limit within which it will be disrupted by the gravitational field is

$$r_R = 2.4\left(\frac{\bar{\rho}_{\rm BH}}{\bar{\rho}_{\rm normal}}\right)^{1/3} R_S. \tag{P.17}$$
Reference
1. V.M. Kaspi, Grand unification of neutron stars. PNAS 107(16), 7147–7152 (2010)
Index
Convection, 70, 71, 75
Cooling time, 33
Corotation radius, 169
Cosmic Microwave Background (CMB), 251
Cosmic rays, 237–240, 245, 254
Coupling constant, 6, 7
Crab nebula, 37
Crab pulsar, 97
Cross-section, 22, 24–27
Crystallization, 126, 129
Cumulative gains, 241
Curvature radiation, 38
Cyclotron frequency, 35
Cyg X-1, 156, 157

D
Dark matter, 33, 41, 42
Dark stars, 145
Davis experiment (Homestake), 191
De Broglie wavelength, 65
Deep Underground Neutrino Experiment (DUNE), 196, 201
Deflagrations, 100, 101
Deflagration-to-Detonation Transition (DDT), 101
Degenerate regime, 79
Detached systems, 164
Detonations, 99–101
Diffusion, 69, 70, 73
Diffusion time, 240
Dimensionless amplitude, 207, 208
Direct acceleration, 241
Dispersion measure, 233
Dispersion relations, 39
Doppler effect, 41
Double-degenerate, 97, 98, 102, 105
Dynamical timescale, 73
Dynamic range, 49

E
Eddington luminosity, 168
Effective potential, 162, 163
Effective temperature, 59, 72, 77
Electrodes, 48
Electromagnetic cascade, 246–248
Electromagnetic interactions, 6, 7, 13, 16
Electron capture, 92, 105, 106
Elementary particle, 1, 4, 8, 14
Emission cone, 228
Equation of state, 71
Equivalent isotropic energy, 228
Event horizon, 145, 146, 150, 151, 154
Event Horizon Telescope (EHT), 151, 152
Evolved Laser Interferometer Space Antenna (eLISA), 214

F
Fabry–Perot, 213
Fanaroff–Riley galaxies, 186
Fast Radio Bursts (FRBs), 232–234
Fermi level, 198
FERMI satellite, 219
Fermi theory, 3, 14–16
Feynman diagrams, 15
Fireball model, 228–230
First-order Fermi mechanism, 243, 244
Fluorescence, 22
Flux, 19, 29, 30
Focus, 50, 51, 53
Formation of the heavy elements, 219
Free expansion phase, 108
Free–free process, 69

G
Galaxy clusters, 33
GALLEX collaboration, 191
Gamma-Ray Bursts (GRBs), 223, 224, 226–234
Gamow peak, 65, 66
Gauge theories, 15, 16
Gaunt factor, 33
Giant branch, 78–81, 85
Giants, 61, 78
Globular cluster, 123, 144
Gravitation, 7, 8, 13, 14
Gravitational pressure, 134
Gravitational redshift, 41, 42
Gravitational waves, 204, 206, 208, 210–213, 215–217, 219
Graviton, 8, 13, 14
Grazing incidence, 50
Greisen–Zatsepin–Kuzmin (GZK) cutoff, 250, 251
GW150914, 215–218
GW170817, 217, 219

H
Hamuy–Phillips calibration, 103
Hawking radiation, 147, 148
Heavy water, 192, 193
Heitler cascade model, 248
Helicity, 17

L
Lagrangian points, 163
Landau–Darrieus instability, 101
Lane–Emden equation, 117, 118, 120
Lanthanides, 219
Larmor formula, 30, 35
Larmor radius, 239, 244, 251
Lepton, 8, 9
Light curve, 45, 46

O
Onion structure, 83, 84
Opacity, 69, 70, 77
Optical depth, 228
Outer crust, 137

P
Pair instability, 105–107

T
Thermal decoupling, 82
Thermal neutrinos, 197
Thermal pulses, 81, 82
Thermal timescale, 73
Thermonuclear supernovae, 97, 100, 104, 106
Thin shell approximation, 109
Thomson limit, 24
Tolman–Oppenheimer–Volkoff equation (TOV), 130
Total emissivity, 31
Transducers, 212
Transparency, 227
Triple-α process, 68, 80, 82
Tunnel effect, 64, 65

U
Ultra-High Energy Cosmic Rays (UHECR), 246
Ultraviolet catastrophe, 19
Uncertainty relations, 4–6, 10, 16
Unified model, 181, 182
Unipolar induction, 251
Upper main sequence, 74

V
Vacuum fluctuations, 5
Variability, 45
VELA satellites, 223
Velocity dispersion, 184, 185
Virgo galaxy cluster, 207, 220
VIRGO observatory, 212
Virial theorem, 72–74, 83
Virtual particles, 5, 6
Viscosity, 166–168

W
Weak interactions, 7, 14–16
White dwarfs, 113–117, 119, 120, 122–131, 134, 136, 138–140, 144
White dwarf seismology, 129
Wien's displacement law, 29
Work function, 21, 22

X
X-ray binaries, 114, 152–154

Y
Yukawa potential, 132