Stan Functions Reference - Stan Development Team
Version 2.33
Overview
Built-In Functions
1. Void Functions
1.1 Print statement
1.2 Reject statement
4. Complex-Valued Basic Functions
4.1 Complex assignment and promotion
4.2 Complex constructors and accessors
4.3 Complex arithmetic operators
4.4 Complex comparison operators
4.5 Complex (compound) assignment operators
4.6 Complex special functions
4.7 Complex exponential and power functions
4.8 Complex trigonometric functions
4.9 Complex hyperbolic trigonometric functions
5. Array Operations
5.1 Reductions
5.2 Array size and dimension function
5.3 Array broadcasting
5.4 Array concatenation
5.5 Sorting functions
5.6 Reversing functions
6. Matrix Operations
6.1 Integer-valued matrix size functions
6.2 Matrix arithmetic operators
6.3 Transposition operator
6.4 Elementwise functions
6.5 Dot products and specialized products
6.6 Reductions
6.7 Broadcast functions
6.8 Diagonal matrix functions
6.9 Container construction functions
6.10 Slicing and blocking functions
6.11 Matrix concatenation
6.12 Special matrix functions
6.13 Gaussian Process Covariance Functions
6.14 Linear algebra functions and solvers
6.15 Sort functions
6.16 Reverse functions
Appendix
References
Overview
This is the reference for the functions defined in the Stan math library and available
in the Stan programming language.
The Stan project comprises a domain-specific language for probabilistic programming,
a differentiable mathematics and probability library, algorithms for Bayesian posterior
inference and posterior analysis, along with interfaces and analysis tools in all of the
popular data analysis languages.
In addition to this reference manual, there is a user’s guide and a language reference
manual for the Stan language and algorithms. The Stan User’s Guide provides
example models and programming techniques for coding statistical models in Stan.
The Stan Reference Manual specifies the Stan programming language and inference
algorithms.
There is also a separate installation and getting started guide for each of the Stan
interfaces (R, Python, Julia, Stata, MATLAB, Mathematica, and command line).
Interfaces and platforms
Stan runs under Windows, Mac OS X, and Linux.
Stan uses a domain-specific programming language that is portable across data
analysis languages. Stan has interfaces for R, Python, Julia, MATLAB, Mathematica,
Stata, and the command line, as well as an alternative language interface in Scala.
See the web site https://mc-stan.org for interface-specific links and getting started
instructions.
Web site
The official resource for all things related to Stan is the web site:
https://mc-stan.org
The web site links to all of the packages comprising Stan for both users and
developers. This is the place to get started with Stan. Find the interface in the language
you want to use and follow the download, installation, and getting started instructions.
GitHub organization
Stan’s source code and much of the developer process is hosted on GitHub. Stan’s
organization is:
https://github.com/stan-dev
Each package has its own repository within the stan-dev organization. The web site
is also hosted and managed through GitHub. This is the place to peruse the source
code, request features, and report bugs. Much of the ongoing design discussion is
hosted on the GitHub Wiki.
Forums
Stan hosts message boards for discussing all things related to Stan.
https://discourse.mc-stan.org
This is the place to ask questions about Stan, including modeling, programming, and
installation.
Licensing
• Computer code: BSD 3-clause license
The core C++ code underlying Stan, including the math library, language, and
inference algorithms, is licensed under the BSD 3-clause license, as detailed in each
repository and on the web site along with the distribution links.
• Logo: Stan logo usage guidelines
Acknowledgements
The Stan project could not exist without the generous funding of many grant
agencies to the participants in the project. For more details of direct funding for the
project, see the web site and project pages of the Stan developers.
The Stan project could also not exist without the generous contributions of its users in
reporting and in many cases fixing bugs in the code and its documentation. We used
to try to list all of those who contributed patches and bug reports for the manual here,
but when that number passed into the hundreds, it became too difficult to manage
reliably. Instead, we will defer to GitHub (link above), where all contributions to the
project are made and tracked.
Finally, we should all thank the Stan developers, without whom this project could
not exist. We used to try to list the developers here but, like the bug reporters,
once the list grew into the dozens, it became difficult to track. Instead, we defer
to the Stan web page and GitHub itself for a list of core developers and all developer
contributions, respectively.
Built-In Functions
1. Void Functions
Stan does not technically support functions that do not return values. It does support
two types of statements, one for printing and one for rejecting outputs.
Although print and reject appear to have the syntax of functions, they are actually
special kinds of statements with slightly different form and behavior than other
functions. First, they are the only constructs that allow a variable number of arguments.
Second, they are the only constructs to accept string literals (e.g., "hello
world") as arguments. Third, they have no effect on the log density function and
operate solely through side effects.
The special keyword void is used for their return type because they behave like
variadic functions with void return type, even though they are special kinds of
statements.
operator-(x) = −x
T operator-(T x)
Vectorized version of operator-. If T x is a (possibly nested) array of integers, -x is
the same shape array where each individual integer is negated.
Available since 2.31
int operator+(int x)
This is a no-op.
operator+(x) = x
int int_step(int x)
int int_step(real x)
Return the step function of x as an integer,

int_step(x) = 1 if x > 0, and 0 if x ≤ 0 or x is NaN
int to_int(data real x)
Return the value x truncated to an integer. This will throw an error if the value of x
is too big to represent as a 32-bit signed integer.
This is similar to trunc (see Rounding functions) but the return type is int.
For example, to_int(3.9) is 3, and to_int(-3.9) is -3.
Available since 2.31
I to_int(data T x)
The vectorized version of to_int. This function accepts a (possibly nested) array of
reals and returns an array of the same shape where each element has been truncated
to an integer.
Available since 2.31
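To illustrate the truncation-toward-zero semantics outside Stan, here is a minimal Python sketch; the helper `to_int` and its overflow check are hypothetical stand-ins modeled with `math.trunc`, not Stan's implementation:

```python
import math

def to_int(x: float) -> int:
    """Truncate toward zero, mimicking Stan's to_int (hypothetical helper)."""
    if not abs(x) < 2**31:  # Stan raises an error outside the 32-bit signed range
        raise OverflowError("value too large to represent as a 32-bit signed integer")
    return math.trunc(x)

print(to_int(3.9))   # 3
print(to_int(-3.9))  # -3
```

Note that truncation differs from rounding: both 3.9 and 3.1 truncate to 3, and negative values truncate toward zero rather than toward negative infinity.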
3. Real-Valued Basic Functions
This chapter describes built-in functions that take zero or more real or integer
arguments and return real values.
When exp is applied to an array, it applies elementwise. For example, the statement
above,
y2 = exp(x2);
vector[5] yv;
row_vector[7] yrv;
matrix[10, 20] ym;
yv = exp(xv);
yrv = exp(xrv);
ym = exp(xm);
Arrays of vectors and matrices work the same way. For example,
array[12] matrix[17, 93] u;
array[12] matrix[17, 93] z;
// ...
z = exp(u);
After this has been executed, z[i, j, k] will be equal to exp(u[i, j, k]).
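The shape-preserving elementwise behavior can be mimicked in Python with a recursive map over nested lists (an illustrative sketch, not Stan's implementation):

```python
import math

def elementwise_exp(x):
    """Apply exp to every scalar in a (possibly nested) list, preserving shape."""
    if isinstance(x, list):
        return [elementwise_exp(v) for v in x]
    return math.exp(x)

u = [[0.0, 1.0], [2.0, 3.0]]
z = elementwise_exp(u)
# z[i][j] == exp(u[i][j]) for every index pair
```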
Integer and integer array arguments
Integer arguments are promoted to real values in vectorized unary functions. Thus if
n is of type int, exp(n) is of type real. Arrays work the same way, so that if n2 is a
(possibly nested) array of integers, exp(n2) is an array of reals of the same shape.
It would be illegal to try to assign exp(n2) to an array of integers; the return type is
a real array.
Binary function vectorization
Like the unary functions, many of Stan’s binary functions have been vectorized, and
can be applied elementwise to combinations of both scalars or container types.
Scalar and scalar array arguments
When applied to two scalar values, the result is a scalar value. When applied to two
arrays, or combination of a scalar value and an array, vectorized functions like pow()
are defined elementwise. For example,
// declare some variables for arguments
real x00;
real x01;
array[5] real x10;
array[5] real x11;
array[4, 7] real x20;
array[4, 7] real x21;
// ...
// declare some variables for results
real y0;
array[5] real y1;
array[4, 7] real y2;
// ...
// calculate and assign results
y0 = pow(x00, x01);
y1 = pow(x10, x11);
y2 = pow(x20, x21);
When pow is applied to two arrays, it applies elementwise. For example, the statement
above,
y2 = pow(x20, x21);
Alternatively, if a combination of an array and a scalar are provided, the scalar value
is broadcast to be applied to each value of the array. For example, the following
statement:
y2 = pow(x20, x00);
vector[5] xv00;
vector[5] xv01;
row_vector[7] xrv;
matrix[10, 20] xm;
vector[5] yv;
row_vector[7] yrv;
matrix[10, 20] ym;
yv = pow(xv00, xv01);
yrv = pow(xrv, x00);
ym = pow(x00, xm);
Arrays of vectors and matrices work the same way. For example,
array[12] matrix[17, 93] u;
array[12] matrix[17, 93] z;
// ...
z = pow(u, x00);
After this has been executed, z[i, j, k] will be equal to pow(u[i, j, k], x00).
Input & return types
Vectorized binary functions require that both inputs, unless one is a real, be containers
of the same type and size. For example, the following statements are legal:
vector[5] xv;
row_vector[7] xrv;
matrix[10, 20] xm;
While the vectorized binary functions generally require the same input types, the
only exception to this is for binary functions that require one input to be an integer
and the other to be a real (e.g., bessel_first_kind). For these functions, one
argument can be a container of any type while the other can be an integer array, as
long as the dimensions of both are the same. For example, the following statements
are legal:
vector[5] xv;
matrix[5, 5] xm;
array[5] int xi;
array[5, 5] int xii;
real e()
e, the base of the natural logarithm
Available since 2.0
real sqrt2()
The square root of 2
Available since 2.0
real log2()
The natural logarithm of 2
Available since 2.0
real log10()
The natural logarithm of 10
Available since 2.0
real positive_infinity()
Positive infinity, a special non-finite real value larger than all finite numbers
Available since 2.0
real negative_infinity()
Negative infinity, a special non-finite real value smaller than all finite numbers
Available since 2.0
real machine_precision()
The smallest number x such that (x + 1) ≠ 1 in floating-point arithmetic on the
current hardware platform
Available since 2.0
target acts like a function ending in _lp, meaning that it may only be
used in the model block.
Boolean operators
Boolean operators return either 0 for false or 1 for true. Inputs may be any real or
integer values, with non-zero values being treated as true and zero values treated as
false. These operators have the usual precedences, with negation (not) binding the
most tightly, conjunction next, and disjunction the weakest; all of the operators
bind more tightly than the comparisons. Thus an expression such as !a && b is
interpreted as (!a) && b, and a < b || c >= d && e != f as (a < b) || ((c
>= d) && (e != f)).
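These parse rules can be checked mechanically. The following Python sketch mirrors Stan's integer-valued boolean semantics (the `stan_*` helper names are hypothetical, introduced only for this illustration):

```python
def stan_not(a):    return 1 if a == 0 else 0
def stan_and(a, b): return 1 if (a != 0 and b != 0) else 0
def stan_or(a, b):  return 1 if (a != 0 or b != 0) else 0

# !a && b parses as (!a) && b, not !(a && b); the two differ:
a, b = 0, 0
assert stan_and(stan_not(a), b) == 0   # (!a) && b
assert stan_not(stan_and(a, b)) == 1   # !(a && b) gives a different answer

# a < b || c >= d && e != f parses as (a < b) || ((c >= d) && (e != f))
a, b, c, d, e, f = 1, 2, 3, 4, 5, 5
assert stan_or(a < b, stan_and(c >= d, e != f)) == 1
```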
int operator!(int x)
Return 1 if x is zero and 0 otherwise.

operator!(x) = 0 if x ≠ 0, and 1 if x = 0

int operator!(real x)
Return 1 if x is zero and 0 otherwise.

operator!(x) = 0 if x ≠ 0.0, and 1 if x = 0.0
deprecated
Available since 2.0, deprecated in 2.31
step is a step-like function; see the warning in the section on step functions applied to
expressions dependent on parameters.
Available since 2.0
int is_inf(real x)
Return 1 if x is infinite (positive or negative) and 0 otherwise.
Available since 2.5
int is_nan(real x)
Return 1 if x is NaN and 0 otherwise.
Available since 2.5
Care must be taken because both of these indicator functions are step-like and
thus can cause discontinuities in gradients when applied to parameters; see section
step-like functions for details.
(x + y) = operator+(x, y) = x + y
(x − y) = operator-(x, y) = x − y
(x ∗ y) = operator*(x, y) = xy
operator-(x) = (−x)
T operator-(T x)
Vectorized version of operator-. If T x is a (possibly nested) array of reals, -x is
the same shape array where each individual number is negated.
Available since 2.31
real operator+(real x)
Return the value of x.
operator+(x) = x
R fdim(T1 x, T2 y)
Vectorized implementation of the fdim function
Available since 2.25
Bounds functions
real fmin(real x, real y)
Return the minimum of x and y; see warning above.

fmin(x, y) = x if x ≤ y, and y otherwise
R fmin(T1 x, T2 y)
Vectorized implementation of the fmin function
Available since 2.25
R fmax(T1 x, T2 y)
Vectorized implementation of the fmax function
Available since 2.25
Arithmetic functions
real fmod(real x, real y)
Return the real value remainder after dividing x by y; see warning above.
fmod(x, y) = x − ⌊x/y⌋ y
R fmod(T1 x, T2 y)
Vectorized implementation of the fmod function
Available since 2.25
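For positive arguments the definition above agrees with C's fmod, which Python exposes as math.fmod; a quick numerical check of the formula (not Stan code):

```python
import math

x, y = 7.25, 2.0
r = math.fmod(x, y)
# fmod(x, y) = x - floor(x/y) * y when x and y are positive
assert r == x - math.floor(x / y) * y
print(r)  # 1.25
```

For negative x, C's fmod truncates toward zero rather than flooring, so the two expressions can differ in sign conventions there.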
Rounding functions
Warning: Rounding functions convert real values to integers. Because the output is
an integer, any gradient information resulting from functions applied to the integer
is not passed to the real value it was derived from. With MCMC sampling using HMC
or NUTS, the MCMC acceptance procedure will correct for any error due to poor
gradient calculations, but the result is likely to be reduced acceptance probabilities
and less efficient sampling.
The rounding functions cannot be used as indices to arrays because they return real
values. Stan may introduce integer-valued versions of these in the future, but as of
now, there is no good workaround.
R floor(T x)
floor of x, which is the largest integer less than or equal to x, converted to a real
value; see warning at start of section step-like functions
Available since 2.0, vectorized in 2.13
R ceil(T x)
ceiling of x, which is the smallest integer greater than or equal to x, converted to a
real value; see warning at start of section step-like functions
Available since 2.0, vectorized in 2.13
R round(T x)
nearest integer to x, converted to a real value; see warning at start of section step-like
functions
Available since 2.0, vectorized in 2.13
R trunc(T x)
integer nearest to but no larger in magnitude than x, converted to a real value;
see warning at start of section step-like functions
Available since 2.0, vectorized in 2.13
R cbrt(T x)
cube root of x
Available since 2.0, vectorized in 2.13
R square(T x)
square of x
Available since 2.0, vectorized in 2.13
R exp(T x)
natural exponential of x
Available since 2.0, vectorized in 2.13
R exp2(T x)
base-2 exponential of x
Available since 2.0, vectorized in 2.13
R log(T x)
natural logarithm of x
Available since 2.0, vectorized in 2.13
R log2(T x)
base-2 logarithm of x
R log10(T x)
base-10 logarithm of x
Available since 2.0, vectorized in 2.13
pow(x, y) = x^y
R pow(T1 x, T2 y)
Vectorized implementation of the pow function
Available since 2.25
R inv(T x)
inverse of x
Available since 2.0, vectorized in 2.13
R inv_sqrt(T x)
inverse of the square root of x
Available since 2.0, vectorized in 2.13
R inv_square(T x)
inverse of the square of x
Available since 2.0, vectorized in 2.13
R hypot(T1 x, T2 y)
Vectorized implementation of the hypot function
Available since 2.25
R cos(T x)
cosine of the angle x (in radians)
Available since 2.0, vectorized in 2.13
R sin(T x)
sine of the angle x (in radians)
Available since 2.0, vectorized in 2.13
R tan(T x)
tangent of the angle x (in radians)
Available since 2.0, vectorized in 2.13
R acos(T x)
principal arc (inverse) cosine (in radians) of x
Available since 2.0, vectorized in 2.13
R asin(T x)
principal arc (inverse) sine (in radians) of x
Available since 2.0
R atan(T x)
principal arc (inverse) tangent (in radians) of x, with values from −π/2 to π/2
Available since 2.0, vectorized in 2.13
R sinh(T x)
hyperbolic sine of x (in radians)
Available since 2.0, vectorized in 2.13
R tanh(T x)
hyperbolic tangent of x (in radians)
R acosh(T x)
inverse hyperbolic cosine (in radians) of x
Available since 2.0, vectorized in 2.13
R asinh(T x)
inverse hyperbolic sine (in radians) of x
Available since 2.0, vectorized in 2.13
R atanh(T x)
inverse hyperbolic tangent (in radians) of x
Available since 2.0, vectorized in 2.13
R inv_logit(T x)
logistic sigmoid function applied to x
Available since 2.0, vectorized in 2.13
R inv_cloglog(T x)
inverse of the complementary log-log function applied to x
Available since 2.0, vectorized in 2.13
R erfc(T x)
complementary error function of x
R inv_erfc(T x)
inverse of the complementary error function of x
Available since 2.29, vectorized in 2.29
R Phi(T x)
standard normal cumulative distribution function of x
Available since 2.0, vectorized in 2.13
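Phi can be expressed in terms of the error function, which is a convenient way to check values outside Stan. A sketch using Python's math.erf (the helper `Phi` here is hypothetical, not Stan's implementation):

```python
import math

def Phi(x: float) -> float:
    """Standard normal CDF via the error function: Phi(x) = (1 + erf(x/sqrt(2)))/2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

assert Phi(0.0) == 0.5
assert abs(Phi(1.96) - 0.975) < 1e-3  # familiar 95% two-sided quantile
```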
R inv_Phi(T x)
Return the value of the inverse standard normal cdf Φ⁻¹ at the specified quantile x.
The details of the algorithm can be found in (Wichura 1988). Quantile arguments
below 1e-16 are untested; quantiles above 0.999999999 result in increasingly large
errors.
Available since 2.0, vectorized in 2.13
R Phi_approx(T x)
fast approximation of the unit (standard) normal cumulative distribution function
(may replace Phi for probit regression with maximum absolute error of 0.00014;
see (Bowling et al. 2009) for details)
Available since 2.0, vectorized in 2.13
R binary_log_loss(T1 x, T2 y)
Vectorized implementation of the binary_log_loss function
Available since 2.25
R owens_t(T1 x, T2 y)
Vectorized implementation of the owens_t function
Available since 2.25
R beta(T1 x, T2 y)
Vectorized implementation of the beta function
Available since 2.25
R lbeta(T1 x, T2 y)
Vectorized implementation of the lbeta function
Available since 2.25
R tgamma(T x)
gamma function applied to x. The gamma function is the generalization of the
factorial function to continuous variables, defined so that Γ(n + 1) = n!. The
function is defined for positive numbers and non-integral negative numbers.
Available since 2.0, vectorized in 2.13
R lgamma(T x)
natural logarithm of the gamma function applied to x
Available since 2.0, vectorized in 2.15
R digamma(T x)
digamma function applied to x. The digamma function is the derivative of the natural
logarithm of the Gamma function. The function is defined for positive numbers and
non-integral negative numbers.
Available since 2.0, vectorized in 2.13
R trigamma(T x)
trigamma function applied to x. The trigamma function is the second derivative of
the natural logarithm of the Gamma function
Available since 2.0, vectorized in 2.13
R lmgamma(T1 x, T2 y)
Vectorized implementation of the lmgamma function
Available since 2.25
R gamma_p(T1 x, T2 y)
Vectorized implementation of the gamma_p function
Available since 2.25
R gamma_q(T1 x, T2 y)
Vectorized implementation of the gamma_q function
Available since 2.25
R choose(T1 x, T2 y)
Vectorized implementation of the choose function
Available since 2.25
R bessel_first_kind(T1 x, T2 y)
Vectorized implementation of the bessel_first_kind function
Available since 2.25
where

Yv(x) = (Jv(x) cos(vπ) − J−v(x)) / sin(vπ)
R bessel_second_kind(T1 x, T2 y)
Vectorized implementation of the bessel_second_kind function
Available since 2.25
modified_bessel_first_kind(v, z) = Iv(z)

where

Iv(z) = (z/2)^v Σ_{k=0}^{∞} (z²/4)^k / (k! Γ(v + k + 1))
R modified_bessel_first_kind(T1 x, T2 y)
Vectorized implementation of the modified_bessel_first_kind function
Available since 2.25
an integer.
Available since 2.26
R log_modified_bessel_first_kind(T1 x, T2 y)
Vectorized implementation of the log_modified_bessel_first_kind function
Available since 2.26
where

Kv(z) = (π/2) · (I−v(z) − Iv(z)) / sin(vπ)
R modified_bessel_second_kind(T1 x, T2 y)
Vectorized implementation of the modified_bessel_second_kind function
Available since 2.25
where

(x)n = Γ(x + 1) / Γ(x − n + 1)
R falling_factorial(T1 x, T2 y)
Vectorized implementation of the falling_factorial function
Available since 2.25
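The gamma-function form of the falling factorial can be checked numerically in Python (an illustrative sketch; `falling_factorial` here is a hypothetical helper, not Stan's implementation):

```python
import math

def falling_factorial(x: float, n: float) -> float:
    """(x)_n = Gamma(x + 1) / Gamma(x - n + 1)."""
    return math.gamma(x + 1.0) / math.gamma(x - n + 1.0)

# For integer arguments this is x * (x - 1) * ... * (x - n + 1):
assert abs(falling_factorial(5, 2) - 20.0) < 1e-9   # 5 * 4
assert abs(falling_factorial(6, 3) - 120.0) < 1e-9  # 6 * 5 * 4
```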
pronounced “x choose y.” This function generalizes to real numbers using the gamma
function. For 0 ≤ y ≤ x,
R lchoose(T1 x, T2 y)
Vectorized implementation of the lchoose function
Available since 2.29
where

x^(n) = Γ(x + n) / Γ(x)
R rising_factorial(T1 x, T2 y)
Vectorized implementation of the rising_factorial function
Available since 2.25
R log_rising_factorial(T1 x, T2 y)
Vectorized implementation of the log_rising_factorial function
Available since 2.25
fma(x, y, z) = (x × y) + z
ldexp(x, y) = x · 2^y
R ldexp(T1 x, T2 y)
Vectorized implementation of the ldexp function
Available since 2.25
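Scaling by a power of two is exact in binary floating point; Python's math.ldexp has the same definition, so the formula can be checked directly (not Stan code):

```python
import math

# ldexp(x, y) = x * 2^y, computed exactly by adjusting the binary exponent
assert math.ldexp(1.5, 3) == 12.0   # 1.5 * 8
assert math.ldexp(5.0, -1) == 2.5   # 5.0 * 0.5
```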
R lmultiply(T1 x, T2 y)
Vectorized implementation of the lmultiply function
Available since 2.25
R log1p(T x)
natural logarithm of 1 plus x
Available since 2.0, vectorized in 2.13
R log1m(T x)
natural logarithm of 1 minus x
Available since 2.0, vectorized in 2.13
R log1p_exp(T x)
natural logarithm of one plus the natural exponentiation of x
Available since 2.0, vectorized in 2.13
R log1m_exp(T x)
logarithm of one minus the natural exponentiation of x
Available since 2.0, vectorized in 2.13
R log_diff_exp(T1 x, T2 y)
Vectorized implementation of the log_diff_exp function
Available since 2.25
R log_sum_exp(T1 x, T2 y)
Return the natural logarithm of the sum of the natural exponentiation of x and the
natural exponentiation of y.
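Evaluating log(exp(x) + exp(y)) directly overflows for large x or y; the standard stable technique factors out the larger exponent first. A Python sketch of that technique (the helper name mirrors Stan's but this is not Stan's implementation):

```python
import math

def log_sum_exp(x: float, y: float) -> float:
    """Stable log(exp(x) + exp(y)): subtract the max before exponentiating."""
    m = max(x, y)
    if m == float("-inf"):
        return m  # both terms are exp(-inf) = 0
    return m + math.log(math.exp(x - m) + math.exp(y - m))

# exp(1000) overflows a double, but the stable form does not:
assert abs(log_sum_exp(1000.0, 1000.0) - (1000.0 + math.log(2.0))) < 1e-9
assert abs(log_sum_exp(0.0, 1.0) - math.log(math.exp(0.0) + math.exp(1.0))) < 1e-12
```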
R log_inv_logit(T x)
natural logarithm of the inverse logit function of x
Available since 2.0, vectorized in 2.13
R log_inv_logit_diff(T1 x, T2 y)
natural logarithm of the difference of the inverse logit function of x and the inverse
logit function of y
Available since 2.25
R log1m_inv_logit(T x)
natural logarithm of 1 minus the inverse logit function of x
Available since 2.0, vectorized in 2.13
R lambert_wm1(T x)
Implementation of the W−1 branch of the Lambert W function, i.e., the solution
W−1(x) to the equation W−1(x) · exp(W−1(x)) = x
Available since 2.25
4. Complex-Valued Basic Functions
This chapter describes built-in functions that operate on complex numbers, either as
an argument type or a return type. This includes the arithmetic operators generalized
to complex numbers.
complex to_complex()
Return a complex number with real part 0.0 and imaginary part 0.0.
Available since 2.28
Complex accessors
Given a complex number, its real and imaginary parts can be extracted with the
following functions.
real get_real(complex z)
Return the real part of the complex number z.
Available since 2.28
real get_imag(complex z)
Return the imaginary part of the complex number z.
Available since 2.28
complex operator-(complex z)
Return the negation of the complex argument z, which for z = x + yi is
−z = −x − yi.
T operator-(T x)
Vectorized version of operator-. If T x is a (possibly nested) array of complex
numbers, -x is the same shape array where each individual value is negated.
Available since 2.31
Binary operators
complex operator+(complex x, complex y)
Return the sum of x and y,
(x + y) = operator+(x, y) = x + y.
(x − y) = operator-(x, y) = x − y.
(x ∗ y) = operator*(x, y) = x × y.
is equivalent to
get_real(z1) == get_real(z2) && get_imag(z1) == get_imag(z2)
As with other complex functions, if one of the arguments is of type real or int, it
will be promoted to type complex before comparison. For example, if z is of type
complex, then z == 0 will be true if z has real component equal to 0.0 and imaginary
component equal to 0.0.
Warning: As with real values, it is usually a mistake to compare complex numbers
for equality because their parts are implemented using floating-point arithmetic,
which suffers from precision errors, rendering algebraically equivalent expressions
not equal after evaluation.
int operator==(complex x, complex y)
Return 1 if x is equal to y and 0 otherwise,

(x == y) = operator==(x, y) = 1 if x = y, and 0 otherwise.
This function works elementwise over containers, returning the same shape and
kind of the input container but holding reals. For example, a complex_vector[n]
input will return a vector[n] output, with each element transformed by the above
equation.
Available since 2.28, vectorized in 2.30
real arg(complex z)
Return the phase angle (in radians) of z, which for z = x + yi is

arg(z) = atan2(y, x).
real norm(complex z)
Return the Euclidean norm of z, which is its absolute value squared, and which for
z = x + yi is
norm(z) = abs(z)² = x² + y².
complex conj(complex z)
Return the complex conjugate of z, which negates the imaginary component, so that
if z = x + yi,
conj(z) = x − yi.
Z conj(Z z)
Vectorized version of conj. This will apply the conj function to each element of a
complex array, vector, or matrix.
Available since 2.31
complex proj(complex z)
Return the projection of z onto the Riemann sphere, which for z = x + yi is

proj(z) = z if z is finite, and ∞ + sign(y) 0i otherwise.
complex log(complex z)
Return the complex natural logarithm of z, which for z = polar(r, θ) is

log(z) = log(r) + θ i.
complex log10(complex z)
Return the complex common logarithm of z,

log10(z) = log(z) / log(10).
Z pow(T1 x, T2 y)
Vectorized implementation of the pow function
Available since 2.30
complex sqrt(complex x)
Return the complex square root of x with branch cut along the negative real axis.
For finite inputs, the result will be in the right half-plane.
Available since 2.28
complex sin(complex z)
Return the complex sine of z,

sin(z) = −sinh(z i) i = (exp(z i) − exp(−z i)) / (2i).
complex tan(complex z)
Return the complex tangent of z,

tan(z) = −tanh(z i) i = ((exp(−z i) − exp(z i)) i) / (exp(−z i) + exp(z i)).
complex acos(complex z)
Return the complex arc (inverse) cosine of z,

acos(z) = (1/2)π + log(z i + √(1 − z²)) i.
complex asin(complex z)
Return the complex arc (inverse) sine of z,

asin(z) = −log(z i + √(1 − z²)) i.
complex atan(complex z)
Return the complex arc (inverse) tangent of z,

atan(z) = −(1/2) (log(1 − z i) − log(1 + z i)) i.
complex sinh(complex z)
Return the complex hyperbolic sine of z,

sinh(z) = (exp(z) − exp(−z)) / 2.
complex tanh(complex z)
Return the complex hyperbolic tangent of z,

tanh(z) = sinh(z) / cosh(z) = (exp(z) − exp(−z)) / (exp(z) + exp(−z)).
complex acosh(complex z)
Return the complex hyperbolic arc (inverse) cosine of z,

acosh(z) = log(z + √((z + 1)(z − 1))).
complex asinh(complex z)
Return the complex hyperbolic arc (inverse) sine of z,

asinh(z) = log(z + √(1 + z²)).
complex atanh(complex z)
Return the complex hyperbolic arc (inverse) tangent of z,

atanh(z) = (log(1 + z) − log(1 − z)) / 2.
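These identities can be spot-checked with Python's cmath module, which uses the same principal branches; this is a numerical sanity check of the formulas, not Stan code:

```python
import cmath

z = 0.3 + 0.4j
# atanh(z) = (log(1 + z) - log(1 - z)) / 2
assert abs(cmath.atanh(z) - (cmath.log(1 + z) - cmath.log(1 - z)) / 2) < 1e-12
# asinh(z) = log(z + sqrt(1 + z^2))
assert abs(cmath.asinh(z) - cmath.log(z + cmath.sqrt(1 + z * z))) < 1e-12
```

The checks hold away from the branch cuts; on the cuts themselves the sign of the imaginary part determines which side of the cut is taken.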
5.1. Reductions
The following operations take arrays as input and produce single output values.
The boundary values for size 0 arrays are the unit with respect to the combination
operation (min, max, sum, or product).
Minimum and maximum
real min(array[] real x)
The minimum value in x, or +∞ if x is size 0.
Available since 2.0
Norms
real norm1(vector x)
The L1 norm of x, defined by

norm1(x) = Σ_{n=1}^{N} |x_n|
real norm1(row_vector x)
The L1 norm of x
Available since 2.30
real norm2(vector x)
The L2 norm of x, defined by

norm2(x) = √( Σ_{n=1}^{N} x_n² )
real norm2(row_vector x)
The L2 norm of x
Available since 2.30
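Both norms are straightforward to state in Python (an illustrative sketch; the helper names are hypothetical):

```python
import math

def norm1(x):
    """L1 norm: sum of absolute values."""
    return sum(abs(v) for v in x)

def norm2(x):
    """L2 (Euclidean) norm: square root of the sum of squares."""
    return math.sqrt(sum(v * v for v in x))

v = [3.0, -4.0]
assert norm1(v) == 7.0
assert norm2(v) == 5.0  # the classic 3-4-5 triangle
```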
Quantile
Produces sample quantiles corresponding to the given probabilities. The smallest
observation corresponds to a probability of 0 and the largest to a probability of 1.
Implements algorithm 7 from Hyndman, R. J. and Fan, Y., Sample quantiles in
Statistical Packages (R’s default quantile function).
real quantile(data array[] real x, data real p)
The p-th quantile of x
Available since 2.27
then calling dims(x) or dims(y) returns an integer array of size 3 containing the
elements 7, 8, and 9 in that order.
The size() function extracts the number of elements in an array. This is just the
top-level elements, so if the array is declared as
array[M, N] real a;
the size of a is M.
The function num_elements, on the other hand, measures all of the elements, so
that the array a above has M × N elements.
The specialized functions rows() and cols() should be used to extract the dimen-
sions of vectors and matrices.
array[] int dims(T x)
Return an integer array containing the dimensions of x; the type of the argument T
int num_elements(array[] T x)
Return the total number of elements in the array x including all elements in contained
arrays, vectors, and matrices. T can be any array type. For example, if x is of type
array[4, 3] real then num_elements(x) is 12, and if y is declared as array[5]
matrix[3, 4] y, then num_elements(y) evaluates to 60.
Available since 2.5
int size(array[] T x)
Return the number of elements in the array x; the type of the array T can be any type,
but the size is just the size of the top level array, not the total number of elements
contained. For example, if x is of type array[4, 3] real then size(x) is 4.
Available since 2.0
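The distinction mirrors Python's len versus a full count over nested lists; an analogy, not Stan's implementation (`num_elements` below is a hypothetical helper):

```python
def num_elements(x):
    """Count every scalar in a (possibly nested) list."""
    if isinstance(x, list):
        return sum(num_elements(v) for v in x)
    return 1

a = [[0.0] * 3 for _ in range(4)]  # analogous to array[4, 3] real
assert len(a) == 4            # like size(a): top-level elements only
assert num_elements(a) == 12  # like num_elements(a): all contained scalars
```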
For example, rep_array(1.0,5) produces a real array (type array[] real) of size
5 with all values set to 1.0. On the other hand, rep_array(1,5) produces an
integer array (type array[] int) of size 5 with all values set to 1. This distinction
is important because it is not possible to assign an integer array to a real array.
For example, the following contrasts legal with illegal array creation and
assignment:
array[5] real y;
array[5] int x;
x = rep_array(1, 5); // ok
y = rep_array(1.0, 5); // ok
x = y; // illegal
y = x; // illegal
If the value being repeated v is a vector (i.e., T is vector), then rep_array(v, 27)
is a size 27 array consisting of 27 copies of the vector v.
vector[5] v;
array[3] vector[5] a;
If the type T of x is itself an array type, then the result will be an array with one,
two, or three added dimensions, depending on which of the rep_array functions is
called. For instance, consider the following legal code snippet.
array[5, 6] real a;
array[3, 4, 5, 6] real b;
After the assignment to b, the value for b[j, k, m, n] is equal to a[m, n] where it
is defined, for j in 1:3, k in 1:4, m in 1:5, and n in 1:6.
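The added-dimensions behavior can be sketched in Python with nested lists (illustrative only; `rep_array2` is a hypothetical stand-in for the two-dimension form of rep_array):

```python
import copy

def rep_array2(v, j, k):
    """Prepend two dimensions of sizes j and k, copying v into each slot."""
    return [[copy.deepcopy(v) for _ in range(k)] for _ in range(j)]

a = [[m * 10 + n for n in range(6)] for m in range(5)]  # a 5 x 6 "array"
b = rep_array2(a, 3, 4)
# b[j][k][m][n] == a[m][n] for all valid indices
assert b[2][3][4][5] == a[4][5]
```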
For example, the following code appends two three-dimensional arrays of matrices
together. Note that all dimensions except the first match. Any mismatches will cause
an error to be thrown.
array[2, 1, 7] matrix[4, 6] x1;
array[3, 1, 7] matrix[4, 6] x2;
array[5, 1, 7] matrix[4, 6] x3;
x3 = append_array(x1, x2);
order.
Available since 2.3
then
reverse(v) = (20.987, −10.3, 1).
array[] T reverse(array[] T v)
Return a new array containing the elements of the argument in reverse order.
Available since 2.23
6. Matrix Operations
int num_elements(row_vector x)
The total number of elements in the vector x (same as function cols)
Available since 2.5
int num_elements(matrix x)
The total number of elements in the matrix x. For example, if x is a 5 × 3 matrix,
then num_elements(x) is 15
Available since 2.5
int rows(vector x)
The number of rows in the vector x
Available since 2.0
int rows(row_vector x)
The number of rows in the row vector x, namely 1
Available since 2.0
int rows(matrix x)
The number of rows in the matrix x
Available since 2.0
int cols(vector x)
The number of columns in the vector x, namely 1
Available since 2.0
int cols(row_vector x)
The number of columns in the row vector x
Available since 2.0
int cols(matrix x)
The number of columns in the matrix x
Available since 2.0
int size(vector x)
The size of x, i.e., the number of elements
Available since 2.26
int size(row_vector x)
The size of x, i.e., the number of elements
Available since 2.26
int size(matrix x)
The size of the matrix x. For example, if x is a 5 × 3 matrix, then size(x) is 15
Available since 2.26
row_vector operator-(row_vector x)
The negation of the row vector x.
Available since 2.0
matrix operator-(matrix x)
The negation of the matrix x.
Available since 2.0
T operator-(T x)
Vectorized version of operator-. If T x is a (possibly nested) array of matrix types,
-x is the same shape array where each individual value is negated.
Available since 2.31
row_vector operator'(vector x)
The transpose of the vector x, written as x'
Available since 2.0
vector operator'(row_vector x)
The transpose of the row vector x, written as x'
Available since 2.0
real dot_self(vector x)
The dot product of the vector x with itself
Available since 2.0
real dot_self(row_vector x)
The dot product of the row vector x with itself
Available since 2.0
row_vector columns_dot_self(vector x)
The dot product of the columns of x with themselves
Available since 2.0
row_vector columns_dot_self(row_vector x)
The dot product of the columns of x with themselves
Available since 2.0
row_vector columns_dot_self(matrix x)
The dot product of the columns of x with themselves
Available since 2.0
vector rows_dot_self(vector x)
The dot product of the rows of x with themselves
Available since 2.0
vector rows_dot_self(row_vector x)
The dot product of the rows of x with themselves
Available since 2.0
vector rows_dot_self(matrix x)
The dot product of the rows of x with themselves
Available since 2.0
Specialized products
matrix tcrossprod(matrix x)
The product of x postmultiplied by its own transpose, similar to the tcrossprod(x)
function in R. The result is a symmetric matrix x x⊤.
Available since 2.0
matrix crossprod(matrix x)
The product of x premultiplied by its own transpose, similar to the crossprod(x)
function in R. The result is a symmetric matrix x⊤ x.
Available since 2.0
The following functions all provide shorthand forms for common expressions, which
are also much more efficient.
matrix quad_form(matrix A, matrix B)
The quadratic form, i.e., B' * A * B.
Available since 2.0
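For instance, a quadratic form arises when projecting a covariance matrix; the following sketch (values are illustrative only) computes B' * A * B both ways.

```stan
transformed data {
  matrix[3, 3] A = diag_matrix(rep_vector(2, 3));
  matrix[3, 2] B = rep_matrix(1, 3, 2);
  matrix[2, 2] C1 = quad_form(A, B);  // specialized, more efficient
  matrix[2, 2] C2 = B' * A * B;       // same value, computed naively
}
```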
matrix multiply_lower_tri_self_transpose(matrix x)
The product of the lower triangular portion of x (including the diagonal) times its
own transpose; that is, if L is a matrix of the same dimensions as x with L(m,n) equal
to x(m,n) for n ≤ m and L(m,n) equal to 0 if n > m, the result is the symmetric
matrix L L⊤ . This is a specialization of tcrossprod(x) for lower-triangular matrices.
The input matrix does not need to be square.
Available since 2.0
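A common use is reconstructing a correlation matrix from its Cholesky factor; a minimal sketch:

```stan
data {
  int<lower=1> K;
}
parameters {
  cholesky_factor_corr[K] L;
}
transformed parameters {
  // Omega = L * L', computed without forming the transpose explicitly
  matrix[K, K] Omega = multiply_lower_tri_self_transpose(L);
}
```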
6.6. Reductions
Log sum of exponents
real log_sum_exp(vector x)
The natural logarithm of the sum of the exponentials of the elements in x
Available since 2.0
real log_sum_exp(row_vector x)
The natural logarithm of the sum of the exponentials of the elements in x
Available since 2.0
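log_sum_exp is the standard way to marginalize over a discrete mixture on the log scale; a hedged sketch of a two-component normal mixture, where mu, sigma, and y are assumed declared elsewhere:

```stan
model {
  // log of 0.3 * normal(y | mu[1], sigma) + 0.7 * normal(y | mu[2], sigma),
  // computed stably on the log scale
  vector[2] lp;
  lp[1] = log(0.3) + normal_lpdf(y | mu[1], sigma);
  lp[2] = log(0.7) + normal_lpdf(y | mu[2], sigma);
  target += log_sum_exp(lp);
}
```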
real log_sum_exp(matrix x)
The natural logarithm of the sum of the exponentials of the elements in x
Available since 2.0
Minimum and maximum
real min(vector x)
The minimum value in x, or +∞ if x is empty
Available since 2.0
real min(row_vector x)
The minimum value in x, or +∞ if x is empty
Available since 2.0
real min(matrix x)
The minimum value in x, or +∞ if x is empty
Available since 2.0
real max(vector x)
The maximum value in x, or −∞ if x is empty
Available since 2.0
real max(row_vector x)
The maximum value in x, or −∞ if x is empty
Available since 2.0
real max(matrix x)
The maximum value in x, or −∞ if x is empty
Available since 2.0
Sums and products
real sum(vector x)
The sum of the values in x, or 0 if x is empty
Available since 2.0
real sum(row_vector x)
The sum of the values in x, or 0 if x is empty
Available since 2.0
real sum(matrix x)
The sum of the values in x, or 0 if x is empty
Available since 2.0
real prod(vector x)
The product of the values in x, or 1 if x is empty
Available since 2.0
real prod(row_vector x)
The product of the values in x, or 1 if x is empty
Available since 2.0
real prod(matrix x)
The product of the values in x, or 1 if x is empty
Available since 2.0
Sample moments
Full definitions are provided for sample moments in section array reductions.
real mean(vector x)
The sample mean of the values in x; see section array reductions for details.
Available since 2.0
real mean(row_vector x)
The sample mean of the values in x; see section array reductions for details.
Available since 2.0
real mean(matrix x)
The sample mean of the values in x; see section array reductions for details.
Available since 2.0
real variance(vector x)
The sample variance of the values in x; see section array reductions for details.
Available since 2.0
real variance(row_vector x)
The sample variance of the values in x; see section array reductions for details.
Available since 2.0
real variance(matrix x)
The sample variance of the values in x; see section array reductions for details.
Available since 2.0
real sd(vector x)
The sample standard deviation of the values in x; see section array reductions for
details.
Available since 2.0
real sd(row_vector x)
The sample standard deviation of the values in x; see section array reductions for
details.
Available since 2.0
real sd(matrix x)
The sample standard deviation of the values in x; see section array reductions for
details.
Available since 2.0
Quantile
Produces sample quantiles corresponding to the given probabilities. The smallest
observation corresponds to a probability of 0 and the largest to a probability of 1.
Implements algorithm 7 from Hyndman and Fan (1996), "Sample Quantiles in
Statistical Packages" (the default of R's quantile function).
real quantile(data vector x, data real p)
The p-th quantile of x
Available since 2.27
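Because both arguments are marked data, quantile may be applied to data and generated quantities but not to parameters; a minimal sketch:

```stan
data {
  int<lower=1> N;
  vector[N] x;
}
generated quantities {
  real med = quantile(x, 0.5);  // sample median
  real q90 = quantile(x, 0.9);  // 90th percentile
}
```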
Unlike the situation with array broadcasting (see section array broadcasting), where
there is a distinction between integer and real arguments, the following two state-
ments produce the same result for vector broadcasting; row vector and matrix
broadcasting behave similarly.
vector[3] x;
x = rep_vector(1, 3);
x = rep_vector(1.0, 3);
There are no integer vector or matrix types, so integer values are automatically
promoted.
Symmetrization
matrix symmetrize_from_lower_tri(matrix A)
Construct a symmetric matrix from the lower triangle of A.
Available since 2.26
vector diagonal(matrix x)
The diagonal of the matrix x
Available since 2.0
matrix diag_matrix(vector x)
The diagonal matrix with diagonal x
Available since 2.0
it is much more efficient to just use a univariate normal, which produces the same
density:
y ~ normal(mu, sigma);
is true.
Available since 2.26
vector ones_vector(int n)
Create an n-dimensional vector of all ones
Available since 2.26
row_vector ones_row_vector(int n)
Create an n-dimensional row-vector of all ones
Available since 2.26
vector zeros_vector(int n)
Create an n-dimensional vector of all zeros
Available since 2.24
row_vector zeros_row_vector(int n)
Create an n-dimensional row-vector of all zeros
Available since 2.24
vector uniform_simplex(int n)
Create an n-dimensional simplex with elements vector[i] = 1 / n for all i ∈ 1, …, n
Available since 2.24
Block operations
Matrix slicing operations
Block operations may be used to extract a sub-block of a matrix.
matrix block(matrix x, int i, int j, int n_rows, int n_cols)
Return the submatrix of x that starts at row i and column j and extends n_rows rows
and n_cols columns.
Available since 2.0
The sub-row and sub-column operations may be used to extract a slice of row or
column from a matrix
vector sub_col(matrix x, int i, int j, int n_rows)
Return the sub-column of x that starts at row i and column j and extends n_rows
rows and 1 column.
Available since 2.0
Vertical concatenation
matrix append_row(matrix x, matrix y)
Combine matrices x and y by row. The matrices must have the same number of
columns.
Available since 2.5
\[
\text{softmax}(y) = \frac{\exp(y)}{\sum_{k=1}^{K} \exp(y_k)},
\qquad
\log \text{softmax}(y) = y - \text{log\_sum\_exp}(y),
\]
where the vector y minus the scalar log_sum_exp(y) subtracts the scalar from each
component of y.
Stan provides the following functions for softmax and its log.
vector softmax(vector x)
The softmax of x
Available since 2.0
vector log_softmax(vector x)
The natural logarithm of the softmax of x
Available since 2.0
1. The softmax function is so called because in the limit as y_n → ∞ with y_m for m ≠ n held constant,
the result tends toward the “one-hot” vector θ with θ_n = 1 and θ_m = 0 for m ≠ n, thus providing a “soft”
version of the maximum function.
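A typical use of softmax is mapping an unconstrained vector of log odds to a simplex for a categorical outcome; a sketch, with beta and y assumed declared elsewhere:

```stan
model {
  y ~ categorical(softmax(beta));  // beta is an unconstrained K-vector
  // equivalent and more arithmetically stable:
  // y ~ categorical_logit(beta);
}
```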
Cumulative sums
The cumulative sum of a sequence x_1, …, x_N is the sequence y_1, …, y_N, where
\[
y_n = \sum_{m=1}^{n} x_m.
\]
vector cumulative_sum(vector v)
The cumulative sum of v
Available since 2.0
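For example (the values here follow directly from the definition):

```stan
transformed data {
  vector[4] x = [1, 2, 3, 4]';
  vector[4] y = cumulative_sum(x);  // (1, 3, 6, 10)'
}
```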
6.13. Gaussian Process Covariance Functions
Exponentiated quadratic kernel
With magnitude σ and length scale l, the exponentiated quadratic kernel is:
\[
k(x_i, x_j) = \sigma^2 \exp\!\left( -\frac{|x_i - x_j|^2}{2 l^2} \right)
\]
Exponential kernel
With magnitude σ and length scale l, the exponential kernel is:
\[
k(x_i, x_j) = \sigma^2 \exp\!\left( -\frac{|x_i - x_j|}{l} \right)
\]
With magnitude σ and length scale l, the Matern 3/2 kernel is:
\[
k(x_i, x_j) = \sigma^2 \left( 1 + \frac{\sqrt{3}\,|x_i - x_j|}{l} \right)
\exp\!\left( -\frac{\sqrt{3}\,|x_i - x_j|}{l} \right)
\]
Gaussian process covariance with Matern 3/2 kernel in multiple dimensions with a
length scale for each dimension.
Available since 2.20
With magnitude σ and length scale l, the Matern 5/2 kernel is:
\[
k(x_i, x_j) = \sigma^2 \left( 1 + \frac{\sqrt{5}\,|x_i - x_j|}{l}
+ \frac{5\,|x_i - x_j|^2}{3 l^2} \right)
\exp\!\left( -\frac{\sqrt{5}\,|x_i - x_j|}{l} \right)
\]
Gaussian process covariance with Matern 5/2 kernel in multiple dimensions with a
length scale for each dimension.
Available since 2.20
Periodic kernel
With magnitude σ, length scale l, and period p, the periodic kernel is:
\[
k(x_i, x_j) = \sigma^2 \exp\!\left( -\frac{2 \sin^2\!\left( \pi \,\frac{|x_i - x_j|}{p} \right)}{l^2} \right)
\]
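These kernels are exposed as covariance-matrix constructors; for example, the exponentiated quadratic kernel is available as gp_exp_quad_cov. A sketch of building a GP covariance matrix with a small jitter on the diagonal for numerical stability:

```stan
data {
  int<lower=1> N;
  array[N] real x;
}
transformed data {
  real sigma = 1;
  real l = 0.5;
  matrix[N, N] K = gp_exp_quad_cov(x, sigma, l);
  // jitter keeps the Cholesky factorization numerically positive definite
  matrix[N, N] L_K = cholesky_decompose(add_diag(K, 1e-9));
}
```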
When a lower triangular view of a matrix is used, the elements above the diagonal
are ignored.
vector mdivide_left_tri_low(matrix A, vector b)
The left division of b by a lower-triangular view of A; algebraically equivalent to
the less efficient and less stable form inverse(tri(A)) * b, where tri(A) is the
lower-triangular portion of A with the above-diagonal entries set to zero.
Available since 2.12
Matrix exponential
The exponential of the matrix A is formally defined by the convergent power series:
\[
e^A = \sum_{n=0}^{\infty} \frac{A^n}{n!}
\]
matrix matrix_exp(matrix A)
The matrix exponential of A
Available since 2.13
Matrix power
Returns the nth power of the specified matrix,
\[
M^n = M_1 \cdots M_n, \qquad M_1 = \cdots = M_n = M.
\]
Determinants
real determinant(matrix A)
The determinant of A
Available since 2.0
real log_determinant(matrix A)
The log of the absolute value of the determinant of A
Available since 2.0
real log_determinant_spd(matrix A)
The log of the absolute value of the determinant of the symmetric, positive-definite
matrix A.
Available since 2.30
Inverses
It is almost never a good idea to use matrix inverses directly because they are both
inefficient and arithmetically unstable compared to the alternatives. Rather than
inverting a matrix m and post-multiplying by a vector or matrix a, as in inverse(m)
* a, it is better to code this using matrix division, as in m \ a. The pre-multiplication
case is similar, with b * inverse(m) more efficiently coded as b / m. There
are also useful special cases for triangular and symmetric, positive-definite matrices
that use more efficient solvers.
Warning: The function inv(m) is the elementwise inverse function, which returns 1
/ m[i, j] for each element.
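Concretely, the discouraged and preferred forms look like this; the snippet is a sketch, with m, a, b, and N assumed declared with compatible sizes:

```stan
vector[N] x_bad  = inverse(m) * a;  // forms the full inverse: slower, less stable
vector[N] x_good = m \ a;           // solves m * x = a directly
row_vector[N] z_good = b / m;       // solves z * m = b directly
```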
matrix inverse(matrix A)
Compute the inverse of A
Available since 2.0
matrix inverse_spd(matrix A)
Compute the inverse of A where A is symmetric, positive definite. This version
is faster and more arithmetically stable when the input is symmetric and positive
definite.
Available since 2.0
matrix chol2inv(matrix L)
Compute the inverse of the matrix whose Cholesky factorization is L. That is, for
A = L L⊤, return A⁻¹.
Available since 2.26
Generalized Inverse
The generalized inverse M + of a matrix M is a matrix that satisfies M M + M = M .
For an invertible, square matrix M , M + is equivalent to M −1 . The dimensions of
M⁺ are equivalent to the dimensions of M⊤. The generalized inverse exists for any
matrix, so M may be singular or less than full rank.
Even though the generalized inverse exists for any arbitrary matrix, the derivatives
of this function only exist on matrices of locally constant rank (Golub and Pereyra
1973), meaning the derivatives do not exist if small perturbations make the matrix
change rank. For example, consider the rank of the matrix A as a function of ϵ:
\[
A = \begin{bmatrix} 1 + \epsilon & 2 & 1 \\ 2 & 4 & 2 \end{bmatrix}
\]
When ϵ = 0, A is rank 1 because the second row is twice the first (and so there
is only one linearly independent row). If ϵ ̸= 0, the rows are no longer linearly
dependent, and the matrix is rank 2. This matrix does not have locally constant rank
at ϵ = 0, and so the derivatives do not exist at zero. Because HMC depends on the
derivatives existing, this lack of differentiability creates undefined behavior.
matrix generalized_inverse(matrix A)
The generalized inverse of A
Available since 2.26
Eigendecomposition
complex_vector eigenvalues(matrix A)
The complex-valued vector of eigenvalues of the matrix A. The eigenvalues are
repeated according to their algebraic multiplicity, so there are as many eigenvalues
as rows in the matrix. The eigenvalues are not sorted in any particular order.
Available since 2.30
complex_matrix eigenvectors(matrix A)
The matrix with the complex-valued (column) eigenvectors of the matrix A in the
same order as returned by the function eigenvalues
Available since 2.30
vector eigenvalues_sym(matrix A)
The vector of eigenvalues of a symmetric matrix A in ascending order
Available since 2.0
matrix eigenvectors_sym(matrix A)
The matrix with the (column) eigenvectors of symmetric matrix A in the same order
as returned by the function eigenvalues_sym
Available since 2.0
matrix qr_thin_R(matrix A)
The upper triangular matrix in the thin QR decomposition of A, which implies that
the resulting matrix is square with the same number of columns as A
Available since 2.18
matrix qr_Q(matrix A)
The orthogonal matrix in the fat QR decomposition of A, which implies that the
resulting matrix is square with the same number of rows as A
Available since 2.3
matrix qr_R(matrix A)
The upper trapezoidal matrix in the fat QR decomposition of A, which implies that
the resulting matrix will be rectangular with the same dimensions as A
Available since 2.3
The thin QR decomposition is always preferable because it will consume much less
memory when the input matrix is large than will the fat QR decomposition. Both
versions of the decomposition represent the input matrix as
A = Q R.
Every symmetric, positive-definite matrix Σ has a Cholesky decomposition into a
lower-triangular factor L such that
\[
\Sigma = L L^{\top}.
\]
matrix cholesky_decompose(matrix A)
The lower-triangular Cholesky factor of the symmetric positive-definite matrix A
Available since 2.0
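The Cholesky factor is most often paired with distributions that accept it directly, which avoids refactoring the covariance matrix at every evaluation. A sketch, with Sigma, mu, y, and K assumed declared elsewhere:

```stan
transformed data {
  matrix[K, K] L_Sigma = cholesky_decompose(Sigma);  // Sigma = L_Sigma * L_Sigma'
}
model {
  y ~ multi_normal_cholesky(mu, L_Sigma);
}
```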
\[
A = U D V^{\top}.
\]
The matrices of singular vectors here are thin. That is, for an N by P input A,
M = min(N, P), U is size N by M, and V is size P by M.
vector singular_values(matrix A)
The singular values of A in descending order
Available since 2.0
matrix svd_U(matrix A)
The left-singular vectors of A
Available since 2.26
matrix svd_V(matrix A)
The right-singular vectors of A
Available since 2.26
row_vector sort_asc(row_vector v)
Sort the elements of v in ascending order
Available since 2.0
vector sort_desc(vector v)
Sort the elements of v in descending order
Available since 2.0
row_vector sort_desc(row_vector v)
Sort the elements of v in descending order
Available since 2.0
row_vector reverse(row_vector v)
Return a new row vector containing the elements of the argument in reverse order.
Available since 2.23
7. Complex Matrix Operations
int num_elements(complex_row_vector x)
The total number of elements in the vector x (same as function cols)
Available since 2.30
int num_elements(complex_matrix x)
The total number of elements in the matrix x. For example, if x is a 5 × 3 matrix,
then num_elements(x) is 15
Available since 2.30
int rows(complex_vector x)
The number of rows in the vector x
Available since 2.30
int rows(complex_row_vector x)
The number of rows in the row vector x, namely 1
Available since 2.30
int rows(complex_matrix x)
The number of rows in the matrix x
Available since 2.30
int cols(complex_vector x)
The number of columns in the vector x, namely 1
Available since 2.30
int cols(complex_row_vector x)
The number of columns in the row vector x
Available since 2.30
int cols(complex_matrix x)
The number of columns in the matrix x
Available since 2.30
int size(complex_vector x)
The size of x, i.e., the number of elements
Available since 2.30
int size(complex_row_vector x)
The size of x, i.e., the number of elements
Available since 2.30
int size(complex_matrix x)
The size of the matrix x. For example, if x is a 5 × 3 matrix, then size(x) is 15.
Available since 2.30
complex_row_vector operator-(complex_row_vector x)
The negation of the row vector x.
Available since 2.30
complex_matrix operator-(complex_matrix x)
The negation of the matrix x.
Available since 2.30
T operator-(T x)
Vectorized version of operator-. If T x is a (possibly nested) array of matrix types,
-x is the same shape array where each individual value is negated.
Available since 2.31
complex_row_vector operator'(complex_vector x)
The transpose of the vector x, written as x'
Available since 2.30
complex_vector operator'(complex_row_vector x)
The transpose of the row vector x, written as x'
Available since 2.30
complex_row_vector operator.*(complex_row_vector x,
complex_row_vector y)
The elementwise product of x and y
Available since 2.30
complex_row_vector operator./(complex_row_vector x,
complex_row_vector y)
The elementwise quotient of x and y
Available since 2.30
complex_row_vector columns_dot_product(complex_vector x,
complex_vector y)
The dot product of the columns of x and y
Available since 2.30
complex_row_vector columns_dot_product(complex_row_vector x,
complex_row_vector y)
The dot product of the columns of x and y
Available since 2.30
complex_row_vector columns_dot_product(complex_matrix x,
complex_matrix y)
The dot product of the columns of x and y
Available since 2.30
complex_vector rows_dot_product(complex_row_vector x,
complex_row_vector y)
The dot product of the rows of x and y
Available since 2.30
complex dot_self(complex_vector x)
The dot product of the vector x with itself
Available since 2.30
complex dot_self(complex_row_vector x)
The dot product of the row vector x with itself
Available since 2.30
complex_row_vector columns_dot_self(complex_vector x)
The dot product of the columns of x with themselves
Available since 2.30
complex_row_vector columns_dot_self(complex_row_vector x)
The dot product of the columns of x with themselves
Available since 2.30
complex_row_vector columns_dot_self(complex_matrix x)
The dot product of the columns of x with themselves
Available since 2.30
complex_vector rows_dot_self(complex_vector x)
The dot product of the rows of x with themselves
Available since 2.30
complex_vector rows_dot_self(complex_row_vector x)
The dot product of the rows of x with themselves
Available since 2.30
complex_vector rows_dot_self(complex_matrix x)
The dot product of the rows of x with themselves
Available since 2.30
Specialized products
complex_matrix diag_pre_multiply(complex_vector v, complex_matrix m)
Return the product of the diagonal matrix formed from the vector v and the matrix
m, i.e., diag_matrix(v) * m.
Available since 2.30
complex_matrix diag_post_multiply(complex_matrix m,
complex_row_vector v)
Return the product of the matrix m and the diagonal matrix formed from the row
vector v, i.e., m * diag_matrix(v).
Available since 2.30
complex sum(complex_row_vector x)
The sum of the values in x, or 0 if x is empty
Available since 2.30
complex sum(complex_matrix x)
The sum of the values in x, or 0 if x is empty
Available since 2.30
complex prod(complex_vector x)
The product of the values in x, or 1 if x is empty
Available since 2.30
complex prod(complex_row_vector x)
The product of the values in x, or 1 if x is empty
Available since 2.30
complex prod(complex_matrix x)
The product of the values in x, or 1 if x is empty
Available since 2.30
T_demoted get_imag(T x)
Given an object of complex type T, return the same shape object but of type real by
getting the imaginary component of each element of x.
Available since 2.30
For example, if z = [3 + 4i, 5 + 6i]', then a call get_real(z) will yield the vector
[3, 5]', and a call get_imag(z) will yield the vector [4, 6]'.
Symmetrization
complex_matrix symmetrize_from_lower_tri(complex_matrix A)
Construct a symmetric matrix from the lower triangle of A.
Available since 2.30
complex_vector diagonal(complex_matrix x)
The diagonal of the matrix x
Available since 2.30
complex_matrix diag_matrix(complex_vector x)
The diagonal matrix with diagonal x
Available since 2.30
Block operations
Matrix slicing operations
complex_matrix block(complex_matrix x, int i, int j, int n_rows, int
n_cols)
Return the submatrix of x that starts at row i and column j and extends n_rows rows
and n_cols columns.
Available since 2.30
rows.
Available since 2.30
complex_row_vector append_col(complex_row_vector x,
complex_row_vector y)
Combine row vectors x and y (of any size) into another row vector by appending y
to the end of x.
Available since 2.30
Vertical concatenation
complex_matrix append_row(complex_matrix x, complex_matrix y)
Combine matrices x and y by row. The matrices must have the same number of
columns.
Available since 2.30
complex_matrix fft2(complex_matrix m)
Return the 2D discrete Fourier transform of the specified complex matrix m. The
2D FFT is defined as the result of applying the FFT to each row and then to each
column.
Available since 2.30
complex_vector inv_fft(complex_vector u)
Return the inverse of the discrete Fourier transform of the specified complex vector u.
The inverse FFT (this function) is scaled so that fft(inv_fft(u)) == u. If u ∈ CN
is a complex vector with N elements and v = fft−1 (u), then
\[
v_n = \frac{1}{N} \sum_{m < N} u_m \cdot \exp\!\left( \frac{n \cdot m \cdot 2 \cdot \pi \cdot \sqrt{-1}}{N} \right).
\]
This only differs from the FFT by the sign inside the exponential and the scaling.
The 1/N scaling ensures that fft(inv_fft(u)) == u and inv_fft(fft(v)) == v for
complex vectors u and v.
Available since 2.30
complex_matrix inv_fft2(complex_matrix m)
Return the inverse of the 2D discrete Fourier transform of the specified complex
matrix m. The 2D inverse FFT is defined as the result of applying the inverse FFT to
each row and then to each column. The invertible scaling of the inverse FFT ensures
fft2(inv_fft2(A)) == A and inv_fft2(fft2(B)) == B.
Available since 2.30
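The round-trip identities can be checked directly; a sketch, with u, A, and N assumed declared elsewhere:

```stan
transformed data {
  complex_vector[N] v = inv_fft(fft(u));       // equals u up to rounding error
  complex_matrix[N, N] B = inv_fft2(fft2(A));  // equals A up to rounding error
}
```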
Cumulative sums
The cumulative sum of a sequence x_1, …, x_N is the sequence y_1, …, y_N, where
\[
y_n = \sum_{m=1}^{n} x_m.
\]
complex_vector cumulative_sum(complex_vector v)
The cumulative sum of v
Available since 2.30
Eigendecomposition
complex_vector eigenvalues(complex_matrix A)
The complex-valued vector of eigenvalues of the matrix A. The eigenvalues are
repeated according to their algebraic multiplicity, so there are as many eigenvalues
as rows in the matrix. The eigenvalues are not sorted in any particular order.
Available since 2.32
complex_matrix eigenvectors(complex_matrix A)
The matrix with the complex-valued (column) eigenvectors of the matrix A in the
same order as returned by the function eigenvalues
Available since 2.32
tuple(complex_matrix, complex_vector) eigendecompose(complex_matrix A)
Return the matrix of (column) eigenvectors and vector of eigenvalues of the matrix
A. This function is equivalent to (eigenvectors(A), eigenvalues(A)) but with a
lower computational cost due to the shared work between the two results.
Available since 2.33
complex_vector eigenvalues_sym(complex_matrix A)
The vector of eigenvalues of a symmetric matrix A in ascending order
Available since 2.30
complex_matrix eigenvectors_sym(complex_matrix A)
The matrix with the (column) eigenvectors of symmetric matrix A in the same order
as returned by the function eigenvalues_sym
Available since 2.30
vector singular_values(complex_matrix A)
The singular values of A in descending order
Available since 2.30
complex_matrix svd_U(complex_matrix A)
The left-singular vectors of A
Available since 2.30
complex_matrix svd_V(complex_matrix A)
The right-singular vectors of A
Available since 2.30
The complex Schur decomposition of a square matrix A has the form
\[
A = U \, T \, U^{-1},
\]
where T is upper triangular and U is unitary.
complex_matrix complex_schur_decompose_t(complex_matrix A)
Compute the upper-triangular Schur form matrix of the complex Schur decomposi-
tion of A.
Available since 2.31
complex_matrix complex_schur_decompose_u(matrix A)
Compute the unitary matrix of the complex Schur decomposition of A.
Available since 2.31
complex_matrix complex_schur_decompose_u(complex_matrix A)
Compute the unitary matrix of the complex Schur decomposition of A.
Available since 2.31
complex_row_vector reverse(complex_row_vector v)
Return a new row vector containing the elements of the argument in reverse order.
Available since 2.30
8. Sparse Matrix Operations
For sparse matrices, for which many elements are zero, it is more efficient to
use specialized representations to save memory and speed up matrix arithmetic
(including derivative calculations). Given Stan's implementation, there are substantial
space (memory) savings from using sparse matrices. Because of the ease of optimizing
dense matrix operations, speed improvements only arise at 90% or even greater
sparsity; below that level, dense matrices are faster but use more memory.
Because of this speedup and space savings, it may even be useful to read in a dense
matrix and convert it to a sparse matrix before multiplying it by a vector. This chapter
covers a very specific form of sparsity consisting of a sparse matrix multiplied by a
dense vector.
and an array of integer indices indicating where in w(A) a given row's values start,
\[
u(A) = (1, 3, 3, 4, 7).
\]
The difference u(A)[n + 1] − u(A)[n]
is the number of non-zero elements in row n of the matrix (here 2, 0, 1, and 3). Note
that because the second row has no non-zero elements, both the second and third
elements of u(A) correspond to the third element of w(A), which is 52. The values
(w(A), v(A), u(A)) are sufficient to reconstruct A.
The values are structured so that there is a real value and integer column index for
each non-zero entry in the array, plus one integer for each row of the matrix, plus
one for padding. There is also underlying storage for internal container pointers
and sizes. The total memory usage is roughly 12K + M bytes plus a small constant
overhead, which is often considerably fewer bytes than the M × N required to store
a dense matrix. Even more importantly, zero values do not introduce derivatives
under multiplication or addition, so many storage and evaluation steps are saved
when sparse matrices are multiplied.
matrix csr_to_dense_matrix(int m, int n, vector w, array[] int v,
array[] int u)
Return dense m × n matrix with non-zero matrix entries w, column indices v, and row
starting indices u; the vector w and array v must be the same size (corresponding to
the total number of nonzero entries in the matrix), array v must have index values
bounded by n, array u must have length equal to m + 1 and contain index values
bounded by the number of nonzeros (except for the last entry, which must be equal
to the number of nonzeros plus one). See section compressed row storage for more
details.
Available since 2.10
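In practice the CSR triple (w, v, u) is usually built once and then used with the specialized multiply; a sketch, assuming the CSR representation and vector b are supplied as data:

```stan
data {
  int<lower=0> M;     // rows
  int<lower=0> N;     // columns
  int<lower=0> K;     // number of non-zero entries
  vector[K] w;        // non-zero values
  array[K] int v;     // column indices
  array[M + 1] int u; // row start indices
  vector[N] b;
}
transformed data {
  // y = A * b without ever forming the dense matrix A
  vector[M] y = csr_matrix_times_vector(M, N, w, v, u, b);
}
```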
b A = (A⊤ b⊤)⊤,
9. Mixed Operations
complex_matrix to_matrix(complex_matrix m)
Return the matrix m itself.
Available since 2.30
matrix to_matrix(vector v)
Convert the column vector v to a size(v) by 1 matrix.
Available since 2.3
complex_matrix to_matrix(complex_vector v)
Convert the column vector v to a size(v) by 1 matrix.
Available since 2.30
matrix to_matrix(row_vector v)
Convert the row vector v to a 1 by size(v) matrix.
Available since 2.3
complex_matrix to_matrix(complex_row_vector v)
Convert the row vector v to a 1 by size(v) matrix.
Available since 2.30
vector to_vector(matrix m)
Convert the matrix m to a column vector in column-major order.
Available since 2.0
complex_vector to_vector(complex_matrix m)
Convert the matrix m to a column vector in column-major order.
Available since 2.30
vector to_vector(vector v)
Return the column vector v itself.
Available since 2.3
complex_vector to_vector(complex_vector v)
Return the column vector v itself.
Available since 2.30
vector to_vector(row_vector v)
Convert the row vector v to a column vector.
Available since 2.3
complex_vector to_vector(complex_row_vector v)
Convert the row vector v to a column vector.
Available since 2.30
row_vector to_row_vector(matrix m)
Convert the matrix m to a row vector in column-major order.
Available since 2.3
complex_row_vector to_row_vector(complex_matrix m)
Convert the matrix m to a row vector in column-major order.
Available since 2.30
row_vector to_row_vector(vector v)
Convert the column vector v to a row vector.
Available since 2.3
complex_row_vector to_row_vector(complex_vector v)
Convert the column vector v to a row vector.
Available since 2.30
row_vector to_row_vector(row_vector v)
Return the row vector v itself.
Available since 2.3
complex_row_vector to_row_vector(complex_row_vector v)
Return the row vector v itself.
Available since 2.30
There is no type restriction for the variadic arguments, and each argument may be
passed as data or parameter. However, users should use parameter arguments only
when necessary and mark data arguments with the keyword data. In the example
below, the last variadic argument, x, is restricted to being data:
vector algebra_system (vector y, vector theta, data vector x)
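Putting the pieces together, a hedged sketch of calling the Newton solver with a system of this shape; the system body, data, and initial guess are illustrative only:

```stan
functions {
  vector algebra_system(vector y, vector theta, data vector x) {
    vector[2] z;
    z[1] = y[1] - theta[1];        // illustrative residuals
    z[2] = y[1] * y[2] - theta[2];
    return z;
  }
}
data {
  vector[2] x;
}
transformed data {
  vector[2] y_guess = [1, 1]';
}
parameters {
  vector[2] theta;
}
transformed parameters {
  // y solves algebra_system(y, theta, x) == 0 (within function tolerance)
  vector[2] y = solve_newton(algebra_system, y_guess, theta, x);
}
```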
• rel_tol : for the Powell solver only, the relative tolerance, type real, data only.
The relative tolerance is the estimated relative error of the solver and serves to
test if a satisfactory solution has been found. Default value is 10⁻¹⁰.
• function_tol : function tolerance for the algebraic solver, type real, data
only. After convergence of the solver, the proposed solution is plugged into
the algebraic system and its norm is compared to the function tolerance. If
the norm is below the function tolerance, the solution is deemed acceptable.
Default value is 10⁻⁶.
• max_num_steps : maximum number of steps to take in the algebraic solver,
type int, data only. If the solver reaches this number of steps, it breaks and
returns an error message. Default value is 200.
The difference in which control parameters are available has to do with the
underlying implementations for the solvers and the control parameters these
implementations support. The Newton solver is based on KINSOL from the SUNDIALS
suite, while the Powell solver uses a module from the Eigen library.
Return value
The return value for the algebraic solver is an object of type vector, with values
which, when plugged in as y, make the algebraic function go to 0 (approximately,
within the specified function tolerance).
Sizes and parallel arrays
Certain sizes have to be consistent. The initial guess, return value of the solver, and
return value of the algebraic function must all be the same size.
Algorithmic details
Stan offers two methods to solve algebraic equations. solve_newton and
solve_newton_tol use Newton's method, a first-order derivative-based numerical
solver. The Stan code builds on the implementation in KINSOL from the SUNDIALS
suite (Hindmarsh et al. 2005). For many problems, we find that Newton's method
is faster than the Powell method. If, however, Newton's method performs poorly,
either failing to converge or taking an excessively long time to do so, the user
should be prepared to switch to the Powell method.
solve_powell and solve_powell_tol are based on the Powell hybrid method
(Powell 1970), which also uses first-order derivatives. The Stan code builds on the
implementation of the hybrid solver in the unsupported module for nonlinear
optimization problems of the Eigen library (Guennebaud, Jacob, et al. 2010). This
solver is in turn based on the algorithm developed for the package MINPACK-1
(Moré 1980).
For both solvers, derivatives are propagated through the solution to the algebraic
system using the implicit function theorem and an adjoint method of automatic
differentiation; for a discussion of this topic, see Margossian and Betancourt (2022).
Stiff solver
array[] vector ode_bdf(function ode, vector initial_state, real
initial_time, array[] real times, ...)
Solves the ODE system for the times provided using the backward differentiation
formula (BDF) method.
Available since 2.24
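For example, a hedged sketch of solving a simple exponential-decay ODE with the stiff solver; the system and variable names are illustrative only:

```stan
functions {
  vector decay(real t, vector y, real gamma) {
    return -gamma * y;  // dy/dt = -gamma * y
  }
}
data {
  int<lower=1> N;
  array[N] real times;
  vector[1] y0;
  real t0;
}
parameters {
  real<lower=0> gamma;
}
transformed parameters {
  // y_hat[i] approximates the state at times[i]
  array[N] vector[1] y_hat = ode_bdf(decay, y0, t0, times, gamma);
}
```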
Adjoint solver
array[] vector ode_adjoint_tol_ctl(function ode, vector initial_state,
real initial_time, array[] real times, data real rel_tol_forward,
data vector abs_tol_forward, data real rel_tol_backward,
data vector abs_tol_backward, data real rel_tol_quadrature,
data real abs_tol_quadrature, data int max_num_steps,
data int num_steps_between_checkpoints,
data int interpolation_polynomial, data int solver_forward,
data int solver_backward, ...)
Solves the ODE system for the times provided using the adjoint ODE solver method
from CVODES. The adjoint ODE solver requires a checkpointed forward-in-time ODE
integration, a backward-in-time integration that makes use of an interpolated version
of the forward solution, and the solution of quadrature problems (the number
of which depends on the number of parameters passed to the solve). The tolerances
and numeric methods used for the forward solve, backward solve, quadratures, and
interpolation can all be configured.
Available since 2.27
The ODE system function should return the derivative of the state with respect to
time at the time and state provided. The length of the returned vector must match
the length of the state input into the function.
The arguments to this function are:
• time, the time to evaluate the ODE system
• state, the state of the ODE system at the time specified
• ..., sequence of arguments passed unmodified from the ODE solve function
call. The types here must match the types in the ... arguments of the ODE
solve function call.
Arguments to the ODE solvers
The stiff and non-stiff ODE solvers take the same arguments. The arguments to the
adjoint ODE solver are different; see Arguments to the adjoint ODE solver.
• ode : ODE system function,
• initial_state : initial state, type vector,
The DAE residual function should return the residuals at the time and state provided.
The length of the returned vector must match the length of the state input into the
function.
The arguments to this function are:
• time, the time to evaluate the DAE system
• state, the state of the DAE system at the time specified
• state_derivative, the time derivatives of the state of the DAE system at the
time specified
• ..., sequence of arguments passed unmodified from the DAE solve function
call. The types here must match the types in the ... arguments of the DAE
solve function call.
Arguments to the DAE solver
The arguments to the DAE solver are
• residual : DAE residual function,
• initial_state : initial state, type vector,
• initial_state_derivative : time derivative of the initial state, type vector,
• initial_time : initial time, type data real,
• times : solution times, type data array[] real,
• ... : sequence of arguments that will be passed through unmodified to the DAE
residual function. The types here must match the types in the ... arguments
of the DAE residual function.
For dae_tol, the following three parameters must be provided after times and
before the ... arguments:
• data rel_tol : relative tolerance for the DAE solver, type real, data only,
• data abs_tol : absolute tolerance for the DAE solver, type real, data only,
and
• max_num_steps : maximum number of steps to take between output times in
the DAE solver, type int, data only.
Because the tolerances are data arguments, they must be supplied as primitive
numerics or defined in either the data or transformed data blocks. They cannot
be parameters.
11.4. 1D integrator
Stan provides a built-in mechanism to perform 1D integration of a function via
quadrature methods.
It operates similarly to the algebraic solver and the ordinary differential equation
solvers in that it takes a function as an argument.
Like both of those utilities, some of the arguments are limited to data only
expressions. These expressions must not contain variables other than those declared
in the data or transformed data blocks.
Specifying an integrand as a function
Performing a 1D integration requires the integrand to be specified somehow. This is
done by defining a function in the Stan functions block with the special signature:
real integrand(real x, real xc, array[] real theta,
array[] real x_r, array[] int x_i)
The function should return the value of the integrand evaluated at the point x.
The arguments to this function are:
For efficiency reasons, the reduce function doesn't work with the elementwise
evaluated function g itself, but instead works through evaluating partial sums, f:
array[] U -> real, where:
f({ x1 }) = g(x1)
f({ x1, x2 }) = g(x1) + g(x2)
f({ x1, x2, ... }) = g(x1) + g(x2) + ...
Mathematically the summation reduction is associative and forming arbitrary partial
sums in an arbitrary order will not change the result. However, floating point
numerics on computers only have a limited precision such that associativity does
not hold exactly. This implies that the order of summation determines the exact
numerical result. For this reason, the higher-order reduce function is available in
two variants:
• reduce_sum: Automatically choose partial sums partitioning based on a dy-
namic scheduling algorithm.
• reduce_sum_static: Compute the same sum as reduce_sum, but partition the
input in the same way for a given data set (in reduce_sum this partitioning might
change depending on computer load). This should result in stable numerical
evaluations.
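The role of the partial-sum function and the effect of partitioning can be sketched in plain Python (an illustration of the semantics only, with a hypothetical per-term function g; Stan's implementation is threaded C++):

```python
def g(x):
    # Hypothetical per-term contribution g(x) = 1 / x^2.
    return 1.0 / (x * x)

def f(xs):
    # Partial-sum function: the sum of g over a slice of the terms.
    return sum(g(x) for x in xs)

data = list(range(1, 1001))

# Two different partitionings into partial sums; mathematically identical,
# numerically identical only up to floating-point rounding.
total_a = f(data[:300]) + f(data[300:700]) + f(data[700:])
total_b = f(data[:500]) + f(data[500:])
```

The two totals agree to near machine precision but not necessarily bit-for-bit, which is exactly the distinction between reduce_sum and reduce_sum_static.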
Specifying the reduce-sum function
The higher-order reduce function takes a partial sum function f, an array argument
x (with one array element for each term in the sum), a recommended grainsize,
and a set of shared arguments. This representation allows parallelization of the
resultant sum.
real reduce_sum(F f, array[] T x, int grainsize, T1 s1, T2 s2, ...)
real reduce_sum_static(F f, array[] T x, int grainsize, T1 s1, T2 s2,
...)
Returns the equivalent of f(x, 1, size(x), s1, s2, ...), but computes the
result in parallel by breaking the array x into independent partial sums. s1, s2,
... are shared between all terms in the sum.
Available since 2.23
The map function returns the sequence of results for the particular shard being
evaluated. The arguments to the mapped function are:
• phi, the sequence of parameters shared across shards
• theta, the sequence of parameters specific to this shard
• x_r, sequence of real-valued data
• x_i, sequence of integer data
All input for the mapped function must be packed into these sequences and all output
from the mapped function must be packed into a single vector. The vector of output
from each mapped function is concatenated into the final result.
Rectangular map
The rectangular map function operates on rectangular (not ragged) data structures,
with parallel data structures for job-specific parameters, job-specific real data, and
job-specific integer data.
vector map_rect(F f, vector phi, array[] vector theta, data array[,]
real x_r, data array[,] int x_i)
Return the concatenation of the results of applying the function f, of
type (vector, vector, array[] real, array[] int):vector elementwise, i.e.,
f(phi, theta[n], x_r[n], x_i[n]) for each n in 1:N, where N is the size of the
parallel arrays of job-specific/local parameters theta, real data x_r, and integer
data x_i. The shared/global parameters phi are passed to each invocation of f.
Available since 2.18
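The concatenation semantics can be sketched in Python (a serial illustration with a hypothetical shard function f; Stan's map_rect may evaluate shards in parallel):

```python
def f(phi, theta_n, x_r_n, x_i_n):
    # Hypothetical per-shard function: returns a list (Stan: a vector).
    return [phi[0] * t + sum(x_r_n) + sum(x_i_n) for t in theta_n]

def map_rect(f, phi, theta, x_r, x_i):
    # Apply f to each shard and concatenate the per-shard outputs.
    out = []
    for theta_n, x_r_n, x_i_n in zip(theta, x_r, x_i):
        out.extend(f(phi, theta_n, x_r_n, x_i_n))
    return out

phi = [2.0]                        # shared/global parameters
theta = [[1.0, 2.0], [3.0, 4.0]]   # job-specific parameters
x_r = [[0.5], [0.25]]              # job-specific real data
x_i = [[1], [2]]                   # job-specific integer data
result = map_rect(f, phi, theta, x_r, x_i)
```

Each shard sees the same phi but its own row of theta, x_r, and x_i, and the outputs are concatenated in shard order.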
12. Deprecated Functions
This appendix lists currently deprecated functionality along with how to replace it.
Starting in Stan 2.29, deprecated functions with drop-in replacements (such as
the renaming of get_lp or multiply_log) will be removed 3 versions later, e.g.,
functions deprecated in Stan 2.20 will be removed in Stan 2.23 and placed in
Removed Functions. The Stan compiler can automatically update these on behalf
of the user for the entire deprecation window and at least one version following the
removal.
1/2 = 0
1.0/2.0 = 0.5
The ODE system function should return the derivative of the state with respect to
time at the time provided. The length of the returned real array must match the
length of the state input into the function.
The arguments to this function are:
• time, the time to evaluate the ODE system
• state, the state of the ODE system at the time specified
• theta, parameter values used to evaluate the ODE system
• x_r, data values used to evaluate the ODE system
• x_i, integer data values used to evaluate the ODE system.
The ODE system function separates parameter values, theta, from data values, x_r,
for efficiency in computing the gradients of the ODE.
Non-stiff solver
array[,] real integrate_ode_rk45(function ode, array[] real
initial_state, real initial_time, array[] real times, array[] real
theta, array[] real x_r, array[] int x_i)
Solves the ODE system for the times provided using the Dormand-Prince algorithm,
a 4th/5th order Runge-Kutta method.
Available since 2.10, deprecated in 2.24
Stiff solver
array[,] real integrate_ode_bdf(function ode, array[] real
initial_state, real initial_time, array[] real times, array[] real
theta, data array[] real x_r, data array[] int x_i)
Solves the ODE system for the times provided using the backward differentiation
formula (BDF) method.
Available since 2.10, deprecated in 2.24
(real, array[] real, array[] real, data array[] real, data array[] int):array[] real
The arguments represent (1) time, (2) system state, (3) parameters, (4) real data,
and (5) integer data, and the return value contains the derivatives with respect to
time of the state,
• initial_state : initial state, type array[] real,
• initial_time : initial time, type int or real,
• times : solution times, type array[] real,
• theta : parameters, type array[] real,
• data x_r : real data, type array[] real, data only, and
• data x_i : integer data, type array[] int, data only.
For more fine-grained control of the ODE solvers, these parameters can also be
provided:
• data rel_tol : relative tolerance for the ODE solver, type real, data only,
• data abs_tol : absolute tolerance for the ODE solver, type real, data only,
and
• data max_num_steps : maximum number of steps to take in the ODE solver,
type int, data only.
Return values
The return value for the ODE solvers is an array of type array[,] real, with values
consisting of solutions at the specified times.
Sizes and parallel arrays
The sizes must match, and in particular, the following groups are of the same size:
• state variables passed into the system function, derivatives returned by the
system function, initial state passed into the solver, and rows of the return
value of the solver,
• solution times and number of rows of the return value of the solver,
• parameters, real data and integer data passed to the solver will be passed to
the system function
12.4. algebra_solver, algebra_solver_newton algebraic solvers
The algebraic system function should return the value of the algebraic function
which goes to 0 when we plug in the solution to the algebraic system.
The arguments to this function are:
• y, the unknowns we wish to solve for
• theta, parameter values used to evaluate the algebraic system
• x_r, data values used to evaluate the algebraic system
• x_i, integer data used to evaluate the algebraic system
The algebraic system function separates parameter values, theta, from data values,
x_r, for efficiency in propagating the derivatives through the algebraic system.
Call to the algebraic solver
vector algebra_solver(function algebra_system, vector y_guess, vector
theta, data array[] real x_r, array[] int x_i)
Solves the algebraic system, given an initial guess, using the Powell hybrid algorithm.
Available since 2.17, deprecated in 2.31
vector algebra_solver_newton(function algebra_system, vector y_guess,
vector theta, data array[] real x_r, array[] int x_i)
Solves the algebraic system, given an initial guess, using Newton's method.
Available since 2.24, deprecated in 2.31
$$
k(x_i, x_j) = \alpha^2 \exp\left( -\frac{1}{2\rho^2} \sum_{d=1}^{D} (x_{i,d} - x_{j,d})^2 \right)
$$
For example, normal_lpdf is the log of the normal probability density function (pdf)
and bernoulli_lpmf is the log of the Bernoulli probability mass function (pmf). The
logs of the corresponding cumulative distribution functions (cdfs) use the suffix
_lcdf, as in normal_lcdf and bernoulli_lcdf.
provides the same (proportional) contribution to the model log density as the explicit
target density increment,
In both cases, the effect is to add terms to the target log density. The only difference
is that the example with the sampling (~) notation drops all additive constants
in the log density; the constants are not necessary for any of Stan’s sampling,
approximation, or optimization algorithms.
There are also log forms of the CDF and CCDF for most univariate distributions. For
example, normal_lcdf(y | mu, sigma) is defined by
$$
\log \int_{-\infty}^{y} \text{Normal}(v \mid \mu, \sigma) \, dv
$$
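Numerically, this is simply the log of the normal CDF; a Python sketch (an illustration using math.erf, not Stan code) is:

```python
import math

def normal_lcdf(y, mu, sigma):
    # log of the normal CDF: log Phi((y - mu) / sigma)
    return math.log(0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0)))))

# At y = mu the CDF is exactly 1/2, so the log CDF is log(1/2).
val = normal_lcdf(1.3, 1.3, 2.0)
```

(In practice Stan's own implementation is more numerically stable in the tails than this direct formula.)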
14.8. Vectorization
Stan’s univariate log probability functions, including the log density functions, log
mass functions, log CDFs, and log CCDFs, all support vectorized function application,
with results defined to be the sum of the elementwise application of the function.
Some of the PRNG functions support vectorization; see the section on vectorized
PRNG functions for more details.
In all cases, matrix operations are at least as fast as, and usually faster than, loops,
and vectorized log probability functions are faster than their equivalent forms defined
with loops. This isn't because loops are slow in Stan, but because more efficient automatic
differentiation can be used. The efficiency comes from the fact that a vectorized log
probability function only introduces one new node into the expression graph, thus
reducing the number of virtual function calls required to compute gradients in C++,
as well as from allowing caching of repeated computations.
Stan also overloads the multivariate normal distribution, including the Cholesky-
factor form, allowing arrays of row vectors or vectors for the variate and location
parameter. This is a huge savings in speed because the work required to solve the
linear system for the covariance matrix is only done once.
Stan also overloads some scalar functions, such as log and exp, to apply to vectors
(arrays) and return vectors (arrays). These vectorizations are defined elementwise
and unlike the probability functions, provide only minimal efficiency speedups over
repeated application and assignment in a loop.
Vectorized function signatures
Vectorized scalar arguments
The normal probability function is specified with the signature
normal_lpdf(reals | reals, reals);
The pseudotype reals is used to indicate that an argument position may be vec-
torized. Argument positions declared as reals may be filled with a real, a one-
dimensional array, a vector, or a row-vector. If there is more than one array or vector
argument, their types can differ, but their sizes must match. For instance, it is
legal to use normal_lpdf(row_vector | vector, real) as long as the vector and
row vector have the same size.
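The semantics of such a vectorized call (summing elementwise applications, with scalar arguments recycled across elements) can be sketched in Python (illustration only, not Stan code):

```python
import math

def normal_lpdf_scalar(y, mu, sigma):
    return (-0.5 * math.log(2.0 * math.pi) - math.log(sigma)
            - 0.5 * ((y - mu) / sigma) ** 2)

def normal_lpdf(ys, mus, sigma):
    # Vectorized semantics: the result is the sum of elementwise
    # applications; the scalar sigma is recycled ("broadcast") across
    # every element of the container arguments.
    return sum(normal_lpdf_scalar(y, mu, sigma) for y, mu in zip(ys, mus))

ys = [0.1, -0.4, 1.2]
mus = [0.0, 0.0, 1.0]
total = normal_lpdf(ys, mus, 2.0)
```

The result equals the sum obtained by a hand-written loop over the elements; in Stan the vectorized form is faster because of how the expression graph is built, not because of the loop itself.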
Vectorized vector and row vector arguments
The multivariate normal distribution accepting vector or array of vector arguments
is written as
multi_normal_lpdf(vectors | vectors, matrix);
These arguments may be row vectors, column vectors, or arrays of row vectors or
column vectors.
Vectorized integer arguments
The pseudotype ints is used for vectorized integer arguments. Where it appears
either an integer or array of integers may be used.
Argument types
In the case of PRNG functions, arguments marked ints may be integers or integer
arrays, whereas arguments marked reals may be integers or reals, integer or real
arrays, vectors, or row vectors.
Dimension matching
In general, if there are multiple non-scalar arguments, they must all have the same
dimensions, but need not have the same type. For example, the normal_rng function
may be called with one vector argument and one real array argument as long as they
have the same number of elements.
vector[3] mu = // ...
array[3] real sigma = // ...
array[3] real x = normal_rng(mu, sigma);
Return type
The result of a vectorized PRNG function depends on the size of the arguments and
the distribution’s support. If all arguments are scalars, then the return type is a scalar.
For a continuous distribution, if there are any non-scalar arguments, the return type
is a real array (array[] real) matching the size of any of the non-scalar arguments,
as all non-scalar arguments must have matching size. Discrete distributions return
ints and continuous distributions return reals, each of appropriate size. The
symbol R denotes such a return type.
Discrete Distributions
15. Binary Distributions
Binary probability distributions have support on {0, 1}, where 1 represents the value
true and 0 the value false.
Sampling statement
y ~ bernoulli(theta)
Increment target log probability density with bernoulli_lupmf(y | theta).
Available since 2.0
Stan Functions
real bernoulli_lpmf(ints y | reals theta)
The log Bernoulli probability mass of y given chance of success theta
Available since 2.12
R bernoulli_rng(reals theta)
Generate a Bernoulli variate with chance of success theta; may only be used in
transformed data and generated quantities blocks. For a description of argument
and return types, see section vectorized PRNG functions.
Available since 2.18
$$
\text{BernoulliLogit}(y \mid \alpha) = \text{Bernoulli}(y \mid \operatorname{logit}^{-1}(\alpha)) =
\begin{cases}
\operatorname{logit}^{-1}(\alpha) & \text{if } y = 1 \text{, and} \\
1 - \operatorname{logit}^{-1}(\alpha) & \text{if } y = 0.
\end{cases}
$$
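The identity between the logit parameterization and the ordinary Bernoulli pmf can be sketched in Python (illustration only, not Stan code):

```python
import math

def inv_logit(a):
    return 1.0 / (1.0 + math.exp(-a))

def bernoulli_lpmf(y, theta):
    return math.log(theta if y == 1 else 1.0 - theta)

def bernoulli_logit_lpmf(y, alpha):
    # log Bernoulli(y | logit^{-1}(alpha))
    return bernoulli_lpmf(y, inv_logit(alpha))

alpha = 0.7
# The two parameterizations agree exactly.
diff = bernoulli_logit_lpmf(1, alpha) - bernoulli_lpmf(1, inv_logit(alpha))
```

Stan's built-in bernoulli_logit is preferred over composing inv_logit with bernoulli because it avoids the numerical instability of forming the probability explicitly.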
Sampling statement
y ~ bernoulli_logit(alpha)
Increment target log probability density with bernoulli_logit_lupmf(y |
alpha).
Available since 2.0
Stan Functions
real bernoulli_logit_lpmf(ints y | reals alpha)
The log Bernoulli probability mass of y given chance of success inv_logit(alpha)
Available since 2.12
R bernoulli_logit_rng(reals alpha)
Generate a Bernoulli variate with chance of success logit−1 (α); may only be used in
transformed data and generated quantities blocks.
15.3. Bernoulli-logit generalized linear model (logistic regression)
Sampling statement
y ~ bernoulli_logit_glm(x, alpha, beta)
Increment target log probability density with bernoulli_logit_glm_lupmf(y | x,
alpha, beta).
Available since 2.25
Stan Functions
real bernoulli_logit_glm_lpmf(int y | matrix x, real alpha, vector
beta)
The log Bernoulli probability mass of y given chance of success inv_logit(alpha +
x * beta).
Available since 2.23
beta)
The log Bernoulli probability mass of y given chance of success inv_logit(alpha +
x * beta).
Available since 2.23
vector beta)
The log Bernoulli probability mass of y given chance of success inv_logit(alpha +
x * beta) dropping constant additive terms.
Available since 2.25
$$
\frac{\partial}{\partial \theta} \log \text{Binomial}(n \mid N, \theta) = \frac{n}{\theta} - \frac{N - n}{1 - \theta}
$$
Sampling statement
n ~ binomial(N, theta)
Increment target log probability density with binomial_lupmf(n | N, theta).
Available since 2.0
Stan functions
real binomial_lpmf(ints n | ints N, reals theta)
The log binomial probability mass of n successes in N trials given chance of success
theta
Available since 2.12
16.2. Binomial distribution, logit parameterization
$$
\frac{\partial}{\partial \alpha} \log \text{BinomialLogit}(n \mid N, \alpha) = n \operatorname{logit}^{-1}(-\alpha) - (N - n) \operatorname{logit}^{-1}(\alpha)
$$
Sampling statement
n ~ binomial_logit(N, alpha)
Increment target log probability density with binomial_logit_lupmf(n | N,
alpha).
Available since 2.0
Stan functions
real binomial_logit_lpmf(ints n | ints N, reals alpha)
The log binomial probability mass of n successes in N trials given logit-scaled chance
of success alpha
Available since 2.12
$$
\mathrm{B}(u, v) = \frac{\Gamma(u) \, \Gamma(v)}{\Gamma(u + v)}.
$$
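The beta function itself is easy to evaluate through log-gamma functions; a Python sketch (illustration, not Stan code):

```python
import math

def beta_fn(u, v):
    # B(u, v) = Gamma(u) Gamma(v) / Gamma(u + v), computed on the log
    # scale for numerical stability.
    return math.exp(math.lgamma(u) + math.lgamma(v) - math.lgamma(u + v))

b11 = beta_fn(1.0, 1.0)  # B(1, 1) = 1
b23 = beta_fn(2.0, 3.0)  # B(2, 3) = Gamma(2)Gamma(3)/Gamma(5) = 2/24 = 1/12
```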
Sampling statement
n ~ beta_binomial(N, alpha, beta)
Increment target log probability density with beta_binomial_lupmf(n | N,
alpha, beta).
Available since 2.0
Stan functions
real beta_binomial_lpmf(ints n | ints N, reals alpha, reals beta)
The log beta-binomial probability mass of n successes in N trials given prior success
count (plus one) of alpha and prior failure count (plus one) of beta
Available since 2.12
Sampling statement
n ~ hypergeometric(N, a, b)
Increment target log probability density with hypergeometric_lupmf(n | N, a,
b).
Available since 2.0
Stan functions
real hypergeometric_lpmf(int n | int N, int a, int b)
The log hypergeometric probability mass of n successes in N trials given total success
count of a and total failure count of b
Available since 2.12
See the definition of softmax for the definition of the softmax function.
Sampling statement
y ~ categorical(theta)
Increment target log probability density with categorical_lupmf(y | theta)
dropping constant additive terms.
Available since 2.0
Sampling statement
y ~ categorical_logit(beta)
Increment target log probability density with categorical_logit_lupmf(y |
beta).
Available since 2.4
Stan functions
All of the categorical distributions are vectorized so that the outcome y can be a
single integer (type int) or an array of integers (type array[] int).
real categorical_lpmf(ints y | vector theta)
The log categorical probability mass function with outcome(s) y in 1 : N given N -
vector of outcome probabilities theta. The parameter theta must have non-negative
entries that sum to one, but it need not be a variable declared as a simplex.
Available since 2.12
See the definition of softmax for the definition of the softmax function.
Sampling statement
y ~ categorical_logit_glm(x, alpha, beta)
Increment target log probability density with categorical_logit_glm_lupmf(y |
x, alpha, beta).
Available since 2.23
Stan functions
real categorical_logit_glm_lpmf(int y | row_vector x, vector alpha,
matrix beta)
The log categorical probability mass function with outcome y in 1 : N given N -vector
of log-odds of outcomes alpha + x * beta.
Available since 2.23
Sampling statement
y ~ discrete_range(l, u)
Increment the target log probability density with discrete_range_lupmf(y | l,
u) dropping constant additive terms.
Available since 2.26
Stan functions
All of the discrete range distributions are vectorized so that the outcome y and the
bounds l, u can be a single integer (type int) or an array of integers (type array[]
int).
real discrete_range_lpmf(ints y | ints l, ints u)
The log probability mass function with outcome(s) y in l : u.
Available since 2.26
The k = K case is written with the redundant subtraction of zero to illustrate the
parallelism of the cases; the k = 1 and k = K edge cases can be subsumed into the
general definition by setting c0 = −∞ and cK = +∞ with logit−1 (−∞) = 0 and
logit−1 (∞) = 1.
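The case structure can be sketched in Python (illustration only, not Stan code); note that the probabilities over k = 1, …, K telescope and sum to one:

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def ordered_logistic_pmf(k, eta, c):
    # c holds the K - 1 ordered cutpoints c[0] < ... < c[K-2]; the
    # k = 1 and k = K cases correspond to c_0 = -inf and c_K = +inf.
    K = len(c) + 1
    if k == 1:
        return 1.0 - inv_logit(eta - c[0])
    if k == K:
        return inv_logit(eta - c[K - 2]) - 0.0
    return inv_logit(eta - c[k - 2]) - inv_logit(eta - c[k - 1])

eta = 0.4
c = [-1.0, 0.5, 2.0]  # K = 4 outcome categories
probs = [ordered_logistic_pmf(k, eta, c) for k in range(1, 5)]
total = sum(probs)
```

Because adjacent cases share the same inverse-logit terms, the sum over all K outcomes collapses to 1 for any eta and any ordered cutpoint vector.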
Sampling statement
k ~ ordered_logistic(eta, c)
Increment target log probability density with ordered_logistic_lupmf(k | eta,
c).
Available since 2.0
Stan functions
real ordered_logistic_lpmf(ints k | vector eta, vectors c)
The log ordered logistic probability mass of k given linear predictors eta, and
cutpoints c.
Available since 2.18
real ordered_logistic_lupmf(ints k | vector eta, vectors c)
The log ordered logistic probability mass of k given linear predictors eta, and
cutpoints c dropping constant additive terms.
Available since 2.25
The k = K case is written with the redundant subtraction of zero to illustrate the
parallelism of the cases; the y = 1 and y = K edge cases can be subsumed into the
general definition by setting c0 = −∞ and cK = +∞ with logit−1 (−∞) = 0 and
logit−1 (∞) = 1.
Sampling statement
y ~ ordered_logistic_glm(x, beta, c)
Increment target log probability density with ordered_logistic_lupmf(y | x,
beta, c).
Available since 2.23
Stan functions
real ordered_logistic_glm_lpmf(int y | row_vector x, vector beta,
vector c)
The log ordered logistic probability mass of y, given linear predictors x * beta, and
cutpoints c. The cutpoints c must be ordered.
Available since 2.23
vector c)
The log ordered logistic probability mass of y, given linear predictors x * beta, and
cutpoints c dropping constant additive terms. The cutpoints c must be ordered.
Available since 2.25
$$
\text{OrderedProbit}(k \mid \eta, c) =
\begin{cases}
1 - \Phi(\eta - c_1) & \text{if } k = 1, \\
\Phi(\eta - c_{k-1}) - \Phi(\eta - c_k) & \text{if } 1 < k < K, \text{ and} \\
\Phi(\eta - c_{K-1}) - 0 & \text{if } k = K.
\end{cases}
$$
The k = K case is written with the redundant subtraction of zero to illustrate the
parallelism of the cases; the k = 1 and k = K edge cases can be subsumed into
the general definition by setting c0 = −∞ and cK = +∞ with Φ(−∞) = 0 and
Φ(∞) = 1.
Sampling statement
k ~ ordered_probit(eta, c)
Increment target log probability density with ordered_probit_lupmf(k | eta,
c).
Available since 2.19
Stan functions
real ordered_probit_lpmf(ints k | vector eta, vectors c)
The log ordered probit probability mass of k given linear predictors eta, and cutpoints
c.
Available since 2.18
Sampling statement
n ~ neg_binomial(alpha, beta)
Increment target log probability density with neg_binomial_lupmf(n | alpha,
beta).
Available since 2.0
Stan functions
real neg_binomial_lpmf(ints n | reals alpha, reals beta)
The log negative binomial probability mass of n given shape alpha and inverse scale
beta
Available since 2.12
17.2. Negative binomial distribution (alternative parameterization)
$$
\mathbb{E}[n] = \mu \quad \text{and} \quad \operatorname{Var}[n] = \mu + \frac{\mu^2}{\phi}.
$$
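These moments can be checked numerically against the probability mass function; the following Python sketch (illustration only; it uses the standard NegBinomial2 pmf with Γ(n + φ)/(n! Γ(φ)) normalization) approximates E[n] and Var[n] by truncated summation:

```python
import math

def neg_binomial_2_lpmf(n, mu, phi):
    # log NegBinomial2(n | mu, phi)
    #   = log[Gamma(n + phi) / (n! Gamma(phi))]
    #     + n * log(mu / (mu + phi)) + phi * log(phi / (mu + phi))
    return (math.lgamma(n + phi) - math.lgamma(n + 1) - math.lgamma(phi)
            + n * (math.log(mu) - math.log(mu + phi))
            + phi * (math.log(phi) - math.log(mu + phi)))

mu, phi = 3.0, 2.0
# Truncated sums over n approximate the exact moments.
ps = [math.exp(neg_binomial_2_lpmf(n, mu, phi)) for n in range(2000)]
mean = sum(n * p for n, p in enumerate(ps))
var = sum((n - mean) ** 2 * p for n, p in enumerate(ps))
# mean is approximately mu = 3; var approximately mu + mu^2 / phi = 7.5
```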
Stan functions
real neg_binomial_2_lpmf(ints n | reals mu, reals phi)
The log negative binomial probability mass of n given location mu and precision phi.
Available since 2.20
This alternative may be used for sampling, as a function, and for random number
generation, but as of yet, there are no CDFs implemented for it. This is especially
useful for log-linear negative binomial regressions.
Sampling statement
n ~ neg_binomial_2_log(eta, phi)
Increment target log probability density with neg_binomial_2_log_lupmf(n |
eta, phi).
Available since 2.3
Stan functions
real neg_binomial_2_log_lpmf(ints n | reals eta, reals phi)
The log negative binomial probability mass of n given log-location eta and inverse
overdispersion parameter phi.
Available since 2.20
Sampling statement
y ~ neg_binomial_2_log_glm(x, alpha, beta, phi)
Increment target log probability density with neg_binomial_2_log_glm_lupmf(y
| x, alpha, beta, phi).
Available since 2.19
Stan functions
real neg_binomial_2_log_glm_lpmf(int y | matrix x, real alpha, vector
beta, real phi)
The log negative binomial probability mass of y given log-location alpha + x *
beta and inverse overdispersion parameter phi.
Available since 2.23
Sampling statement
n ~ poisson(lambda)
Increment target log probability density with poisson_lupmf(n | lambda).
Available since 2.0
Stan functions
real poisson_lpmf(ints n | reals lambda)
The log Poisson probability mass of n given rate lambda
Available since 2.12
R poisson_rng(reals lambda)
Generate a Poisson variate with rate lambda; may only be used in transformed data
and generated quantities blocks. lambda must be less than 2^30. For a description of
argument and return types, see section vectorized function signatures.
Available since 2.18
Sampling statement
n ~ poisson_log(alpha)
Increment target log probability density with poisson_log_lupmf(n | alpha).
Available since 2.0
Stan functions
real poisson_log_lpmf(ints n | reals alpha)
The log Poisson probability mass of n given log rate alpha
Available since 2.12
R poisson_log_rng(reals alpha)
Generate a Poisson variate with log rate alpha; may only be used in transformed data
and generated quantities blocks. alpha must be less than 30 log 2. For a description
of argument and return types, see section vectorized function signatures.
Available since 2.18
Sampling statement
y ~ poisson_log_glm(x, alpha, beta)
Increment target log probability density with poisson_log_glm_lupmf(y | x,
alpha, beta).
Available since 2.19
Stan functions
real poisson_log_glm_lpmf(int y | matrix x, real alpha, vector beta)
The log Poisson probability mass of y given the log-rate alpha + x * beta.
Available since 2.23
The log Poisson probability mass of y given the log-rate alpha + x * beta.
Available since 2.23
Sampling statement
y ~ multinomial(theta)
Increment target log probability density with multinomial_lupmf(y | theta).
Available since 2.0
Stan functions
real multinomial_lpmf(array[] int y | vector theta)
The log multinomial probability mass function with outcome array y of size K given
the K-simplex distribution parameter theta and (implicit) total count N = sum(y)
Available since 2.12
count N; may only be used in transformed data and generated quantities blocks
Available since 2.8
18.2. Multinomial distribution, logit parameterization
$$
\text{MultinomialLogit}(y \mid \gamma) = \text{Multinomial}(y \mid \operatorname{softmax}(\gamma)) = \binom{N}{y_1, \ldots, y_K} \prod_{k=1}^{K} \left[ \operatorname{softmax}(\gamma)_k \right]^{y_k},
$$
Sampling statement
y ~ multinomial_logit(gamma)
Increment target log probability density with multinomial_logit_lupmf(y |
gamma).
Available since 2.24
Stan functions
real multinomial_logit_lpmf(array[] int y | vector gamma)
The log multinomial probability mass function with outcome array y of size K given
the log K-simplex distribution parameter γ and (implicit) total count N = sum(y)
Available since 2.24
softmax(gamma) and total count N; may only be used in transformed data and
generated quantities blocks.
Available since 2.24
Continuous Distributions
19. Unbounded Continuous Distributions
The unbounded univariate continuous probability distributions have support on all
real numbers.
Sampling statement
y ~ normal(mu, sigma)
Increment target log probability density with normal_lupdf(y | mu, sigma).
Available since 2.0
Stan functions
real normal_lpdf(reals y | reals mu, reals sigma)
The log of the normal density of y given location mu and scale sigma
Available since 2.12
$$
\log \text{Normal}(y \mid 0, 1) = -\frac{y^2}{2} + \text{const}.
$$
With no logarithm, no subtraction, and no division by a parameter, the standard
normal log density is much more efficient to compute than the normal log density
with constant location 0 and scale 1.
Sampling statement
y ~ std_normal()
Increment target log probability density with std_normal_lupdf(y).
Available since 2.19
Stan functions
real std_normal_lpdf(reals y)
The standard normal (location zero, scale one) log probability density of y.
Available since 2.18
real std_normal_lupdf(reals y)
The standard normal (location zero, scale one) log probability density of y dropping
constant additive terms.
Available since 2.25
real std_normal_cdf(reals y)
The cumulative standard normal distribution of y; std_normal_cdf will underflow to
0 for y below -37.5 and overflow to 1 for y above 8.25; the function Phi_approx is
more robust in the tails.
Available since 2.21
real std_normal_lcdf(reals y)
The log of the cumulative standard normal distribution of y; std_normal_lcdf
will underflow to −∞ for y below -37.5 and overflow to 0 for y above 8.25;
log(Phi_approx(...)) is more robust in the tails.
Available since 2.21
real std_normal_lccdf(reals y)
The log of the complementary cumulative standard normal distribution of y;
std_normal_lccdf will overflow to 0 for y below -37.5 and underflow to −∞ for y
above 8.25; log1m(Phi_approx(...)) is more robust in the tails.
Available since 2.21
R std_normal_qf(T x)
Returns the value of the inverse standard normal cdf Φ−1 at the specified quantile x.
The std_normal_qf is equivalent to the inv_Phi function.
Available since 2.31
R std_normal_log_qf(T x)
Return the value of the inverse standard normal cdf Φ−1 evaluated at the log of the
specified quantile x. This function is equivalent to std_normal_qf(exp(x)) but is
more numerically stable.
Available since 2.31
real std_normal_rng()
Generate a normal variate with location zero and scale one; may only be used in
transformed data and generated quantities blocks.
19.2. Normal-id generalized linear model (linear regression)
Sampling statement
y ~ normal_id_glm(x, alpha, beta, sigma)
Increment target log probability density with normal_id_glm_lupdf(y | x,
alpha, beta, sigma).
Available since 2.19
Stan functions
real normal_id_glm_lpdf(real y | matrix x, real alpha, vector beta,
real sigma)
The log normal probability density of y given location alpha + x * beta and scale
sigma.
Available since 2.29
The log normal probability density of y given location alpha + x * beta and scale
sigma dropping constant additive terms.
Available since 2.29
The log normal probability density of y given location alpha + x * beta and scale
sigma.
Available since 2.29
The log normal probability density of y given location alpha + x * beta and scale
sigma dropping constant additive terms.
Available since 2.30
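Up to vectorization and evaluation speed, the GLM density above is an ordinary normal log density evaluated at the linear predictor alpha + x * beta. A hedged Python sketch of that equivalence (illustrative helper names, not Stan code):

```python
import math

def normal_lpdf(y, mu, sigma):
    # log N(y | mu, sigma)
    return (-0.5 * math.log(2.0 * math.pi) - math.log(sigma)
            - 0.5 * ((y - mu) / sigma) ** 2)

def normal_id_glm_lpdf(y, x, alpha, beta, sigma):
    # sum_i log N(y_i | alpha + x_i . beta, sigma): the quantity Stan
    # fuses into a single call for efficiency
    total = 0.0
    for yi, xi in zip(y, x):
        mu = alpha + sum(xij * bj for xij, bj in zip(xi, beta))
        total += normal_lpdf(yi, mu, sigma)
    return total

lp = normal_id_glm_lpdf([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]],
                        0.5, [0.25, 1.5], 2.0)
```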
\[
\text{ExpModNormal}(y \mid \mu, \sigma, \lambda) = \frac{\lambda}{2} \exp\left(\frac{\lambda}{2}\left(2\mu + \lambda\sigma^2 - 2y\right)\right) \operatorname{erfc}\left(\frac{\mu + \lambda\sigma^2 - y}{\sqrt{2}\,\sigma}\right).
\]
Sampling statement
y ~ exp_mod_normal(mu, sigma, lambda)
Increment target log probability density with exp_mod_normal_lupdf(y | mu,
sigma, lambda).
Available since 2.0
Stan functions
real exp_mod_normal_lpdf(reals y | reals mu, reals sigma, reals
lambda)
The log of the exponentially modified normal density of y given location mu, scale
sigma, and shape lambda
Available since 2.18
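The exponentially modified normal density can be sanity-checked numerically. The sketch below (plain Python, not Stan's implementation) codes the erfc form of the density and confirms it normalizes to 1 for one parameter setting:

```python
import math

def exp_mod_normal_pdf(y, mu, sigma, lam):
    # (lambda/2) exp((lambda/2)(2 mu + lambda sigma^2 - 2 y))
    #   * erfc((mu + lambda sigma^2 - y) / (sqrt(2) sigma))
    z = (mu + lam * sigma ** 2 - y) / (math.sqrt(2.0) * sigma)
    return (0.5 * lam
            * math.exp(0.5 * lam * (2.0 * mu + lam * sigma ** 2 - 2.0 * y))
            * math.erfc(z))

# simple quadrature check that the density normalizes to 1
# for mu = 0, sigma = 1, lambda = 1
lo, hi, n = -10.0, 30.0, 40000
h = (hi - lo) / n
total = sum(exp_mod_normal_pdf(lo + i * h, 0.0, 1.0, 1.0)
            for i in range(n + 1)) * h
```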
Sampling statement
y ~ skew_normal(xi, omega, alpha)
Increment target log probability density with skew_normal_lupdf(y | xi, omega,
alpha).
Available since 2.0
CHAPTER 19. UNBOUNDED CONTINUOUS DISTRIBUTIONS
Stan functions
real skew_normal_lpdf(reals y | reals xi, reals omega, reals alpha)
The log of the skew normal density of y given location xi, scale omega, and shape
alpha
Available since 2.16
Sampling statement
y ~ student_t(nu, mu, sigma)
Increment target log probability density with student_t_lupdf(y | nu, mu,
sigma).
Available since 2.0
Stan functions
real student_t_lpdf(reals y | reals nu, reals mu, reals sigma)
The log of the Student-t density of y given degrees of freedom nu, location mu, and
scale sigma
Available since 2.12
Sampling statement
y ~ cauchy(mu, sigma)
Increment target log probability density with cauchy_lupdf(y | mu, sigma).
Available since 2.0
Stan functions
real cauchy_lpdf(reals y | reals mu, reals sigma)
The log of the Cauchy density of y given location mu and scale sigma
Available since 2.12
Note that the double exponential distribution is parameterized in terms of the scale,
in contrast to the exponential distribution (see section exponential distribution),
which is parameterized in terms of inverse scale.
The double-exponential distribution can be defined as a compound exponential-normal distribution (Ding and Blitzstein 2018). Using the inverse scale parameterization for the exponential distribution, and the standard deviation parameterization for the normal distribution, one can write
\[
\alpha \sim \text{Exponential}\left(\frac{1}{2\sigma^2}\right)
\]
and
\[
\beta \mid \alpha \sim \text{Normal}(\mu, \sqrt{\alpha}),
\]
then
\[
\beta \sim \text{DoubleExponential}(\mu, \sigma).
\]
This may be used to code a non-centered parameterization by taking
\[
\beta^{\text{raw}} \sim \text{Normal}(0, 1)
\]
and defining
\[
\beta = \mu + \sqrt{\alpha}\, \beta^{\text{raw}}.
\]
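The compound representation above is easy to check by simulation. In this hedged Python sketch (standard library only), α is drawn with inverse scale 1/(2σ²) via the exponential rate parameter, and the sample mean and variance are compared with the double exponential's mean µ and variance 2σ²:

```python
import random

random.seed(42)

mu, sigma = 1.5, 2.0
n = 100_000
draws = []
for _ in range(n):
    # expovariate takes the rate, i.e. the inverse scale 1 / (2 sigma^2)
    alpha = random.expovariate(1.0 / (2.0 * sigma ** 2))
    # beta | alpha ~ Normal(mu, sqrt(alpha))
    draws.append(random.gauss(mu, alpha ** 0.5))

mean = sum(draws) / n
var = sum((d - mean) ** 2 for d in draws) / (n - 1)
# DoubleExponential(mu, sigma) has mean mu = 1.5 and variance
# 2 sigma^2 = 8, which the sample moments should approximate
```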
Sampling statement
y ~ double_exponential(mu, sigma)
Increment target log probability density with double_exponential_lupdf(y | mu,
sigma).
Available since 2.0
Stan functions
real double_exponential_lpdf(reals y | reals mu, reals sigma)
The log of the double exponential density of y given location mu and scale sigma
Available since 2.12
Sampling statement
y ~ logistic(mu, sigma)
Stan functions
real logistic_lpdf(reals y | reals mu, reals sigma)
The log of the logistic density of y given location mu and scale sigma
Available since 2.12
Sampling statement
y ~ gumbel(mu, beta)
Stan functions
real gumbel_lpdf(reals y | reals mu, reals beta)
The log of the Gumbel density of y given location mu and scale beta
Available since 2.12
19.10. Skew Double Exponential Distribution
\[
\text{SkewDoubleExponential}(y \mid \mu, \sigma, \tau) = \frac{2\tau(1-\tau)}{\sigma} \exp\left(-\frac{2}{\sigma}\left[(1-\tau)\, I(y<\mu)(\mu-y) + \tau\, I(y>\mu)(y-\mu)\right]\right)
\]
Sampling statement
y ~ skew_double_exponential(mu, sigma, tau)
Increment target log probability density with skew_double_exponential_lupdf(y | mu,
sigma, tau).
Available since 2.28
Stan functions
real skew_double_exponential_lpdf(reals y | reals mu, reals sigma,
reals tau)
The log of the skew double exponential density of y given location mu, scale sigma
and skewness tau
Available since 2.28
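One quick consistency check of the density above: at τ = 0.5 the skew double exponential reduces to DoubleExponential(µ, σ). An illustrative Python sketch (not Stan's implementation):

```python
import math

def skew_double_exponential_pdf(y, mu, sigma, tau):
    # (2 tau (1 - tau) / sigma)
    #   * exp(-(2/sigma) [(1-tau) I(y<mu)(mu-y) + tau I(y>mu)(y-mu)])
    kernel = (1.0 - tau) * max(mu - y, 0.0) + tau * max(y - mu, 0.0)
    return (2.0 * tau * (1.0 - tau) / sigma) * math.exp(-2.0 * kernel / sigma)

def double_exponential_pdf(y, mu, sigma):
    # 1/(2 sigma) exp(-|y - mu| / sigma)
    return math.exp(-abs(y - mu) / sigma) / (2.0 * sigma)

pts = (-3.0, 0.0, 1.0, 4.0)
vals = [skew_double_exponential_pdf(y, 1.0, 2.0, 0.5) for y in pts]
refs = [double_exponential_pdf(y, 1.0, 2.0) for y in pts]
```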
R skew_double_exponential_rng(reals mu, reals sigma, reals tau)
Generate a skew double exponential variate with location mu, scale sigma, and skewness tau; may only be used in transformed data and generated quantities blocks. For a description of argument and return types, see section vectorized PRNG functions.
Available since 2.28
20. Positive Continuous Distributions
The positive continuous probability functions have support on the positive real
numbers.
Sampling statement
y ~ lognormal(mu, sigma)
Increment target log probability density with lognormal_lupdf(y | mu, sigma).
Available since 2.0
Stan functions
real lognormal_lpdf(reals y | reals mu, reals sigma)
The log of the lognormal density of y given location mu and scale sigma
Available since 2.12
\[
\text{ChiSquare}(y \mid \nu) = \frac{2^{-\nu/2}}{\Gamma(\nu/2)}\, y^{\nu/2 - 1} \exp\left(-\frac{1}{2}\, y\right).
\]
Sampling statement
y ~ chi_square(nu)
Increment target log probability density with chi_square_lupdf(y | nu).
Available since 2.0
Stan functions
real chi_square_lpdf(reals y | reals nu)
The log of the Chi-square density of y given degrees of freedom nu
Available since 2.12
real chi_square_cdf(reals y, reals nu)
The Chi-square cumulative distribution function of y given degrees of freedom nu
Available since 2.12
R chi_square_rng(reals nu)
Generate a Chi-square variate with degrees of freedom nu; may only be used in
transformed data and generated quantities blocks. For a description of argument
and return types, see section vectorized PRNG functions.
Available since 2.18
\[
\text{InvChiSquare}(y \mid \nu) = \frac{2^{-\nu/2}}{\Gamma(\nu/2)}\, y^{-\nu/2 - 1} \exp\left(-\frac{1}{2}\, \frac{1}{y}\right).
\]
Sampling statement
y ~ inv_chi_square(nu)
Increment target log probability density with inv_chi_square_lupdf(y | nu).
Available since 2.0
Stan functions
real inv_chi_square_lpdf(reals y | reals nu)
The log of the inverse Chi-square density of y given degrees of freedom nu
Available since 2.12
R inv_chi_square_rng(reals nu)
Generate an inverse Chi-squared variate with degrees of freedom nu; may only
be used in transformed data and generated quantities blocks. For a description of
argument and return types, see section vectorized PRNG functions.
Available since 2.18
\[
\text{ScaledInvChiSquare}(y \mid \nu, \sigma) = \frac{(\nu/2)^{\nu/2}}{\Gamma(\nu/2)}\, \sigma^{\nu}\, y^{-(\nu/2 + 1)} \exp\left(-\frac{1}{2}\, \nu\, \sigma^2\, \frac{1}{y}\right).
\]
Sampling statement
y ~ scaled_inv_chi_square(nu, sigma)
Increment target log probability density with scaled_inv_chi_square_lupdf(y |
nu, sigma).
Available since 2.0
Stan functions
real scaled_inv_chi_square_lpdf(reals y | reals nu, reals sigma)
The log of the scaled inverse Chi-square density of y given degrees of freedom nu
and scale sigma
Available since 2.12
Sampling statement
y ~ exponential(beta)
Increment target log probability density with exponential_lupdf(y | beta).
Available since 2.0
Stan functions
real exponential_lpdf(reals y | reals beta)
The log of the exponential density of y given inverse scale beta
Available since 2.12
real exponential_lupdf(reals y | reals beta)
The log of the exponential density of y given inverse scale beta dropping constant additive terms
Available since 2.25
R exponential_rng(reals beta)
Generate an exponential variate with inverse scale beta; may only be used in
transformed data and generated quantities blocks. For a description of argument
and return types, see section vectorized PRNG functions.
Available since 2.18
Sampling statement
y ~ gamma(alpha, beta)
Increment target log probability density with gamma_lupdf(y | alpha, beta).
Available since 2.0
Stan functions
real gamma_lpdf(reals y | reals alpha, reals beta)
The log of the gamma density of y given shape alpha and inverse scale beta
Available since 2.12
20.7. Inverse Gamma Distribution
\[
\text{InvGamma}(y \mid \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, y^{-(\alpha+1)} \exp\left(-\beta\, \frac{1}{y}\right).
\]
Sampling statement
y ~ inv_gamma(alpha, beta)
Increment target log probability density with inv_gamma_lupdf(y | alpha,
beta).
Available since 2.0
Stan functions
real inv_gamma_lpdf(reals y | reals alpha, reals beta)
The log of the inverse gamma density of y given shape alpha and scale beta
Available since 2.12
Stan functions
real weibull_lpdf(reals y | reals alpha, reals sigma)
The log of the Weibull density of y given shape alpha and scale sigma
Available since 2.12
Sampling statement
y ~ frechet(alpha, sigma)
Increment target log probability density with frechet_lupdf(y | alpha, sigma).
Available since 2.5
Stan functions
real frechet_lpdf(reals y | reals alpha, reals sigma)
The log of the Frechet density of y given shape alpha and scale sigma
Available since 2.12
Sampling statement
y ~ rayleigh(sigma)
Increment target log probability density with rayleigh_lupdf(y | sigma).
Available since 2.0
Stan functions
real rayleigh_lpdf(reals y | reals sigma)
The log of the Rayleigh density of y given scale sigma
Available since 2.12
R rayleigh_rng(reals sigma)
Generate a Rayleigh variate with scale sigma; may only be used in generated quantities block. For a description of argument and return types, see section vectorized PRNG functions.
Available since 2.18
Sampling statement
y ~ loglogistic(alpha, beta)
Increment target log probability density with unnormalized version of
loglogistic_lpdf(y | alpha, beta)
Available since 2.29
Stan functions
real loglogistic_lpdf(reals y | reals alpha, reals beta)
The log of the log-logistic density of y given scale alpha and shape beta
Available since 2.29
Sampling statement
y ~ pareto(y_min, alpha)
Increment target log probability density with pareto_lupdf(y | y_min, alpha).
Available since 2.0
Stan functions
real pareto_lpdf(reals y | reals y_min, reals alpha)
The log of the Pareto density of y given positive minimum value y_min and shape
alpha
Available since 2.12
CHAPTER 21. POSITIVE LOWER-BOUNDED DISTRIBUTIONS
Stan functions
real pareto_type_2_lpdf(reals y | reals mu, reals lambda, reals
alpha)
The log of the Pareto Type 2 density of y given location mu, scale lambda, and shape
alpha
Available since 2.18
real pareto_type_2_cdf(reals y, reals mu, reals lambda, reals alpha)
The Pareto Type 2 cumulative distribution function of y given location mu, scale lambda, and shape alpha
Available since 2.5
where ϕ(x) denotes the standard normal density function; see Feller (1968) and Navarro and Fuss (2009).
Sampling statement
y ~ wiener(alpha, tau, beta, delta)
Increment target log probability density with wiener_lupdf(y | alpha, tau,
beta, delta).
Available since 2.7
Stan functions
real wiener_lpdf(reals y | reals alpha, reals tau, reals beta, reals
delta)
The log of the Wiener first passage time density of y given boundary separation
alpha, non-decision time tau, a-priori bias beta and drift rate delta
Available since 2.18
Boundaries
Stan returns the first passage time of the accumulation process over the upper
boundary only. To get the result for the lower boundary, use
wiener(y|α, τ, 1 − β, −δ)
For more details, see the appendix of Vandekerckhove and Wabersich (2014).
22. Continuous Distributions on [0, 1]
The continuous distributions with outcomes in the interval [0, 1] are used to characterize bounded quantities, including probabilities.
Stan functions
real beta_lpdf(reals theta | reals alpha, reals beta)
The log of the beta density of theta in [0, 1] given positive prior successes (plus one)
alpha and prior failures (plus one) beta
Available since 2.12
Stan functions
real beta_proportion_lpdf(reals theta | reals mu, reals kappa)
The log of the beta_proportion density of theta in (0, 1) given mean mu and precision
kappa
Available since 2.19
Stan functions
real von_mises_lpdf(reals y | reals mu, reals kappa)
The log of the von Mises density of y given location mu and scale kappa.
Available since 2.18
Numerical stability
Evaluating the von Mises distribution for κ > 100 is numerically unstable in the current implementation. Nathanael I. Lichti suggested the following workaround on the Stan users group, based on the fact that as κ → ∞,
\[
\text{VonMises}(y \mid \mu, \kappa) \rightarrow \text{Normal}\left(\mu, \sqrt{1/\kappa}\right).
\]
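The quality of this approximation can be checked numerically. The sketch below (illustrative Python; the normalizer 2π I₀(κ) is computed by simple quadrature rather than a Bessel routine) compares the two densities at κ = 50:

```python
import math

def von_mises_pdf(y, mu, kappa):
    # density exp(kappa cos(y - mu)) / (2 pi I0(kappa)); the normalizer
    # is computed by the periodic trapezoidal rule over (-pi, pi]
    n = 20_000
    h = 2.0 * math.pi / n
    norm = sum(math.exp(kappa * math.cos(-math.pi + i * h))
               for i in range(n)) * h
    return math.exp(kappa * math.cos(y - mu)) / norm

def normal_pdf(y, mu, sigma):
    return (math.exp(-0.5 * ((y - mu) / sigma) ** 2)
            / (sigma * math.sqrt(2.0 * math.pi)))

kappa = 50.0
vm = von_mises_pdf(0.1, 0.0, kappa)
approx = normal_pdf(0.1, 0.0, math.sqrt(1.0 / kappa))
# for kappa = 50 the two densities already agree to well under a percent
```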
Sampling statement
y ~ uniform(alpha, beta)
Increment target log probability density with uniform_lupdf(y | alpha, beta).
Available since 2.0
Stan functions
real uniform_lpdf(reals y | reals alpha, reals beta)
The log of the uniform density of y given lower bound alpha and upper bound beta
Available since 2.12
Stan functions
The multivariate normal probability function is overloaded to allow the variate
vector y and location vector µ to be vectors or row vectors (or to mix the two
types). The density function is also vectorized, so it allows arrays of row vectors or
vectors as arguments; see section vectorized function signatures for a description of
vectorization.
real multi_normal_lpdf(vectors y | vectors mu, matrix Sigma)
The log of the multivariate normal density of vector(s) y given location vector(s) mu
and covariance matrix Sigma
Available since 2.12
Although there is a direct multi-normal RNG function, if more than one result is required, it's much more efficient to Cholesky factor the covariance matrix and call multi_normal_cholesky_rng; see section multi-variate normal, Cholesky parameterization.
vector multi_normal_rng(vector mu, matrix Sigma)
Generate a multivariate normal variate with location mu and covariance matrix
Sigma; may only be used in transformed data and generated quantities blocks
Available since 2.0
array[] vector multi_normal_rng(vectors mu, matrix Sigma)
Generate an array of multivariate normal variates with locations mu and covariance matrix Sigma; may only be used in transformed data and generated quantities blocks
Available since 2.18
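The reason the Cholesky route is cheaper: factor Σ = L Lᵀ once, then every draw costs only µ + L z with z standard normal. A small illustrative Python sketch (hypothetical helper names, not Stan's implementation):

```python
import math
import random

def cholesky(A):
    # lower-triangular L with L L^T = A, for a small SPD matrix
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def multi_normal_cholesky_draw(mu, L):
    # one draw mu + L z with z ~ N(0, I); reusing L across draws is the win
    z = [random.gauss(0.0, 1.0) for _ in mu]
    return [m + sum(L[i][k] * z[k] for k in range(len(mu)))
            for i, m in enumerate(mu)]

Sigma = [[4.0, 1.2], [1.2, 2.0]]
L = cholesky(Sigma)                            # factor once
draw = multi_normal_cholesky_draw([0.0, 1.0], L)  # cheap per draw
```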
Sampling statement
y ~ multi_normal_prec(mu, Omega)
Increment target log probability density with multi_normal_prec_lupdf(y | mu,
Omega).
Available since 2.3
Stan functions
real multi_normal_prec_lpdf(vectors y | vectors mu, matrix Omega)
The log of the multivariate normal density of vector(s) y given location vector(s) mu
and positive definite precision matrix Omega
Available since 2.18
If L is lower triangular and LL⊤ is a K × K positive definite matrix, then L_{k,k} must be strictly positive for k ∈ 1:K. If an L is provided that is not the Cholesky factor of a positive-definite matrix, the probability functions will raise errors.
Sampling statement
y ~ multi_normal_cholesky(mu, L)
Increment target log probability density with multi_normal_cholesky_lupdf(y |
mu, L).
Available since 2.0
Stan functions
real multi_normal_cholesky_lpdf(vectors y | vectors mu, matrix L)
The log of the multivariate normal density of vector(s) y given location vector(s) mu
and lower-triangular Cholesky factor of the covariance matrix L
Available since 2.18
real multi_normal_cholesky_lupdf(row_vectors y | vectors mu, matrix L)
The log of the multivariate normal density of row vector(s) y given location vector(s) mu and lower-triangular Cholesky factor of the covariance matrix L dropping constant additive terms
Available since 2.25
where yi is the ith row of y. This is used to efficiently handle Gaussian Processes
with multi-variate outputs where only the output dimensions share a kernel function
but vary based on their scale. Note that this function does not take into account the
mean prediction.
Sampling statement
y ~ multi_gp(Sigma, w)
Increment target log probability density with multi_gp_lupdf(y | Sigma, w).
Available since 2.3
Stan functions
real multi_gp_lpdf(matrix y | matrix Sigma, vector w)
The log of the multivariate GP density of matrix y given kernel matrix Sigma and inverse scales w
Available since 2.12
where yi is the ith row of y. This is used to efficiently handle Gaussian Processes with
multi-variate outputs where only the output dimensions share a kernel function but
vary based on their scale. If the model allows parameterization in terms of Cholesky
factor of the kernel matrix, this distribution is also more efficient than MultiGP().
Note that this function does not take into account the mean prediction.
Sampling statement
y ~ multi_gp_cholesky(L, w)
Increment target log probability density with multi_gp_cholesky_lupdf(y | L,
w).
Available since 2.5
Stan functions
real multi_gp_cholesky_lpdf(matrix y | matrix L, vector w)
The log of the multivariate GP density of matrix y given lower-triangular Cholesky factor of the kernel matrix L and inverse scales w
Available since 2.12
\[
\text{MultiStudentT}(y \mid \nu, \mu, \Sigma) = \frac{1}{\pi^{K/2}}\, \frac{1}{\nu^{K/2}}\, \frac{\Gamma\left((\nu + K)/2\right)}{\Gamma(\nu/2)}\, \frac{1}{\sqrt{|\Sigma|}} \left(1 + \frac{1}{\nu}\, (y - \mu)^{\top}\, \Sigma^{-1}\, (y - \mu)\right)^{-(\nu + K)/2}.
\]
Sampling statement
y ~ multi_student_t(nu, mu, Sigma)
Increment target log probability density with multi_student_t_lupdf(y | nu,
mu, Sigma).
Available since 2.0
CHAPTER 25. DISTRIBUTIONS OVER UNBOUNDED VECTORS
Stan functions
real multi_student_t_lpdf(vectors y | real nu, vectors mu, matrix
Sigma)
The log of the multivariate Student-t density of vector(s) y given degrees of freedom
nu, location vector(s) mu, and scale matrix Sigma
Available since 2.18
real multi_student_t_lpdf(row_vectors y | real nu, row_vectors mu, matrix Sigma)
The log of the multivariate Student-t density of row vector(s) y given degrees of freedom nu, location row vector(s) mu, and scale matrix Sigma
Available since 2.18
Sampling statement
y ~ multi_student_t_cholesky(nu, mu, L)
Increment target log probability density with multi_student_t_cholesky_lupdf(y
| nu, mu, L).
Available since 2.30
Stan functions
real multi_student_t_cholesky_lpdf(vectors y | real nu, vectors mu,
matrix L)
The log of the multivariate Student-t density of vector or array of vectors y given
degrees of freedom nu, location vector or array of vectors mu, and Cholesky factor of
the scale matrix L. For a definition of the arguments compatible with the vectors
type, see the probability vectorization section.
Available since 2.30
array[] vector multi_student_t_cholesky_rng(real nu, array[] vector mu, matrix L)
Generate a multivariate Student-t variate with degrees of freedom nu, location array mu, and Cholesky factor of the scale matrix L; may only be used in transformed data and generated quantities blocks.
Available since 2.30
\[
y_t \sim \text{N}(F' \theta_t, V), \qquad \theta_t \sim \text{N}(G\, \theta_{t-1}, W), \qquad \theta_0 \sim \text{N}(m_0, C_0),
\]
where y is an n × T matrix whose rows are variables and columns are observations. These functions calculate the log likelihood of the observations, marginalizing over the latent states, p(y | F, G, V, W, m0, C0). The log likelihood is calculated using the Kalman filter. If V is diagonal, a more efficient algorithm that sequentially processes observations and avoids matrix inversions can be used (Durbin and Koopman 2001, sec. 6.4).
Sampling statement
y ~ gaussian_dlm_obs(F, G, V, W, m0, C0)
Increment target log probability density with gaussian_dlm_obs_lupdf(y | F, G,
V, W, m0, C0).
Available since 2.0
Stan functions
The following two functions differ in the type of their V, the first taking a full
observation covariance matrix V and the second a vector V representing the diagonal
of the observation covariance matrix. The sampling statement defined in the previous
section works with either type of observation V.
real gaussian_dlm_obs_lpdf(matrix y | matrix F, matrix G, matrix V, matrix W, vector m0, matrix C0)
For α = 1, the distribution is uniform over simplexes. Taking K = 10, here are the first five draws.
1) 0.17 0.05 0.07 0.17 0.03 0.13 0.03 0.03 0.27 0.05
2) 0.08 0.02 0.12 0.07 0.52 0.01 0.07 0.04 0.01 0.06
3) 0.02 0.03 0.22 0.29 0.17 0.10 0.09 0.00 0.05 0.03
4) 0.04 0.03 0.21 0.13 0.04 0.01 0.10 0.04 0.22 0.18
5) 0.11 0.22 0.02 0.01 0.06 0.18 0.33 0.04 0.01 0.01
CHAPTER 26. SIMPLEX DISTRIBUTIONS
That does not mean it’s uniform over the marginal probabilities of each element.
As the size of the simplex grows, the marginal draws become more and more
concentrated below (not around) 1/K. When one component of the simplex is large,
the others must all be relatively small to compensate. For example, in a uniform
distribution on 10-simplexes, the probability that a component is greater than the
mean of 1/10 is only 39%. Most of the marginal probability mass for each component is in the interval (0, 0.1).
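The 39% figure follows from the marginal distribution: each component of a uniform K-simplex is marginally Beta(1, K − 1), so P(θₖ > t) = (1 − t)^{K−1}. A one-line check in Python (illustrative, not Stan code):

```python
K = 10
# the marginal of one component of Dirichlet(1, ..., 1) is Beta(1, K - 1),
# whose survival function is P(theta_k > t) = (1 - t)^(K - 1)
p_above_mean = (1.0 - 1.0 / K) ** (K - 1)  # P(theta_k > 1/K) = 0.9^9
```

This evaluates to about 0.387, matching the 39% quoted in the text.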
When the α value is small, the draws gravitate to the corners of the simplex. Here
are the first five draws for α = 0.001.
1) 3e-203 0e+00 2e-298 9e-106 1e+000 0e+00 0e+000 1e-047 0e+00 4e-279
2) 1e+000 0e+00 5e-279 2e-014 1e-275 0e+00 3e-285 9e-147 0e+00 0e+000
3) 1e-308 0e+00 1e-213 0e+000 0e+000 8e-75 0e+000 1e+000 4e-58 7e-112
4) 6e-166 5e-65 3e-068 3e-147 0e+000 1e+00 3e-249 0e+000 0e+00 0e+000
5) 2e-091 0e+00 0e+000 0e+000 1e-060 0e+00 4e-312 1e+000 0e+00 0e+000
Each row denotes a draw. Each draw has a single value that rounds to one and other
values that are very close to zero or rounded down to zero.
As α increases, the draws become increasingly uniform. For α = 1000,
1) 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10
2) 0.10 0.10 0.09 0.10 0.10 0.10 0.11 0.10 0.10 0.10
3) 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10
4) 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10
5) 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10
Sampling statement
theta ~ dirichlet(alpha)
Increment target log probability density with dirichlet_lupdf(theta | alpha).
Available since 2.0
Stan functions
The Dirichlet probability functions are overloaded to allow the simplex θ and prior
counts (plus one) α to be vectors or row vectors (or to mix the two types). The
density functions are also vectorized, so they allow arrays of row vectors or vec-
tors as arguments; see section vectorized function signatures for a description of
vectorization.
real dirichlet_lpdf(vectors theta | vectors alpha)
The log of the Dirichlet density for simplex(es) theta given prior counts (plus one) alpha
Available since 2.12, vectorized in 2.21
Sampling statement
y ~ lkj_corr(eta)
Increment target log probability density with lkj_corr_lupdf(y | eta).
Available since 2.3
Stan functions
real lkj_corr_lpdf(matrix y | real eta)
The log of the LKJ density for the correlation matrix y given nonnegative shape eta.
lkj_corr_cholesky_lpdf is faster, more numerically stable, uses less memory, and
should be preferred to this.
Available since 2.12
Because Stan requires models to have support on all valid constrained parameters, L will almost always[1] be a parameter declared with the type of a Cholesky factor for a correlation matrix; for example,

parameters { cholesky_factor_corr[K] L; } // rather than corr_matrix[K] Sigma;

[1] It is possible to build up a valid L within Stan, but that would then require Jacobian adjustments to
See the previous section for details on interpreting the shape parameter η. Note that
even if η = 1, it is still essential to evaluate the density function because the density
of L is not constant, regardless of the value of η, even though the density of LL⊤ is
constant iff η = 1.
A lower triangular L is a Cholesky factor for a correlation matrix if and only if L_{k,k} > 0 for k ∈ 1:K and each row L_k has unit Euclidean length.
Sampling statement
L ~ lkj_corr_cholesky(eta)
Increment target log probability density with lkj_corr_cholesky_lupdf(L |
eta).
Available since 2.4
Stan functions
real lkj_corr_cholesky_lpdf(matrix L | real eta)
The log of the LKJ density for the lower-triangular Cholesky factor L of a correlation
matrix given shape eta
Available since 2.12
where tr() is the matrix trace function, and Γ_K() is the multivariate Gamma function,
\[
\Gamma_K(x) = \pi^{K(K-1)/4} \prod_{k=1}^{K} \Gamma\left(x + \frac{1 - k}{2}\right).
\]
Sampling statement
W ~ wishart(nu, Sigma)
Increment target log probability density with wishart_lupdf(W | nu, Sigma).
Available since 2.0
Stan functions
real wishart_lpdf(matrix W | real nu, matrix Sigma)
Return the log of the Wishart density for symmetric and positive-definite matrix W
given degrees of freedom nu and symmetric and positive-definite scale matrix Sigma.
Available since 2.12
CHAPTER 28. COVARIANCE MATRIX DISTRIBUTIONS
\[
L_W \sim \text{WishartCholesky}(\nu, L_S)
\]
if and only if
\[
W \sim \text{Wishart}(\nu, S).
\]
\[
\text{WishartCholesky}(L_W \mid \nu, L_S) = \text{Wishart}(L_W L_W^{\top} \mid \nu, L_S L_S^{\top})\, J_{f^{-1}},
\]
\[
\log J_{f^{-1}} = K \log 2 + \sum_{k=1}^{K} (K - k + 1) \log\, (L_W)_{k,k}.
\]
real wishart_cholesky_lupdf(matrix L_W | real nu, matrix L_S)
Return the log of the Wishart density for lower-triangular Cholesky factor L_W given degrees of freedom nu and lower-triangular Cholesky factor of the scale matrix L_S dropping constant additive terms.
Available since 2.30
Sampling statement
W ~ inv_wishart(nu, Sigma)
Increment target log probability density with inv_wishart_lupdf(W | nu,
Sigma).
Available since 2.0
Stan functions
real inv_wishart_lpdf(matrix W | real nu, matrix Sigma)
Return the log of the inverse Wishart density for symmetric and positive-definite
matrix W given degrees of freedom nu and symmetric and positive-definite scale
matrix Sigma.
Available since 2.12
\[
L_W \sim \text{InvWishartCholesky}(\nu, L_S)
\]
if and only if
\[
W \sim \text{InvWishart}(\nu, S).
\]
\[
\text{InvWishartCholesky}(L_W \mid \nu, L_S) = \text{InvWishart}(L_W L_W^{\top} \mid \nu, L_S L_S^{\top})\, J_{f^{-1}},
\]
\[
\log J_{f^{-1}} = K \log 2 + \sum_{k=1}^{K} (K - k + 1) \log\, (L_W)_{k,k}.
\]
real inv_wishart_cholesky_lupdf(matrix L_W | real nu, matrix L_S)
Return the log of the inverse Wishart density for lower-triangular Cholesky factor L_W given degrees of freedom nu and lower-triangular Cholesky factor of the scale matrix L_S dropping constant additive terms.
Available since 2.30
29. Hidden Markov Models
An elementary first-order Hidden Markov model is a probabilistic model over N observations, y_n, and N hidden states, x_n, which can be fully defined by the conditional distributions p(y_n | x_n, ϕ) and p(x_n | x_{n−1}, ϕ). Here we make the dependency on additional model parameters, ϕ, explicit. When x is continuous, the user can explicitly encode these distributions in Stan and use Markov chain Monte Carlo to integrate x out.
When each state x takes a value over a discrete and finite set, say {1, 2, ..., K}, we can take advantage of the dependency structure to marginalize x and compute p(y | ϕ). We start by defining the conditional observational distribution, stored in a K × N matrix ω with
\[
\omega_{kn} = p(y_n \mid x_n = k, \phi).
\]
Next, we introduce the K × K transition matrix, Γ, with
\[
\Gamma_{ij} = p(x_n = j \mid x_{n-1} = i, \phi).
\]
Each row defines a probability distribution and must therefore be a simplex (i.e., its components must add to 1). Currently, Stan only supports stationary transitions where a single transition matrix is used for all transitions. Finally we define the initial state K-vector ρ, with
\[
\rho_k = p(x_0 = k \mid \phi).
\]
The Stan functions that support this type of model are special in that the user does
not explicitly pass y and ϕ as arguments. Instead, the user passes log ω, Γ, and ρ,
which in turn depend on y and ϕ.
The arguments represent (1) the log density of each output, (2) the transition matrix,
and (3) the initial state vector.
• log_omega : log ω_{kn} = log p(y_n | x_n = k, ϕ), the log density of each output,
• Gamma : Γ_{ij} = p(x_n = j | x_{n−1} = i, ϕ), the transition matrix,
• rho : ρ_k = p(x_0 = k | ϕ), the initial state probability.
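The marginalization these functions perform is the classic forward algorithm. The following hedged Python sketch (illustrative, not Stan's implementation; it works with probabilities rather than the log scale Stan uses, and adopts the convention that the first observation is emitted from the initial state) computes p(y | ϕ) from ω, Γ, and ρ:

```python
def hmm_forward(omega, Gamma, rho):
    # omega[k][n] = p(y_n | x_n = k), Gamma[i][j] = p(x_n = j | x_{n-1} = i),
    # rho[k] = p(x_0 = k); returns p(y) marginalized over the hidden states
    K = len(rho)
    N = len(omega[0])
    alpha = [rho[k] * omega[k][0] for k in range(K)]
    for n in range(1, N):
        alpha = [omega[j][n] * sum(alpha[i] * Gamma[i][j] for i in range(K))
                 for j in range(K)]
    return sum(alpha)

# two states, two observations
omega = [[0.9, 0.2], [0.1, 0.8]]  # columns index observations
Gamma = [[0.7, 0.3], [0.4, 0.6]]  # each row is a simplex
rho = [0.5, 0.5]
p_y = hmm_forward(omega, Gamma, rho)
```

Because the forward recursion multiplies many small probabilities, Stan takes log ω and works on the log scale internally; the sketch above is only viable for short sequences.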
array[] int hmm_latent_rng(matrix log_omega, matrix Gamma, vector
rho)
Returns a length N array of integers over {1, ..., K}, sampled from the joint posterior distribution of the hidden states, p(x | ϕ, y). May only be used in transformed data and generated quantities blocks.
Available since 2.24
30. Mathematical Functions
This appendix provides definitions of several mathematical functions used throughout the manual.
30.1. Beta
The beta function, B(a, b), computes the normalizing constant for the beta distribution, and is defined for a > 0 and b > 0 by
\[
\text{B}(a, b) = \int_0^1 u^{a-1} (1 - u)^{b-1} \, du = \frac{\Gamma(a)\, \Gamma(b)}{\Gamma(a + b)},
\]
30.2. Incomplete Beta
The incomplete beta function, B(x; a, b), is defined for x ∈ [0, 1] by
\[
\text{B}(x;\, a, b) = \int_0^x u^{a-1} (1 - u)^{b-1} \, du.
\]
If x = 1, the incomplete beta function reduces to the beta function defined in the previous section, B(1; a, b) = B(a, b).
The regularized incomplete beta function divides the incomplete beta function by the beta function,
\[
I_x(a, b) = \frac{\text{B}(x;\, a, b)}{\text{B}(a, b)}.
\]
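Both identities can be verified numerically from the definitions (illustrative Python; the incomplete integral is approximated by a simple Riemann sum):

```python
import math

def beta_fn(a, b):
    # B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def inc_beta(x, a, b, n=100_000):
    # B(x; a, b) = integral_0^x u^(a-1) (1-u)^(b-1) du, approximated on
    # a uniform grid (interior points only; the integrand vanishes at
    # the endpoints for a > 1 and b > 1)
    h = x / n
    total = 0.0
    for i in range(1, n):
        u = i * h
        total += u ** (a - 1.0) * (1.0 - u) ** (b - 1.0)
    return total * h

a, b = 2.5, 3.0
I_half = inc_beta(0.5, a, b) / beta_fn(a, b)  # regularized I_x(a, b)
```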
30.3. Gamma
The gamma function, Γ(x), is the generalization of the factorial function to continuous variables, defined so that for positive integers n,
\[
\Gamma(n + 1) = n!
\]
30.4. Digamma
The digamma function Ψ is the derivative of the log Γ function,
\[
\Psi(u) = \frac{d}{du} \log \Gamma(u) = \frac{1}{\Gamma(u)}\, \frac{d}{du}\, \Gamma(u).
\]
References
Bailey, David H., Karthik Jeyabalan, and Xiaoye S. Li. 2005. “A Comparison of
Three High-Precision Quadrature Schemes.” Experiment. Math. 14 (3): 317–29.
https://projecteuclid.org:443/euclid.em/1128371757.
Bowling, Shannon R., Mohammad T. Khasawneh, Sittichai Kaewkuekool, and Byung
Rae Cho. 2009. “A Logistic Approximation to the Cumulative Normal Distribu-
tion.” Journal of Industrial Engineering and Management 2 (1): 114–27.
Ding, Peng, and Joseph K. Blitzstein. 2018. “On the Gaussian Mixture Representation
of the Laplace Distribution.” The American Statistician 72 (2): 172–74. https:
//doi.org/10.1080/00031305.2017.1291448.
Durbin, J., and S. J. Koopman. 2001. Time Series Analysis by State Space Methods.
New York: Oxford University Press.
Feller, William. 1968. An Introduction to Probability Theory and Its Applications. Vol. 1. 3rd ed. New York: Wiley.
Gelman, Andrew, J. B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald
B. Rubin. 2013. Bayesian Data Analysis. Third Edition. London: Chapman & Hall
/ CRC Press.
Golub, G. H., and V. Pereyra. 1973. “The Differentiation of Pseudo-Inverses and
Nonlinear Least Squares Problems Whose Variables Separate.” SIAM Journal on
Numerical Analysis 10 (2): 413–32. https://doi.org/10.1137/0710036.
Guennebaud, Gaël, Benoît Jacob, et al. 2010. “Eigen V3.” http://eigen.tuxfamily.org.
Hindmarsh, Alan C, Peter N Brown, Keith E Grant, Steven L Lee, Radu Serban, Dan
E Shumaker, and Carol S Woodward. 2005. “SUNDIALS: Suite of Nonlinear
and Differential/Algebraic Equation Solvers.” ACM Transactions on Mathematical
Software (TOMS) 31 (3): 363–96.
Moré, Jorge J., Burton S. Garbow, and Kenneth E. Hillstrom. 1980. User Guide for MINPACK-1. Argonne, Illinois: Argonne National Laboratory.
Lewandowski, Daniel, Dorota Kurowicka, and Harry Joe. 2009. “Generating Random
Correlation Matrices Based on Vines and Extended Onion Method.” Journal of
Multivariate Analysis 100: 1989–2001.
Margossian, Charles C, and Michael Betancourt. 2022. “Efficient Automatic Differen-
tiation of Implicit Functions.” Preprint. arXiv:2112.14217.
Mori, Masatake. 1978. “An IMT-Type Double Exponential Formula for Numerical
Integration.” Publications of the Research Institute for Mathematical Sciences 14
INDEX 267
beta_binomial_lpmf binomia_lccdf
(ints n | ints N, reals alpha, reals (ints n | ints N, reals theta):
beta): real, 169 real, 167
beta_binomial_lupmf binomia_lcdf
(ints n | ints N, reals alpha, reals (ints n | ints N, reals theta):
beta): real, 169 real, 167
beta_binomial_rng binomia_lpmf
(ints N, reals alpha, reals beta): (ints n | ints N, reals theta):
R, 169 real, 166
beta_cdf binomia_lupmf
(reals theta, reals alpha, reals (ints n | ints N, reals theta):
beta): real, 227 real, 166
beta_lccdf binomial
(reals theta | reals alpha, reals sampling statement, 166
beta): real, 228 binomial_logit
beta_lcdf sampling statement, 168
(reals theta | reals alpha, reals binomial_logit_lpmf
beta): real, 227 (ints n | ints N, reals alpha):
beta_lpdf real, 168
    (reals theta | reals alpha, reals beta): real, 227
beta_lupdf
    (reals theta | reals alpha, reals beta): real, 227
beta_proportion
    sampling statement, 228
beta_proportion_lccdf
    (reals theta | reals mu, reals kappa): real, 229
beta_proportion_lcdf
    (reals theta | reals mu, reals kappa): real, 229
beta_proportion_lpdf
    (reals theta | reals mu, reals kappa): real, 228
beta_proportion_lupdf
    (reals theta | reals mu, reals kappa): real, 228
beta_proportion_rng
    (reals mu, reals kappa): R, 229
beta_rng
    (reals alpha, reals beta): R, 228
binary_log_loss
    (T1 x, T2 y): R, 28
    (int y, real y_hat): real, 28
binomial_cdf
    (ints n, ints N, reals theta): real, 167
binomial_logit_lupmf
    (ints n | ints N, reals alpha): real, 168
binomial_rng
    (ints N, reals theta): R, 167
block
    (complex_matrix x, int i, int j, int n_rows, int n_cols): complex_matrix, 108
    (matrix x, int i, int j, int n_rows, int n_cols): matrix, 75
categorical
    sampling statement, 171
categorical_logit
    sampling statement, 171
categorical_logit_glm
    sampling statement, 172
categorical_logit_glm_lpmf
    (array[] int y | matrix x, vector alpha, matrix beta): real, 173
    (array[] int y | row_vector x, vector alpha, matrix beta): real, 173
    (int y | matrix x, vector alpha, matrix beta): real, 173
    (int y | row_vector x, vector alpha, matrix beta): real, 172
categorical_logit_glm_lupmf
    (array[] int y | matrix x, vector alpha, matrix beta): real, 173
    (array[] int y | row_vector x, vector alpha, matrix beta): real, 173
    (int y | matrix x, vector alpha, matrix beta): real, 173
    (int y | row_vector x, vector alpha, matrix beta): real, 173
categorical_logit_lpmf
    (ints y | vector beta): real, 171
categorical_logit_lupmf
    (ints y | vector beta): real, 171
categorical_logit_rng
    (vector beta): int, 172
categorical_lpmf
    (ints y | vector theta): real, 171
categorical_lupmf
    (ints y | vector theta): real, 171
categorical_rng
    (vector theta): int, 171
cauchy
    sampling statement, 204
cauchy_cdf
    (reals y, reals mu, reals sigma): real, 204
cauchy_lccdf
    (reals y | reals mu, reals sigma): real, 204
cauchy_lcdf
    (reals y | reals mu, reals sigma): real, 204
cauchy_lpdf
    (reals y | reals mu, reals sigma): real, 204
cauchy_lupdf
    (reals y | reals mu, reals sigma): real, 204
cauchy_rng
    (reals mu, reals sigma): R, 204
cbrt
    (T x): R, 24
ceil
    (T x): R, 23
chi_square
    sampling statement, 212
chi_square_cdf
    (reals y, reals nu): real, 212
chi_square_lccdf
    (reals y | reals nu): real, 213
chi_square_lcdf
    (reals y | reals nu): real, 212
chi_square_lpdf
    (reals y | reals nu): real, 212
chi_square_lupdf
    (reals y | reals nu): real, 212
chi_square_rng
    (reals nu): R, 213
chol2inv
    (matrix L): matrix, 89
cholesky_decompose
    (matrix A): matrix, 92
choose
    (T1 x, T2 y): R, 31
    (int x, int y): int, 31
col
    (complex_matrix x, int n): complex_vector, 108
    (matrix x, int n): vector, 74
cols
    (complex_matrix x): int, 97
    (complex_row_vector x): int, 96
    (complex_vector x): int, 96
    (matrix x): int, 57
    (row_vector x): int, 57
    (vector x): int, 57
columns_dot_product
    (complex_matrix x, complex_matrix y): complex_row_vector, 103
    (complex_row_vector x, complex_row_vector y): complex_row_vector, 103
    (complex_vector x, complex_vector y): complex_row_vector, 103
    (matrix x, matrix y): row_vector, 64
    (row_vector x, row_vector y): row_vector, 64
    (vector x, vector y): row_vector, 64
columns_dot_self
    (complex_matrix x): complex_row_vector, 104
    (complex_row_vector x): complex_row_vector, 104
    (complex_vector x): complex_row_vector, 104
    (matrix x): row_vector, 65
    (row_vector x): row_vector, 65
frechet_lupdf
    (reals y | reals alpha, reals sigma): real, 220
frechet_rng
    (reals alpha, reals sigma): R, 220
gamma
    sampling statement, 216
gamma_cdf
    (reals y, reals alpha, reals beta): real, 217
gamma_lccdf
    (reals y | reals alpha, reals beta): real, 217
gamma_lcdf
    (reals y | reals alpha, reals beta): real, 217
gamma_lpdf
    (reals y | reals alpha, reals beta): real, 216
gamma_lupdf
    (reals y | reals alpha, reals beta): real, 216
gamma_p
    (T1 x, T2 y): R, 31
    (real a, real z): real, 30
gamma_q
    (T1 x, T2 y): R, 31
    (real a, real z): real, 31
gamma_rng
    (reals alpha, reals beta): R, 217
gaussian_dlm_obs
    sampling statement, 245
gaussian_dlm_obs_lpdf
    (matrix y | matrix F, matrix G, matrix V, matrix W, vector m0, matrix C0): real, 245
    (matrix y | matrix F, matrix G, vector V, matrix W, vector m0, matrix C0): real, 246
gaussian_dlm_obs_lupdf
    (matrix y | matrix F, matrix G, matrix V, matrix W, vector m0, matrix C0): real, 246
    (matrix y | matrix F, matrix G, vector V, matrix W, vector m0, matrix C0): real, 246
generalized_inverse
    (matrix A): matrix, 90
get_imag
    (T x): T_demoted, 106
    (complex z): real, 39
get_real
    (T x): T_demoted, 106
    (complex z): real, 39
gp_dot_prod_cov
    (array[] real x, real sigma): matrix, 81
    (array[] real x1, array[] real x2, real sigma): matrix, 81
    (vectors x, real sigma): matrix, 81
    (vectors x1, vectors x2, real sigma): matrix, 81
gp_exp_quad_cov
    (array[] real x, real sigma, real length_scale): matrix, 79
    (array[] real x1, array[] real x2, real sigma, real length_scale): matrix, 80
    (vectors x, real sigma, array[] real length_scale): matrix, 80
    (vectors x, real sigma, real length_scale): matrix, 80
    (vectors x1, vectors x2, real sigma, array[] real length_scale): matrix, 80
    (vectors x1, vectors x2, real sigma, real length_scale): matrix, 80
gp_exponential_cov
    (array[] real x, real sigma, real length_scale): matrix, 81
    (array[] real x1, array[] real x2, real sigma, real length_scale): matrix, 81
    (vectors x, real sigma, array[] real length_scale): matrix, 82
    (vectors x, real sigma, real length_scale): matrix, 82
    (vectors x1, vectors x2, real sigma, array[] real length_scale): matrix, 82
    (vectors x1, vectors x2, real sigma, real length_scale): matrix, 82
gp_matern32_cov
    (array[] real x, real sigma, real length_scale): matrix, 82
    (array[] real x1, array[] real x2,
(reals y | reals alpha, reals beta): (real alpha, real beta): real, 29
real, 218 lchoose
inv_gamma_lupdf (T1 x, T2 y): R, 34
(reals y | reals alpha, reals beta): (real x, real y): real, 33
real, 218 ldexp
inv_gamma_rng (T1 x, T2 y): R, 35
(reals alpha, reals beta): R, 218 (real x, int y): real, 35
inv_inc_beta lgamma
(real alpha, real beta, real p): (T x): R, 30
real, 29 linspaced_array
inv_logit (int n, data real lower, data real
(T x): R, 27 upper): array[] real, 72
inv_phi linspaced_int_array
(T x): R, 28 (int n, int lower, int upper):
inv_sqrt array[] real, 72
(T x): R, 25 linspaced_row_vector
inv_square (int n, data real lower, data real
(T x): R, 25 upper): row_vector, 73
inv_wishart linspaced_vector
sampling statement, 255 (int n, data real lower, data real
inv_wishart_cholesky_lpdf upper): vector, 73
(matrix L_W | real nu, matrix L_S): lkj_corr
real, 256 sampling statement, 251
inv_wishart_lpdf lkj_corr_cholesky
(matrix W | real nu, matrix Sigma): sampling statement, 252
real, 255 lkj_corr_cholesky_lpdf
inv_wishart_lupdf (matrix L | real eta): real, 252
(matrix L_W | real nu, matrix L_S): lkj_corr_cholesky_lupdf
real, 256 (matrix L | real eta): real, 252
(matrix W | real nu, matrix Sigma): lkj_corr_cholesky_rng
real, 255 (int K, real eta): matrix, 252
inv_wishart_rng lkj_corr_lpdf
(real nu, matrix L_S): matrix, 257 (matrix y | real eta): real, 251
(real nu, matrix Sigma): matrix, 255 lkj_corr_lupdf
inverse (matrix y | real eta): real, 251
(matrix A): matrix, 89 lkj_corr_rng
inverse_spd (int K, real eta): matrix, 251
(matrix A): matrix, 89 lmgamma
is_inf (T1 x, T2 y): R, 30
(real x): int, 20 (int n, real x): real, 30
is_nan lmultiply
(real x): int, 20 (T1 x, T2 y): R, 35
lambert_w0 (real x, real y): real, 35
(T x): R, 37 log
lambert_wm1 (T x): R, 24
(T x): R, 37 (complex z): complex, 44
lbeta log10
(T1 x, T2 y): R, 29 (): real, 14
INDEX 277
    (complex_row_vector x, complex_row_vector y): complex_vector, 104
    (complex_vector x, complex_vector y): complex_vector, 104
    (matrix x, matrix y): vector, 65
    (row_vector x, row_vector y): vector, 64
    (vector x, vector y): vector, 64
rows_dot_self
    (complex_matrix x): complex_vector, 105
    (complex_row_vector x): complex_vector, 104
    (complex_vector x): complex_vector, 104
    (matrix x): vector, 65
    (row_vector x): vector, 65
    (vector x): vector, 65
scale_matrix_exp_multiply
    (real t, matrix A, matrix B): matrix, 88
scaled_inv_chi_square
    sampling statement, 214
scaled_inv_chi_square_cdf
    (reals y, reals nu, reals sigma): real, 214
scaled_inv_chi_square_lccdf
    (reals y | reals nu, reals sigma): real, 215
scaled_inv_chi_square_lcdf
    (reals y | reals nu, reals sigma): real, 215
scaled_inv_chi_square_lpdf
    (reals y | reals nu, reals sigma): real, 214
scaled_inv_chi_square_lupdf
    (reals y | reals nu, reals sigma): real, 214
scaled_inv_chi_square_rng
    (reals nu, reals sigma): R, 215
sd
    (array[] real x): real, 50
    (matrix x): real, 70
    (row_vector x): real, 69
    (vector x): real, 69
segment
    (array[] T sv, int i, int n): array[] T, 76
    (complex_row_vector rv, int i, int n): complex_row_vector, 109
    (complex_vector v, int i, int n): complex_vector, 109
    (row_vector rv, int i, int n): row_vector, 76
    (vector v, int i, int n): vector, 76
sin
    (T x): R, 26
    (complex z): complex, 45
singular_values
    (complex_matrix A): vector, 114
    (matrix A): vector, 93
sinh
    (T x): R, 26
    (complex z): complex, 46
size
    (array[] T x): int, 53
    (complex_matrix x): int, 97
    (complex_row_vector x): int, 97
    (complex_vector x): int, 97
    (int x): int, 7
    (matrix x): int, 58
    (real x): int, 7
    (row_vector x): int, 58
    (vector x): int, 57
skew_double_exponential
    sampling statement, 209
skew_double_exponential_cdf
    (reals y, reals mu, reals sigma, reals tau): real, 209
skew_double_exponential_lccdf
    (reals y | reals mu, reals sigma, reals tau): real, 209
skew_double_exponential_lcdf
    (reals y | reals mu, reals sigma, reals tau): real, 209
skew_double_exponential_lpdf
    (reals y | reals mu, reals sigma, reals tau): real, 209
skew_double_exponential_lupdf
    (reals y | reals mu, reals sigma, reals tau): real, 209
skew_double_exponential_rng
    (reals mu, reals sigma, reals tau): R, 209
skew_normal
uniform_lpdf
    (reals y | reals alpha, reals beta): real, 232
uniform_lupdf
    (reals y | reals alpha, reals beta): real, 232
uniform_rng
    (reals alpha, reals beta): R, 233
uniform_simplex
    (int n): vector, 74
variance
    (array[] real x): real, 49
    (matrix x): real, 69
    (row_vector x): real, 69
    (vector x): real, 69
von_mises
    sampling statement, 230
von_mises_cdf
    (reals y | reals mu, reals kappa): real, 231
von_mises_lcdf
    (reals y | reals mu, reals kappa): real, 231
von_mises_lpdf
    (reals y | reals mu, reals kappa): real, 230
von_mises_lupdf
    (reals y | reals mu, reals kappa): real, 230
von_mises_rng
    (reals mu, reals kappa): R, 231
weibull
    sampling statement, 219
weibull_cdf
    (reals y, reals alpha, reals sigma): real, 219
weibull_lccdf
    (reals y | reals alpha, reals sigma): real, 219
weibull_lcdf
    (reals y | reals alpha, reals sigma): real, 219
weibull_lpdf
    (reals y | reals alpha, reals sigma): real, 219
weibull_lupdf
    (reals y | reals alpha, reals sigma): real, 219
weibull_rng
    (reals alpha, reals sigma): R, 219
wiener
    sampling statement, 225
wiener_lpdf
    (reals y | reals alpha, reals tau, reals beta, reals delta): real, 225
wiener_lupdf
    (reals y | reals alpha, reals tau, reals beta, reals delta): real, 226
wishart
    sampling statement, 253
wishart_cholesky_lpdf
    (matrix L_W | real nu, matrix L_S): real, 254
wishart_lpdf
    (matrix W | real nu, matrix Sigma): real, 253
wishart_lupdf
    (matrix L_W | real nu, matrix L_S): real, 254
    (matrix W | real nu, matrix Sigma): real, 253
wishart_rng
    (real nu, matrix L_S): matrix, 255
    (real nu, matrix Sigma): matrix, 253
zeros_array
    (int n): array[] real, 74
zeros_int_array
    (int n): array[] int, 74
zeros_row_vector
    (int n): row_vector, 74
zeros_vector
    (int n): vector, 74