
torch.nn.functional

Convolution functions

conv1d

Applies a 1D convolution over an input signal composed of several input planes.

conv2d

Applies a 2D convolution over an input image composed of several input planes.
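
A minimal sketch of the functional convolution interface, with arbitrary shapes chosen for illustration (this and the later snippets assume the two imports shown here):

>>> import torch
>>> import torch.nn.functional as F
>>> # input (N=1, C_in=3, H=8, W=8); weight (C_out=4, C_in=3, kH=3, kW=3)
>>> x = torch.randn(1, 3, 8, 8)
>>> w = torch.randn(4, 3, 3, 3)
>>> F.conv2d(x, w, padding=1).shape
torch.Size([1, 4, 8, 8])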

conv3d

Applies a 3D convolution over an input image composed of several input planes.

conv_transpose1d

Applies a 1D transposed convolution operator over an input signal composed of several input planes, sometimes also called "deconvolution".

conv_transpose2d

Applies a 2D transposed convolution operator over an input image composed of several input planes, sometimes also called "deconvolution".

conv_transpose3d

Applies a 3D transposed convolution operator over an input image composed of several input planes, sometimes also called "deconvolution".

unfold

Extract sliding local blocks from a batched input tensor.

fold

Combine an array of sliding local blocks into a large containing tensor.
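
fold and unfold are partial inverses: fold sums overlapping contributions, so with non-overlapping patches it reconstructs the input exactly. A minimal sketch with 2×2 patches:

>>> x = torch.arange(16.0).reshape(1, 1, 4, 4)
>>> blocks = F.unfold(x, kernel_size=2, stride=2)  # (N, C*kH*kW, n_blocks)
>>> blocks.shape
torch.Size([1, 4, 4])
>>> torch.equal(F.fold(blocks, output_size=(4, 4), kernel_size=2, stride=2), x)
True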

Pooling functions

avg_pool1d

Applies a 1D average pooling over an input signal composed of several input planes.

avg_pool2d

Applies a 2D average-pooling operation in $kH \times kW$ regions by step size $sH \times sW$ steps.

avg_pool3d

Applies a 3D average-pooling operation in $kT \times kH \times kW$ regions by step size $sT \times sH \times sW$ steps.

max_pool1d

Applies a 1D max pooling over an input signal composed of several input planes.

max_pool2d

Applies a 2D max pooling over an input signal composed of several input planes.

max_pool3d

Applies a 3D max pooling over an input signal composed of several input planes.

max_unpool1d

Compute a partial inverse of MaxPool1d.

max_unpool2d

Compute a partial inverse of MaxPool2d.
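
The unpooling functions consume the indices returned by the matching max pooling call when return_indices=True; a minimal 2D sketch:

>>> x = torch.randn(1, 1, 4, 4)
>>> pooled, indices = F.max_pool2d(x, kernel_size=2, return_indices=True)
>>> # each max is placed back at its recorded position; all other entries are zero
>>> F.max_unpool2d(pooled, indices, kernel_size=2).shape
torch.Size([1, 1, 4, 4])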

max_unpool3d

Compute a partial inverse of MaxPool3d.

lp_pool1d

Apply a 1D power-average pooling over an input signal composed of several input planes.

lp_pool2d

Apply a 2D power-average pooling over an input signal composed of several input planes.

lp_pool3d

Apply a 3D power-average pooling over an input signal composed of several input planes.

adaptive_max_pool1d

Applies a 1D adaptive max pooling over an input signal composed of several input planes.

adaptive_max_pool2d

Applies a 2D adaptive max pooling over an input signal composed of several input planes.

adaptive_max_pool3d

Applies a 3D adaptive max pooling over an input signal composed of several input planes.

adaptive_avg_pool1d

Applies a 1D adaptive average pooling over an input signal composed of several input planes.

adaptive_avg_pool2d

Apply a 2D adaptive average pooling over an input signal composed of several input planes.

adaptive_avg_pool3d

Apply a 3D adaptive average pooling over an input signal composed of several input planes.

fractional_max_pool2d

Applies 2D fractional max pooling over an input signal composed of several input planes.

fractional_max_pool3d

Applies 3D fractional max pooling over an input signal composed of several input planes.

Attention Mechanisms

The torch.nn.attention.bias module contains attention biases designed to be used with scaled_dot_product_attention.

scaled_dot_product_attention

Computes scaled dot product attention on query, key and value tensors, using an optional attention mask if passed, and applying dropout if a probability greater than 0.0 is specified.
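
A minimal sketch of a causal attention call (the shapes here are arbitrary):

>>> q = torch.randn(2, 8, 16, 64)  # (batch, heads, seq_len, head_dim)
>>> k = torch.randn(2, 8, 16, 64)
>>> v = torch.randn(2, 8, 16, 64)
>>> F.scaled_dot_product_attention(q, k, v, is_causal=True).shape
torch.Size([2, 8, 16, 64])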

Non-linear activation functions

threshold

Apply a threshold to each element of the input Tensor.

threshold_

In-place version of threshold().

relu

Applies the rectified linear unit function element-wise.

relu_

In-place version of relu().

hardtanh

Applies the HardTanh function element-wise.

hardtanh_

In-place version of hardtanh().

hardswish

Apply hardswish function, element-wise.

relu6

Applies the element-wise function $\text{ReLU6}(x) = \min(\max(0, x), 6)$.

elu

Apply the Exponential Linear Unit (ELU) function element-wise.

elu_

In-place version of elu().

selu

Applies element-wise, $\text{SELU}(x) = scale * (\max(0, x) + \min(0, \alpha * (\exp(x) - 1)))$, with $\alpha = 1.6732632423543772848170429916717$ and $scale = 1.0507009873554804934193349852946$.

celu

Applies element-wise, $\text{CELU}(x) = \max(0, x) + \min(0, \alpha * (\exp(x / \alpha) - 1))$.

leaky_relu

Applies element-wise, $\text{LeakyReLU}(x) = \max(0, x) + \text{negative\_slope} * \min(0, x)$.

leaky_relu_

In-place version of leaky_relu().

prelu

Applies element-wise the function $\text{PReLU}(x) = \max(0, x) + \text{weight} * \min(0, x)$, where weight is a learnable parameter.

rrelu

Randomized leaky ReLU.

rrelu_

In-place version of rrelu().

glu

The gated linear unit.

gelu

When the approximate argument is 'none', it applies element-wise the function $\text{GELU}(x) = x * \Phi(x)$, where $\Phi(x)$ is the cumulative distribution function of the standard Gaussian distribution.

logsigmoid

Applies element-wise $\text{LogSigmoid}(x_i) = \log\left(\frac{1}{1 + \exp(-x_i)}\right)$.

hardshrink

Applies the hard shrinkage function element-wise.

tanhshrink

Applies element-wise, $\text{Tanhshrink}(x) = x - \text{Tanh}(x)$.

softsign

Applies element-wise, the function $\text{SoftSign}(x) = \frac{x}{1 + |x|}$.

softplus

Applies element-wise, the function $\text{Softplus}(x) = \frac{1}{\beta} * \log(1 + \exp(\beta * x))$.

softmin

Apply a softmin function.

softmax

Apply a softmax function.
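
For example (softmin and log_softmax take the same dim argument):

>>> x = torch.tensor([1.0, 2.0, 3.0])
>>> F.softmax(x, dim=0)  # entries are positive and sum to 1 along dim
tensor([0.0900, 0.2447, 0.6652])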

softshrink

Applies the soft shrinkage function element-wise.

gumbel_softmax

Sample from the Gumbel-Softmax distribution and optionally discretize.
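
A minimal sketch of soft versus hard (straight-through) sampling:

>>> logits = torch.randn(4, 10)
>>> y_soft = F.gumbel_softmax(logits, tau=1.0)             # differentiable soft samples
>>> y_hard = F.gumbel_softmax(logits, tau=1.0, hard=True)  # one-hot, straight-through gradients
>>> y_hard.sum(dim=-1)
tensor([1., 1., 1., 1.])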

log_softmax

Apply a softmax followed by a logarithm.

tanh

Applies element-wise, $\text{Tanh}(x) = \tanh(x) = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)}$.

sigmoid

Applies the element-wise function $\text{Sigmoid}(x) = \frac{1}{1 + \exp(-x)}$.

hardsigmoid

Apply the Hardsigmoid function element-wise.

silu

Apply the Sigmoid Linear Unit (SiLU) function, element-wise.

mish

Apply the Mish function, element-wise.

batch_norm

Apply Batch Normalization for each channel across a batch of data.

group_norm

Apply Group Normalization, normalizing within each group of channels.

instance_norm

Apply Instance Normalization independently for each channel in every data sample within a batch.

layer_norm

Apply Layer Normalization over the last dimensions specified by normalized_shape.

local_response_norm

Apply local response normalization over an input signal.

rms_norm

Apply Root Mean Square Layer Normalization.

normalize

Perform $L_p$ normalization of inputs over the specified dimension.
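
For example, L2-normalizing rows:

>>> x = torch.tensor([[3.0, 4.0]])
>>> F.normalize(x, p=2, dim=1)  # each row divided by its L2 norm (here 5)
tensor([[0.6000, 0.8000]])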

Linear functions

linear

Applies a linear transformation to the incoming data: $y = xA^T + b$.

bilinear

Applies a bilinear transformation to the incoming data: $y = x_1^T A x_2 + b$.

Dropout functions

dropout

During training, randomly zeroes some elements of the input tensor with probability p.

alpha_dropout

Apply alpha dropout to the input.

feature_alpha_dropout

Randomly masks out entire channels (a channel is a feature map).

dropout1d

Randomly zero out entire channels (a channel is a 1D feature map).

dropout2d

Randomly zero out entire channels (a channel is a 2D feature map).

dropout3d

Randomly zero out entire channels (a channel is a 3D feature map).

Sparse functions

embedding

Generate a simple lookup table that retrieves embeddings from a fixed-size embedding matrix by index.
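
A minimal sketch with a hypothetical 10-row embedding matrix:

>>> weight = torch.randn(10, 3)           # 10 entries, embedding dim 3
>>> idx = torch.tensor([[1, 2], [4, 5]])
>>> F.embedding(idx, weight).shape        # one 3-vector per index
torch.Size([2, 2, 3])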

embedding_bag

Compute sums, means or maxes of bags of embeddings.

one_hot

Takes a LongTensor with index values of shape (*) and returns a tensor of shape (*, num_classes) that has zeros everywhere except where the index of the last dimension matches the corresponding value of the input tensor, in which case it will be 1.
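
For example:

>>> F.one_hot(torch.tensor([0, 2, 1]), num_classes=4)
tensor([[1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0]])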

Distance functions

pairwise_distance

See torch.nn.PairwiseDistance for details.

cosine_similarity

Returns cosine similarity between x1 and x2, computed along dim.

pdist

Computes the p-norm distance between every pair of row vectors in the input.

Loss functions

binary_cross_entropy

Measure Binary Cross Entropy between the target and input probabilities.

binary_cross_entropy_with_logits

Calculate Binary Cross Entropy between target and input logits.
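
The logits variant is numerically more stable but computes the same quantity; a minimal sketch:

>>> logits = torch.randn(4)
>>> target = torch.rand(4)  # target probabilities in [0, 1]
>>> a = F.binary_cross_entropy_with_logits(logits, target)
>>> torch.allclose(a, F.binary_cross_entropy(torch.sigmoid(logits), target))
True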

poisson_nll_loss

Poisson negative log likelihood loss.

cosine_embedding_loss

See CosineEmbeddingLoss for details.

cross_entropy

Compute the cross entropy loss between input logits and target.
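
cross_entropy expects raw logits and combines log_softmax with nll_loss; a minimal sketch:

>>> logits = torch.randn(3, 5)         # (batch, num_classes), unnormalized scores
>>> target = torch.tensor([1, 0, 4])   # class indices
>>> loss = F.cross_entropy(logits, target)
>>> torch.allclose(loss, F.nll_loss(F.log_softmax(logits, dim=1), target))
True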

ctc_loss

Apply the Connectionist Temporal Classification loss.

gaussian_nll_loss

Gaussian negative log likelihood loss.

hinge_embedding_loss

See HingeEmbeddingLoss for details.

kl_div

Compute the KL Divergence loss.

l1_loss

Function that takes the mean element-wise absolute value difference.

mse_loss

Measures the element-wise mean squared error, with optional weighting.

margin_ranking_loss

See MarginRankingLoss for details.

multilabel_margin_loss

See MultiLabelMarginLoss for details.

multilabel_soft_margin_loss

See MultiLabelSoftMarginLoss for details.

multi_margin_loss

See MultiMarginLoss for details.

nll_loss

Compute the negative log likelihood loss.

huber_loss

Computes the Huber loss, with optional weighting.

smooth_l1_loss

Compute the Smooth L1 loss.

soft_margin_loss

See SoftMarginLoss for details.

triplet_margin_loss

Compute the triplet margin loss between the given input tensors, using a margin greater than 0.

triplet_margin_with_distance_loss

Compute the triplet margin loss for input tensors using a custom distance function.

Vision functions

pixel_shuffle

Rearranges elements in a tensor of shape $(*, C \times r^2, H, W)$ to a tensor of shape $(*, C, H \times r, W \times r)$, where r is the upscale_factor.

pixel_unshuffle

Reverses the PixelShuffle operation by rearranging elements in a tensor of shape $(*, C, H \times r, W \times r)$ to a tensor of shape $(*, C \times r^2, H, W)$, where r is the downscale_factor.
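
Since both operations are pure rearrangements, they invert each other exactly; a minimal sketch with r = 2:

>>> x = torch.randn(1, 8, 4, 4)               # C = 2 * r**2
>>> y = F.pixel_shuffle(x, upscale_factor=2)
>>> y.shape
torch.Size([1, 2, 8, 8])
>>> torch.equal(F.pixel_unshuffle(y, downscale_factor=2), x)
True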

pad

Pads tensor.
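
For example (the pad tuple starts from the last dimension):

>>> x = torch.randn(1, 3, 4, 4)
>>> # (left, right) for the last dim, then (top, bottom) for the one before it
>>> F.pad(x, (1, 1, 2, 2)).shape
torch.Size([1, 3, 8, 6])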

interpolate

Down/up samples the input to either the given size or the given scale_factor.
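
Either a target size or a scale_factor may be given, but not both; a sketch:

>>> x = torch.randn(1, 3, 8, 8)
>>> F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=False).shape
torch.Size([1, 3, 16, 16])
>>> F.interpolate(x, size=(5, 5), mode='nearest').shape
torch.Size([1, 3, 5, 5])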

upsample

Upsample input.

upsample_nearest

Upsamples the input, using nearest neighbours' pixel values.

upsample_bilinear

Upsamples the input, using bilinear upsampling.

grid_sample

Given an input and a flow-field grid, computes the output using input values and pixel locations from the grid.

affine_grid

Generate 2D or 3D flow field (sampling grid), given a batch of affine matrices theta.
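
affine_grid composes with grid_sample; with an identity theta the round trip reproduces the input up to floating-point error. A minimal sketch:

>>> x = torch.randn(1, 1, 5, 5)
>>> theta = torch.tensor([[[1.0, 0.0, 0.0],
...                        [0.0, 1.0, 0.0]]])  # identity affine transform, (N, 2, 3)
>>> grid = F.affine_grid(theta, size=(1, 1, 5, 5), align_corners=False)
>>> torch.allclose(F.grid_sample(x, grid, align_corners=False), x, atol=1e-6)
True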

DataParallel functions (multi-GPU, distributed)

data_parallel

torch.nn.parallel.data_parallel

Evaluate module(input) in parallel across the GPUs given in device_ids.
