Built-In Layers

Containers

# Lux.BranchLayer (Type)
julia
BranchLayer(layers...)
BranchLayer(; name=nothing, layers...)

Takes an input x and passes it through all the layers and returns a tuple of the outputs.

Arguments

  • Layers can be specified in two formats:
    • A list of N Lux layers

    • Specified as N keyword arguments.

Keyword Arguments

  • name: Name of the layer (optional)

Inputs

  • x: Will be directly passed to each of the layers

Returns

  • Tuple: (layer_1(x), layer_2(x), ..., layer_N(x)) (naming changes if using the kwargs API)

  • Updated state of the layers

Parameters

  • Parameters of each layer wrapped in a NamedTuple with fields = layer_1, layer_2, ..., layer_N (naming changes if using the kwargs API)

States

  • States of each layer wrapped in a NamedTuple with fields = layer_1, layer_2, ..., layer_N (naming changes if using the kwargs API)

Comparison with Parallel

This is slightly different from Parallel(nothing, layers...)

  • If the input is a tuple, Parallel passes one element of the tuple to each layer (the i-th element goes to the i-th layer).

  • BranchLayer essentially assumes 1 input comes in and is branched out into N outputs.

Example

An easy way to replicate an input to an NTuple is to do

julia
julia> BranchLayer(NoOpLayer(), NoOpLayer(), NoOpLayer())
BranchLayer(
    layer_1 = NoOpLayer(),
    layer_2 = NoOpLayer(),
    layer_3 = NoOpLayer(),
)         # Total: 0 parameters,
          #        plus 0 states.
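
As a hedged usage sketch, the following branches one input into two `Dense` layers. It assumes a Lux release matching this page, where `Lux.setup(rng, model)` returns the parameters and states and a layer is called as `model(x, ps, st)`. The layer sizes and variable names are illustrative only.

```julia
using Lux, Random

rng = Random.default_rng()
Random.seed!(rng, 0)

# One input, two branches with different output sizes (illustrative sizes)
model = BranchLayer(Dense(2 => 3), Dense(2 => 4))
ps, st = Lux.setup(rng, model)

x = rand(rng, Float32, 2, 5)           # 2 features, batch of 5
(y1, y2), st_new = model(x, ps, st)    # tuple of outputs plus updated state

println(size(y1), " ", size(y2))
println(keys(ps))                      # parameter fields are named per layer
```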


# Lux.Chain (Type)
julia
Chain(layers...; name=nothing, disable_optimizations::Bool = false)
Chain(; layers..., name=nothing, disable_optimizations::Bool = false)

Collects multiple layers / functions to be called in sequence on a given input.

Arguments

  • Layers can be specified in two formats:
    • A list of N Lux layers

    • Specified as N keyword arguments.

Keyword Arguments

  • disable_optimizations: Prevents any structural optimization

  • name: Name of the layer (optional)

Inputs

Input x is passed sequentially to each layer, and must conform to the input requirements of the internal layers.

Returns

  • Output after sequentially applying all the layers to x

  • Updated model states

Parameters

  • Parameters of each layer wrapped in a NamedTuple with fields = layer_1, layer_2, ..., layer_N (naming changes if using the kwargs API)

States

  • States of each layer wrapped in a NamedTuple with fields = layer_1, layer_2, ..., layer_N (naming changes if using the kwargs API)

Optimizations

Performs a few optimizations to generate reasonable architectures. Can be disabled using keyword argument disable_optimizations.

  • All sublayers are recursively optimized.

  • If a function f is passed as a layer and it doesn't take 3 inputs, it is converted to a WrappedFunction(f) which takes only one input.

  • If the layer is a Chain, it is flattened.

  • NoOpLayers are removed.

  • If there is only 1 layer (left after optimizations), then it is returned without the Chain wrapper.

  • If there are no layers (left after optimizations), a NoOpLayer is returned.

Miscellaneous Properties

  • Allows indexing. We can access the ith layer using m[i]. We can also index using ranges or arrays.

Example

julia
julia> Chain(Dense(2, 3, relu), BatchNorm(3), Dense(3, 2))
Chain(
    layer_1 = Dense(2 => 3, relu),      # 9 parameters
    layer_2 = BatchNorm(3, affine=true, track_stats=true),  # 6 parameters, plus 7
    layer_3 = Dense(3 => 2),            # 8 parameters
)         # Total: 23 parameters,
          #        plus 7 states.
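
A minimal sketch of running a `Chain` and inspecting it, assuming the usual `Lux.setup` / `model(x, ps, st)` calling convention. The sizes are illustrative, and the sketch only relies on the indexing and parameter naming described above.

```julia
using Lux, Random

rng = Random.default_rng()
Random.seed!(rng, 0)

model = Chain(Dense(2 => 3, relu), Dense(3 => 2))
ps, st = Lux.setup(rng, model)

x = rand(rng, Float32, 2, 4)           # 2 features, batch of 4
y, st_new = model(x, ps, st)           # layers are applied in sequence

first_layer = model[1]                 # indexing returns the contained layer
w1 = ps.layer_1.weight                 # parameters are grouped per layer
println(size(y), " ", size(w1))
```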


# Lux.PairwiseFusion (Type)
julia
PairwiseFusion(connection, layers...; name=nothing)
PairwiseFusion(connection; name=nothing, layers...)
x1 → layer1 → y1 ↘
                  connection → layer2 → y2 ↘
              x2 ↗                          connection → y3
                                        x3 ↗

Arguments

  • connection: Takes 2 inputs and combines them

  • layers: AbstractExplicitLayers. Layers can be specified in two formats:

    • A list of N Lux layers

    • Specified as N keyword arguments.

Keyword Arguments

  • name: Name of the layer (optional)

Inputs

Layer behaves differently based on input type:

  1. If the input x is a tuple of length N + 1, then the layers must be a tuple of length N. The computation is as follows
julia
y = x[1]
for i in 1:N
    y = connection(x[i + 1], layers[i](y))
end
  2. Any other kind of input
julia
y = x
for i in 1:N
    y = connection(x, layers[i](y))
end
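
The two cases above can be traced with a plain-Julia sketch in which ordinary functions stand in for Lux layers (parameters and states are omitted). The helper name `pairwise_fusion` and the toy layers are hypothetical and only mirror the loops shown above; this is not Lux's implementation.

```julia
# Tuple input of length N + 1: x[1] seeds the computation and x[i + 1] is
# fused with the output of the i-th layer
function pairwise_fusion(connection, layers, x::Tuple)
    @assert length(x) == length(layers) + 1
    y = x[1]
    for i in eachindex(layers)
        y = connection(x[i + 1], layers[i](y))
    end
    return y
end

# Any other input: the same x is fused with every layer output
function pairwise_fusion(connection, layers, x)
    y = x
    for layer in layers
        y = connection(x, layer(y))
    end
    return y
end

layers = (v -> 2 .* v, v -> v .+ 1)    # toy stand-ins for two layers

y_tuple = pairwise_fusion(+, layers, ([1.0], [10.0], [100.0]))
y_single = pairwise_fusion(+, layers, [1.0])
println(y_tuple, " ", y_single)
```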

Returns

  • See Inputs section for how the return value is computed

  • Updated model state for all the contained layers

Parameters

  • Parameters of each layer wrapped in a NamedTuple with fields = layer_1, layer_2, ..., layer_N (naming changes if using the kwargs API)

States

  • States of each layer wrapped in a NamedTuple with fields = layer_1, layer_2, ..., layer_N (naming changes if using the kwargs API)


# Lux.Parallel (Type)
julia
Parallel(connection, layers...; name=nothing)
Parallel(connection; name=nothing, layers...)

Create a layer which passes an input to each path in layers, before reducing the output with connection.

Arguments

  • connection: An N-argument function that is called after passing the input through each layer. If connection = nothing, a tuple is returned: Parallel(nothing, f, g)(x, y) = (f(x), g(y))

  • Layers can be specified in two formats:

    • A list of N Lux layers

    • Specified as N keyword arguments.

Keyword Arguments

  • name: Name of the layer (optional)

Inputs

  • x: If x is not a tuple, the output is computed as connection([l(x) for l in layers]...). Otherwise, one element of the tuple is passed to each layer, so that Parallel(+, f, g)(x, y) = f(x) + g(y).

Returns

  • See the Inputs section for how the output is computed

  • Updated state of the layers

Parameters

  • Parameters of each layer wrapped in a NamedTuple with fields = layer_1, layer_2, ..., layer_N (naming changes if using the kwargs API)

States

  • States of each layer wrapped in a NamedTuple with fields = layer_1, layer_2, ..., layer_N (naming changes if using the kwargs API)

See also SkipConnection which is Parallel with one identity.
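
A short sketch of the two input modes, assuming the standard `Lux.setup` / `model(x, ps, st)` convention; layer sizes and variable names are illustrative.

```julia
using Lux, Random

rng = Random.default_rng()
Random.seed!(rng, 0)

model = Parallel(+, Dense(2 => 3), Dense(2 => 3))
ps, st = Lux.setup(rng, model)
x = rand(rng, Float32, 2, 5)

y_shared, _ = model(x, ps, st)             # non-tuple input: both layers see x
y_split, _ = model((x, 2 .* x), ps, st)    # tuple input: one element per layer

# With connection = nothing the per-layer outputs come back as a tuple
branches = Parallel(nothing, Dense(2 => 3), Dense(2 => 4))
ps_b, st_b = Lux.setup(rng, branches)
outs, _ = branches((x, x), ps_b, st_b)
println(size(y_shared), " ", map(size, outs))
```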


# Lux.SkipConnection (Type)
julia
SkipConnection(layer, connection; name=nothing)

Create a skip connection which consists of a layer or Chain of consecutive layers and a shortcut connection linking the block's input to the output through a user-supplied 2-argument callable. The first argument to the callable will be propagated through the given layer while the second is the unchanged, "skipped" input.

The simplest "ResNet"-type connection is just SkipConnection(layer, +).

Arguments

  • layer: Layer or Chain of layers to be applied to the input

  • connection:

    • A 2-argument function that takes layer(input) and the input OR

    • An AbstractExplicitLayer that takes (layer(input), input) as input

Keyword Arguments

  • name: Name of the layer (optional)

Inputs

  • x: Will be passed directly to layer

Returns

  • Output of connection(layer(input), input)

  • Updated state of layer

Parameters

  • Parameters of layer OR

  • If connection is an AbstractExplicitLayer, then NamedTuple with fields :layers and :connection

States

  • States of layer OR

  • If connection is an AbstractExplicitLayer, then NamedTuple with fields :layers and :connection

See Parallel for a more general implementation.
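
As an illustration of the "ResNet"-type case, the sketch below compares `SkipConnection(layer, +)` against applying the layer by hand. It relies on the statement above that, for a plain function `connection`, the parameters and states are those of `layer`; sizes are illustrative.

```julia
using Lux, Random

rng = Random.default_rng()
Random.seed!(rng, 0)

layer = Dense(3 => 3, tanh)
model = SkipConnection(layer, +)
ps, st = Lux.setup(rng, model)

x = rand(rng, Float32, 3, 4)
y, _ = model(x, ps, st)

# With a function as `connection`, `ps` and `st` belong to `layer`, so the
# wrapped layer can be applied on its own for comparison
y_layer, _ = layer(x, ps, st)
println(y ≈ y_layer .+ x)
```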


# Lux.RepeatedLayer (Type)
julia
RepeatedLayer(model; repeats::Val = Val(10), input_injection::Val = Val(false))

Iteratively applies model repeats times. The initial input is passed to the model on every iteration if input_injection = Val(true). This layer unrolls the computation; semantically, however, it is the same as:

  1. input_injection = Val(false)
julia
res = x
for i in 1:repeats
    res, st = model(res, ps, st)
end
  2. input_injection = Val(true)
julia
res = x
for i in 1:repeats
    res, st = model((res, x), ps, st)
end

repeats is expected to be a reasonably small number, below 20. Beyond that, compile times for gradients might be unreasonably high.

Arguments

  • model must be an AbstractExplicitLayer

Keyword Arguments

  • repeats: Number of times to apply the model

  • input_injection: If true, then the input is passed to the model along with the output

Inputs

  • x: Input as described above

Returns

  • Output is computed as described above

  • Updated state of the model

Parameters

  • Parameters of model

States

  • State of model
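
The unrolled semantics for input_injection = Val(false) can be checked against an explicit loop. This is a sketch under the assumption, stated above, that the parameters and states of the `RepeatedLayer` are those of the wrapped model; the helper `apply_repeatedly` is hypothetical.

```julia
using Lux, Random

# Reference loop from the description, written as a function so that `res`
# and `st` are rebound in local scope
function apply_repeatedly(model, x, ps, st, repeats)
    res = x
    for _ in 1:repeats
        res, st = model(res, ps, st)
    end
    return res, st
end

rng = Random.default_rng()
Random.seed!(rng, 0)

inner = Dense(2 => 2, tanh)
model = RepeatedLayer(inner; repeats=Val(3))
ps, st = Lux.setup(rng, model)

x = rand(rng, Float32, 2, 4)
y, _ = model(x, ps, st)
y_ref, _ = apply_repeatedly(inner, x, ps, st, 3)
println(y ≈ y_ref)
```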


Convolutional Layers

# Lux.Conv (Type)
julia
Conv(k::NTuple{N,Integer}, (in_chs => out_chs)::Pair{<:Integer,<:Integer},
     activation=identity; init_weight=glorot_uniform, init_bias=zeros32, stride=1,
     pad=0, dilation=1, groups=1, use_bias=true)

Standard convolutional layer.

Image data should be stored in WHCN order (width, height, channels, batch). In other words, a 100 x 100 RGB image would be a 100 x 100 x 3 x 1 array, and a batch of 50 would be a 100 x 100 x 3 x 50 array. This has N = 2 spatial dimensions, and needs a kernel size like (5, 5), a 2-tuple of integers. To take convolutions along N feature dimensions, this layer expects as input an array with ndims(x) == N + 2, where size(x, N + 1) == in_chs is the number of input channels, and size(x, ndims(x)) is the number of observations in a batch.

Warning

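Frameworks like PyTorch perform cross-correlation in their convolution layers, whereas this layer performs a true convolution (the kernel is flipped). See CrossCor for the cross-correlation variant.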

Arguments

  • k: Tuple of integers specifying the size of the convolutional kernel. Eg, for 2D convolutions length(k) == 2

  • in_chs: Number of input channels

  • out_chs: Number of output channels

  • activation: Activation Function

Keyword Arguments

  • init_weight: Controls the initialization of the weight parameter

  • init_bias: Controls the initialization of the bias parameter

  • stride: Either a single integer or a tuple of N integers

  • dilation: Either a single integer or a tuple of N integers

  • pad: Specifies the number of elements added to the borders of the data array. It can be

    • a single integer for equal padding all around,

    • a tuple of N integers, to apply the same padding at begin/end of each spatial dimension,

    • a tuple of 2*N integers, for asymmetric padding, or

    • the singleton SamePad(), to calculate padding such that size(output,d) == size(x,d) / stride (possibly rounded) for each spatial dimension.

    • Periodic padding can be achieved by placing a WrappedFunction(x -> NNlib.pad_circular(x, N_pad; dims=pad_dims)) before the layer

  • groups: Expected to be an Int. It specifies the number of groups to divide a convolution into (set groups = in_chs for Depthwise Convolutions). in_chs and out_chs must be divisible by groups.

  • use_bias: Trainable bias can be disabled entirely by setting this to false.

  • allow_fast_activation: If true, then certain activations can be approximated with a faster version. The new activation function will be given by NNlib.fast_act(activation)

Inputs

  • x: Data satisfying ndims(x) == N + 2 && size(x, N + 1) == in_chs, i.e. size(x) = (I_N, ..., I_1, C_in, N)

Returns

  • Output of the convolution y of size (O_N, ..., O_1, C_out, N) where

$$O_i = \left\lfloor \frac{I_i + p_i + p_{(i + N) \% |p|} - d_i \times (k_i - 1) - 1}{s_i} \right\rfloor + 1$$

  • Empty NamedTuple()

Parameters

  • weight: Convolution kernel

  • bias: Bias (present if use_bias=true)
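
A hedged example of the output size: the helper `conv_out_size` is hypothetical (not part of Lux) and implements the standard convolution size arithmetic for one spatial dimension, floor((I + p_lo + p_hi - d(k - 1) - 1) / s) + 1. It is compared against an actual `Conv` layer with illustrative sizes.

```julia
using Lux, Random

# Standard convolution size arithmetic along one spatial dimension
function conv_out_size(I, k; stride=1, pad=(0, 0), dilation=1)
    return fld(I + pad[1] + pad[2] - dilation * (k - 1) - 1, stride) + 1
end

rng = Random.default_rng()
Random.seed!(rng, 0)

layer = Conv((5, 5), 3 => 8, relu; stride=2, pad=2)
ps, st = Lux.setup(rng, layer)

x = rand(rng, Float32, 100, 100, 3, 2)      # WHCN: 100 x 100 RGB, batch of 2
y, _ = layer(x, ps, st)

expected = conv_out_size(100, 5; stride=2, pad=(2, 2))
println(size(y), " expected spatial size: ", expected)
```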


# Lux.ConvTranspose (Type)
julia
ConvTranspose(k::NTuple{N,Integer}, (in_chs => out_chs)::Pair{<:Integer,<:Integer},
              activation=identity; init_weight=glorot_uniform, init_bias=zeros32,
              stride=1, pad=0, dilation=1, groups=1, use_bias=true)

Standard convolutional transpose layer.

Arguments

  • k: Tuple of integers specifying the size of the convolutional kernel. Eg, for 2D convolutions length(k) == 2

  • in_chs: Number of input channels

  • out_chs: Number of output channels

  • activation: Activation Function

Keyword Arguments

  • init_weight: Controls the initialization of the weight parameter

  • init_bias: Controls the initialization of the bias parameter

  • stride: Either a single integer or a tuple of N integers

  • dilation: Either a single integer or a tuple of N integers

  • pad: Specifies the number of elements added to the borders of the data array. It can be

    • a single integer for equal padding all around,

    • a tuple of N integers, to apply the same padding at begin/end of each spatial dimension,

    • a tuple of 2*N integers, for asymmetric padding, or

    • the singleton SamePad(), to calculate padding such that size(output,d) == size(x,d) * stride (possibly rounded) for each spatial dimension.

  • groups: Expected to be an Int. It specifies the number of groups to divide a convolution into (set groups = in_chs for Depthwise Convolutions). in_chs and out_chs must be divisible by groups.

  • use_bias: Trainable bias can be disabled entirely by setting this to false.

  • allow_fast_activation: If true, then certain activations can be approximated with a faster version. The new activation function will be given by NNlib.fast_act(activation)

Inputs

  • x: Data satisfying ndims(x) == N + 2 && size(x, N + 1) == in_chs, i.e. size(x) = (I_N, ..., I_1, C_in, N)

Returns

  • Output of the convolution transpose y of size (O_N, ..., O_1, C_out, N)

  • Empty NamedTuple()

Parameters

  • weight: Convolution Transpose kernel

  • bias: Bias (present if use_bias=true)
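
A small upsampling sketch, assuming the standard `Lux.setup` / `model(x, ps, st)` convention. With a 2 x 2 kernel, stride 2 and no padding, each spatial dimension is expected to double; the sizes are illustrative.

```julia
using Lux, Random

rng = Random.default_rng()
Random.seed!(rng, 0)

layer = ConvTranspose((2, 2), 3 => 4; stride=2)
ps, st = Lux.setup(rng, layer)

x = rand(rng, Float32, 8, 8, 3, 1)     # WHCN: 8 x 8, 3 channels, batch of 1
y, _ = layer(x, ps, st)
println(size(y))
```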


# Lux.CrossCor (Type)
julia
CrossCor(k::NTuple{N,Integer}, (in_chs => out_chs)::Pair{<:Integer,<:Integer},
         activation=identity; init_weight=glorot_uniform, init_bias=zeros32, stride=1,
         pad=0, dilation=1, use_bias=true)

Cross Correlation layer.

Image data should be stored in WHCN order (width, height, channels, batch). In other words, a 100 x 100 RGB image would be a 100 x 100 x 3 x 1 array, and a batch of 50 would be a 100 x 100 x 3 x 50 array. This has N = 2 spatial dimensions, and needs a kernel size like (5, 5), a 2-tuple of integers. To take convolutions along N feature dimensions, this layer expects as input an array with ndims(x) == N + 2, where size(x, N + 1) == in_chs is the number of input channels, and size(x, ndims(x)) is the number of observations in a batch.

Arguments

  • k: Tuple of integers specifying the size of the convolutional kernel. Eg, for 2D convolutions length(k) == 2

  • in_chs: Number of input channels

  • out_chs: Number of output channels

  • activation: Activation Function

Keyword Arguments

  • init_weight: Controls the initialization of the weight parameter

  • init_bias: Controls the initialization of the bias parameter

  • stride: Either a single integer or a tuple of N integers

  • dilation: Either a single integer or a tuple of N integers

  • pad: Specifies the number of elements added to the borders of the data array. It can be

    • a single integer for equal padding all around,

    • a tuple of N integers, to apply the same padding at begin/end of each spatial dimension,

    • a tuple of 2*N integers, for asymmetric padding, or

    • the singleton SamePad(), to calculate padding such that size(output,d) == size(x,d) / stride (possibly rounded) for each spatial dimension.

  • use_bias: Trainable bias can be disabled entirely by setting this to false.

  • allow_fast_activation: If true, then certain activations can be approximated with a faster version. The new activation function will be given by NNlib.fast_act(activation)

Inputs

  • x: Data satisfying ndims(x) == N + 2 && size(x, N + 1) == in_chs, i.e. size(x) = (I_N, ..., I_1, C_in, N)

Returns

  • Output of the convolution y of size (O_N, ..., O_1, C_out, N) where

$$O_i = \left\lfloor \frac{I_i + p_i + p_{(i + N) \% |p|} - d_i \times (k_i - 1) - 1}{s_i} \right\rfloor + 1$$

  • Empty NamedTuple()

Parameters

  • weight: Convolution kernel

  • bias: Bias (present if use_bias=true)
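
The difference between cross-correlation and convolution can be seen in a one-dimensional, plain-Julia sketch (no padding, stride 1). The helper names `crosscor1d` and `conv1d` are hypothetical and are not how Lux implements these layers; the point is only that convolution is cross-correlation with a reversed kernel.

```julia
# Cross-correlation: slide the kernel over x without flipping it
function crosscor1d(x, w)
    n = length(x) - length(w) + 1
    return [sum(x[i + j - 1] * w[j] for j in eachindex(w)) for i in 1:n]
end

# Convolution: the same operation with the kernel reversed
conv1d(x, w) = crosscor1d(x, reverse(w))

x = [1, 2, 3, 4]
w = [1, 0, -1]

cc = crosscor1d(x, w)
cv = conv1d(x, w)
println(cc, " ", cv)
```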


Dropout Layers

# Lux.AlphaDropout (Type)
julia
AlphaDropout(p::Real)

AlphaDropout layer.

Arguments

  • p: Probability of Dropout
    • if p = 0 then NoOpLayer is returned.

    • if p = 1 then WrappedFunction(Base.Fix1(broadcast, zero)) is returned.

Inputs

  • x: Must be an AbstractArray

Returns

  • x with dropout mask applied if training=Val(true) else just x

  • State with updated rng

States

  • rng: Pseudo Random Number Generator

  • training: Used to check if training/inference mode

Call Lux.testmode to switch to test mode.

See also Dropout, VariationalHiddenDropout


# Lux.Dropout (Type)
julia
Dropout(p; dims=:)

Dropout layer.

Arguments

  • p: Probability of Dropout (if p = 0 then NoOpLayer is returned)

Keyword Arguments

  • To apply dropout along certain dimension(s), specify the dims keyword. e.g. Dropout(p; dims = 3) will randomly zero out entire channels on WHCN input (also called 2D dropout).

Inputs

  • x: Must be an AbstractArray

Returns

  • x with dropout mask applied if training=Val(true) else just x

  • State with updated rng

States

  • rng: Pseudo Random Number Generator

  • training: Used to check if training/inference mode

Call Lux.testmode to switch to test mode.

See also AlphaDropout, VariationalHiddenDropout
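
A sketch of the training/test mode switch, assuming the standard `Lux.setup` convention and that states start in training mode. It also assumes the usual inverted-dropout scaling, where kept entries are multiplied by 1 / (1 - p), so with p = 0.5 an input of ones should map to values 0 or 2.

```julia
using Lux, Random

rng = Random.default_rng()
Random.seed!(rng, 0)

model = Dropout(0.5f0)
ps, st = Lux.setup(rng, model)
x = ones(Float32, 4, 1000)

y_train, st_train = model(x, ps, st)            # training mode: mask applied
y_test, _ = model(x, ps, Lux.testmode(st))      # test mode: input passes through

println(sum(iszero, y_train), " of ", length(y_train), " entries dropped")
```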


# Lux.VariationalHiddenDropout (Type)
julia
VariationalHiddenDropout(p; dims=:)

VariationalHiddenDropout layer. The only difference from Dropout is that the mask is retained until Lux.update_state(l, :update_mask, Val(true)) is called.

Arguments

  • p: Probability of Dropout (if p = 0 then NoOpLayer is returned)

Keyword Arguments

  • To apply dropout along certain dimension(s), specify the dims keyword. e.g. VariationalHiddenDropout(p; dims = 3) will randomly zero out entire channels on WHCN input (also called 2D dropout).

Inputs

  • x: Must be an AbstractArray

Returns

  • x with dropout mask applied if training=Val(true) else just x

  • State with updated rng

States

  • rng: Pseudo Random Number Generator

  • training: Used to check if training/inference mode

  • mask: Dropout mask. Initially set to nothing. After every run, contains the mask applied in that call

  • update_mask: Stores whether new mask needs to be generated in the current call

Call Lux.testmode to switch to test mode.

See also AlphaDropout, Dropout
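
The mask retention can be sketched as follows, assuming the state returned by one call is passed to the next and that `Lux.update_state` is applied to that state as described above. Sizes are illustrative.

```julia
using Lux, Random

rng = Random.default_rng()
Random.seed!(rng, 0)

model = VariationalHiddenDropout(0.5f0)
ps, st = Lux.setup(rng, model)
x = ones(Float32, 4, 8)

y1, st1 = model(x, ps, st)      # first call generates a mask
y2, st2 = model(x, ps, st1)     # second call reuses the stored mask

# Request a fresh mask for the next call
st3 = Lux.update_state(st2, :update_mask, Val(true))
y3, _ = model(x, ps, st3)
println(y1 == y2)
```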


Pooling Layers

# Lux.AdaptiveMaxPool (Type)
julia
AdaptiveMaxPool(out::NTuple)

Adaptive Max Pooling layer. Calculates the necessary window size such that its output has size(y)[1:N] == out.

Arguments

  • out: Size of the first N dimensions for the output

Inputs

  • x: Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions, after the N feature dimensions, where N = length(out).

Returns

  • Output of size (out..., C, N)

  • Empty NamedTuple()

See also MaxPool, AdaptiveMeanPool.


# Lux.AdaptiveMeanPool (Type)
julia
AdaptiveMeanPool(out::NTuple)

Adaptive Mean Pooling layer. Calculates the necessary window size such that its output has size(y)[1:N] == out.

Arguments

  • out: Size of the first N dimensions for the output

Inputs

  • x: Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions, after the N feature dimensions, where N = length(out).

Returns

  • Output of size (out..., C, N)

  • Empty NamedTuple()

See also MeanPool, AdaptiveMaxPool.


# Lux.GlobalMaxPool (Type)
julia
GlobalMaxPool()

Global Max Pooling layer. Transforms (w,h,c,b)-shaped input into (1,1,c,b)-shaped output, by performing max pooling on the complete (w,h)-shaped feature maps.

Inputs

  • x: Data satisfying ndims(x) > 2, i.e. size(x) = (I_N, ..., I_1, C, N)

Returns

  • Output of the pooling y of size (1, ..., 1, C, N)

  • Empty NamedTuple()

See also MaxPool, AdaptiveMaxPool, GlobalMeanPool


# Lux.GlobalMeanPool (Type)
julia
GlobalMeanPool()

Global Mean Pooling layer. Transforms (w,h,c,b)-shaped input into (1,1,c,b)-shaped output, by performing mean pooling on the complete (w,h)-shaped feature maps.

Inputs

  • x: Data satisfying ndims(x) > 2, i.e. size(x) = (I_N, ..., I_1, C, N)

Returns

  • Output of the pooling y of size (1, ..., 1, C, N)

  • Empty NamedTuple()

See also MeanPool, AdaptiveMeanPool, GlobalMaxPool


# Lux.MaxPool (Type)
julia
MaxPool(window::NTuple; pad=0, stride=window)

Max pooling layer, which replaces all pixels in a block of size window with the maximum value.

Arguments

  • window: Tuple of integers specifying the size of the window. Eg, for 2D pooling length(window) == 2

Keyword Arguments

  • stride: Either a single integer or a tuple of N integers

  • pad: Specifies the number of elements added to the borders of the data array. It can be

    • a single integer for equal padding all around,

    • a tuple of N integers, to apply the same padding at begin/end of each spatial dimension,

    • a tuple of 2*N integers, for asymmetric padding, or

    • the singleton SamePad(), to calculate padding such that size(output,d) == size(x,d) / stride (possibly rounded) for each spatial dimension.

Inputs

  • x: Data satisfying ndims(x) == N + 2, i.e. size(x) = (I_N, ..., I_1, C, N)

Returns

  • Output of the pooling y of size (O_N, ..., O_1, C, N) where

$$O_i = \left\lfloor \frac{I_i + p_i + p_{(i + N) \% |p|} - d_i \times (k_i - 1) - 1}{s_i} \right\rfloor + 1$$

  • Empty NamedTuple()

See also Conv, MeanPool, GlobalMaxPool, AdaptiveMaxPool


# Lux.MeanPool (Type)
julia
MeanPool(window::NTuple; pad=0, stride=window)

Mean pooling layer, which replaces all pixels in a block of size window with the mean value.

Arguments

  • window: Tuple of integers specifying the size of the window. Eg, for 2D pooling length(window) == 2

Keyword Arguments

  • stride: Either a single integer or a tuple of N integers

  • pad: Specifies the number of elements added to the borders of the data array. It can be

    • a single integer for equal padding all around,

    • a tuple of N integers, to apply the same padding at begin/end of each spatial dimension,

    • a tuple of 2*N integers, for asymmetric padding, or

    • the singleton SamePad(), to calculate padding such that size(output,d) == size(x,d) / stride (possibly rounded) for each spatial dimension.

Inputs

  • x: Data satisfying ndims(x) == N + 2, i.e. size(x) = (I_N, ..., I_1, C, N)

Returns

  • Output of the pooling y of size (O_N, ..., O_1, C, N) where

$$O_i = \left\lfloor \frac{I_i + p_i + p_{(i + N) \% |p|} - d_i \times (k_i - 1) - 1}{s_i} \right\rfloor + 1$$

  • Empty NamedTuple()

See also Conv, MaxPool, GlobalMeanPool, AdaptiveMeanPool
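
The pooling layers in this section can be compared on a tiny input. This is a sketch assuming the standard `Lux.setup` / `model(x, ps, st)` convention; the helper `apply_layer` is hypothetical and simply discards the (empty) returned state.

```julia
using Lux, Random

rng = Random.default_rng()

# Hypothetical convenience helper: set up a layer and return only its output
apply_layer(layer, x) = first(layer(x, Lux.setup(rng, layer)...))

x = reshape(Float32.(1:16), 4, 4, 1, 1)     # one 4 x 4 single-channel image

y_max = apply_layer(MaxPool((2, 2)), x)             # stride defaults to the window
y_mean = apply_layer(MeanPool((2, 2)), x)
y_global = apply_layer(GlobalMeanPool(), x)         # one value per channel
y_adaptive = apply_layer(AdaptiveMaxPool((2, 2)), x)

println(y_max[:, :, 1, 1])
println(y_mean[:, :, 1, 1])
```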


Recurrent Layers

# Lux.GRUCell (Type)
julia
GRUCell((in_dims, out_dims)::Pair{<:Int,<:Int}; use_bias=true, train_state::Bool=false,
        init_weight::Tuple{Function,Function,Function}=(glorot_uniform, glorot_uniform,
                                                        glorot_uniform),
        init_bias::Tuple{Function,Function,Function}=(zeros32, zeros32, zeros32),
        init_state::Function=zeros32)

Gated Recurrent Unit (GRU) Cell

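$$
\begin{aligned}
r &= \sigma(W_{ir} \times x + W_{hr} \times h_{prev} + b_{hr}) \\
z &= \sigma(W_{iz} \times x + W_{hz} \times h_{prev} + b_{hz}) \\
n &= \tanh(W_{in} \times x + b_{in} + r \cdot (W_{hn} \times h_{prev} + b_{hn})) \\
h_{new} &= (1 - z) \cdot n + z \cdot h_{prev}
\end{aligned}
$$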

Arguments

  • in_dims: Input Dimension

  • out_dims: Output (Hidden State) Dimension

  • use_bias: Set to false to deactivate bias

  • train_state: Trainable initial hidden state can be activated by setting this to true

  • init_bias: Initializer for bias. Must be a tuple containing 3 functions

  • init_weight: Initializer for weight. Must be a tuple containing 3 functions

  • init_state: Initializer for hidden state

Inputs

  • Case 1a: Only a single input x of shape (in_dims, batch_size), train_state is set to false - Creates a hidden state using init_state and proceeds to Case 2.

  • Case 1b: Only a single input x of shape (in_dims, batch_size), train_state is set to true - Repeats hidden_state from parameters to match the shape of x and proceeds to Case 2.

  • Case 2: A tuple (x, (h, )) is provided. The output and a tuple containing the updated hidden state are returned.

Returns

  • Tuple containing

    • Output $h_{new}$ of shape (out_dims, batch_size)

    • Tuple containing new hidden state $h_{new}$

  • Updated model state

Parameters

  • weight_i: Concatenated weights to map from input space $\{W_{ir}, W_{iz}, W_{in}\}$.

  • weight_h: Concatenated weights to map from hidden space $\{W_{hr}, W_{hz}, W_{hn}\}$.

  • bias_i: Bias vector ($b_{in}$; not present if use_bias=false).

  • bias_h: Concatenated bias vector for the hidden space $\{b_{hr}, b_{hz}, b_{hn}\}$ (not present if use_bias=false).

  • hidden_state: Initial hidden state vector (not present if train_state=false).

States

  • rng: Controls the randomness (if any) in the initial state generation
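
The GRU update can be written out as a plain-Julia sketch of a single step: $r$ and $z$ are sigmoid gates, $n = \tanh(W_{in} x + b_{in} + r \cdot (W_{hn} h_{prev} + b_{hn}))$ is the candidate state, and $h_{new} = (1 - z) \cdot n + z \cdot h_{prev}$. The function `gru_step` and its argument layout (named tuples with fields `r`, `z`, `n`) are hypothetical and do not mirror how Lux stores the concatenated parameters.

```julia
# Logistic sigmoid, written out so the sketch needs no packages
sigmoid(v) = 1 / (1 + exp(-v))

# One GRU step. W_i and W_h hold the input and hidden weight matrices per
# gate, b_h the hidden biases per gate, and b_in the input bias of the
# candidate gate (all names are illustrative).
function gru_step(x, h_prev, W_i, W_h, b_in, b_h)
    r = sigmoid.(W_i.r * x .+ W_h.r * h_prev .+ b_h.r)
    z = sigmoid.(W_i.z * x .+ W_h.z * h_prev .+ b_h.z)
    n = tanh.(W_i.n * x .+ b_in .+ r .* (W_h.n * h_prev .+ b_h.n))
    return (1 .- z) .* n .+ z .* h_prev
end

in_dims, out_dims = 3, 2
zero_w(cols) = (r=zeros(out_dims, cols), z=zeros(out_dims, cols), n=zeros(out_dims, cols))
zero_b = (r=zeros(out_dims), z=zeros(out_dims), n=zeros(out_dims))

x = ones(in_dims)
h_prev = [2.0, -4.0]

# With all weights and biases zero: r = z = 0.5 and n = 0, so the new hidden
# state is half of the previous one
h_new = gru_step(x, h_prev, zero_w(in_dims), zero_w(out_dims), zeros(out_dims), zero_b)
println(h_new)
```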


# Lux.LSTMCell (Type)
julia
LSTMCell(in_dims => out_dims; use_bias::Bool=true, train_state::Bool=false,
         train_memory::Bool=false,
         init_weight=(glorot_uniform, glorot_uniform, glorot_uniform, glorot_uniform),
         init_bias=(zeros32, zeros32, ones32, zeros32), init_state=zeros32,
         init_memory=zeros32)

Long Short-Term Memory (LSTM) Cell

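$$
\begin{aligned}
i &= \sigma(W_{ii} \times x + W_{hi} \times h_{prev} + b_{i}) \\
f &= \sigma(W_{if} \times x + W_{hf} \times h_{prev} + b_{f}) \\
g &= \tanh(W_{ig} \times x + W_{hg} \times h_{prev} + b_{g}) \\
o &= \sigma(W_{io} \times x + W_{ho} \times h_{prev} + b_{o}) \\
c_{new} &= f \cdot c_{prev} + i \cdot g \\
h_{new} &= o \cdot \tanh(c_{new})
\end{aligned}
$$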

Arguments

  • in_dims: Input Dimension

  • out_dims: Output (Hidden State & Memory) Dimension

  • use_bias: Set to false to deactivate bias

  • train_state: Trainable initial hidden state can be activated by setting this to true

  • train_memory: Trainable initial memory can be activated by setting this to true

  • init_bias: Initializer for bias. Must be a tuple containing 4 functions

  • init_weight: Initializer for weight. Must be a tuple containing 4 functions

  • init_state: Initializer for hidden state

  • init_memory: Initializer for memory

Inputs

  • Case 1a: Only a single input x of shape (in_dims, batch_size), train_state is set to false, train_memory is set to false - Creates a hidden state using init_state, hidden memory using init_memory and proceeds to Case 2.

  • Case 1b: Only a single input x of shape (in_dims, batch_size), train_state is set to true, train_memory is set to false - Repeats hidden_state vector from the parameters to match the shape of x, creates hidden memory using init_memory and proceeds to Case 2.

  • Case 1c: Only a single input x of shape (in_dims, batch_size), train_state is set to false, train_memory is set to true - Creates a hidden state using init_state, repeats the memory vector from parameters to match the shape of x and proceeds to Case 2.

  • Case 1d: Only a single input x of shape (in_dims, batch_size), train_state is set to true, train_memory is set to true - Repeats the hidden state and memory vectors from the parameters to match the shape of x and proceeds to Case 2.

  • Case 2: A tuple (x, (h, c)) is provided. The output and a tuple containing the updated hidden state and memory are returned.

Returns

  • Tuple Containing

    • Output $h_{new}$ of shape (out_dims, batch_size)

    • Tuple containing new hidden state $h_{new}$ and new memory $c_{new}$

  • Updated model state

Parameters

  • weight_i: Concatenated weights to map from input space $\{W_{ii}, W_{if}, W_{ig}, W_{io}\}$.

  • weight_h: Concatenated weights to map from hidden space $\{W_{hi}, W_{hf}, W_{hg}, W_{ho}\}$.

  • bias: Bias vector (not present if use_bias=false)

  • hidden_state: Initial hidden state vector (not present if train_state=false)

  • memory: Initial memory vector (not present if train_memory=false)

States

  • rng: Controls the randomness (if any) in the initial state generation
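
A usage sketch of Case 1a followed by Case 2, assuming the standard `Lux.setup` / `cell(x, ps, st)` convention; sizes and variable names are illustrative.

```julia
using Lux, Random

rng = Random.default_rng()
Random.seed!(rng, 0)

cell = LSTMCell(3 => 5)
ps, st = Lux.setup(rng, cell)
x = rand(rng, Float32, 3, 4)           # (in_dims, batch_size)

# Case 1a: only x is given, so h and c are created from init_state / init_memory
(y1, carry), st = cell(x, ps, st)
h1, c1 = carry

# Case 2: feed the carry back in together with the next input
(y2, (h2, c2)), st = cell((x, carry), ps, st)
println(size(y2), " ", size(c2))
```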


# Lux.RNNCell (Type)
julia
RNNCell(in_dims => out_dims, activation=tanh; bias::Bool=true,
        train_state::Bool=false, init_bias=zeros32, init_weight=glorot_uniform,
        init_state=ones32)

An Elman RNN cell with activation (typically set to tanh or relu).

$$h_{new} = activation(weight_{ih} \times x + weight_{hh} \times h_{prev} + bias)$$

Arguments

  • in_dims: Input Dimension

  • out_dims: Output (Hidden State) Dimension

  • activation: Activation function

  • bias: Set to false to deactivate bias

  • train_state: Trainable initial hidden state can be activated by setting this to true

  • init_bias: Initializer for bias

  • init_weight: Initializer for weight

  • init_state: Initializer for hidden state

Inputs

  • Case 1a: Only a single input x of shape (in_dims, batch_size), train_state is set to false - Creates a hidden state using init_state and proceeds to Case 2.

  • Case 1b: Only a single input x of shape (in_dims, batch_size), train_state is set to true - Repeats hidden_state from parameters to match the shape of x and proceeds to Case 2.

  • Case 2: A tuple (x, (h, )) is provided. The output and a tuple containing the updated hidden state are returned.

Returns

  • Tuple containing

    • Output $h_{new}$ of shape (out_dims, batch_size)

    • Tuple containing new hidden state $h_{new}$

  • Updated model state

Parameters

  • weight_ih: Maps the input to the hidden state.

  • weight_hh: Maps the hidden state to the hidden state.

  • bias: Bias vector (not present if use_bias=false)

  • hidden_state: Initial hidden state vector (not present if train_state=false)

States

  • rng: Controls the randomness (if any) in the initial state generation


# Lux.Recurrence (Type)
julia
Recurrence(cell;
    ordering::AbstractTimeSeriesDataBatchOrdering=BatchLastIndex(),
    return_sequence::Bool=false)

Wraps a recurrent cell (like RNNCell, LSTMCell, GRUCell) to automatically operate over a sequence of inputs.

Warning

This is completely distinct from Flux.Recur. It doesn't make the cell stateful, rather allows operating on an entire sequence of inputs at once. See StatefulRecurrentCell for functionality similar to Flux.Recur.

Arguments

  • cell: A recurrent cell. See RNNCell, LSTMCell, GRUCell, for how the inputs/outputs of a recurrent cell must be structured.

Keyword Arguments

  • return_sequence: If true returns the entire sequence of outputs, else returns only the last output. Defaults to false.

  • ordering: The ordering of the batch and time dimensions in the input. Defaults to BatchLastIndex(). Alternatively can be set to TimeLastIndex().

Inputs

  • If x is a
    • Tuple or Vector: Each element is fed to the cell sequentially.

    • Array (except a Vector): It is sliced along the penultimate dimension and each slice is fed to the cell sequentially.

Returns

  • Output of the cell for the entire sequence.

  • Updated state of the cell.

Parameters

  • Same as cell.

States

  • Same as cell.

Tip

Frameworks like TensorFlow have a special implementation, MultiRNNCell, to handle sequentially composed RNN cells. In Lux, one can simply stack multiple Recurrence blocks in a Chain to achieve the same.

Chain(
    Recurrence(RNNCell(inputsize => latentsize); return_sequence=true),
    Recurrence(RNNCell(latentsize => latentsize); return_sequence=true),
    :
    x -> stack(x; dims=2)
)

For some discussion on this topic, see https://github.com/LuxDL/Lux.jl/issues/472.
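
A sketch of running a `Recurrence` over an array input with the default `BatchLastIndex()` ordering, where the time axis is the penultimate dimension. It assumes, as stated above, that the parameters and states are the same as those of the cell, so one `(ps, st)` pair can be shared between the two wrappers; sizes are illustrative.

```julia
using Lux, Random

rng = Random.default_rng()
Random.seed!(rng, 0)

cell = RNNCell(3 => 5)
last_only = Recurrence(cell)                          # returns only the last output
full_seq = Recurrence(cell; return_sequence=true)     # returns every output
ps, st = Lux.setup(rng, last_only)

x = rand(rng, Float32, 3, 10, 4)       # (features, time, batch)
y_last, _ = last_only(x, ps, st)
ys, _ = full_seq(x, ps, st)

println(size(y_last), " ", length(ys))
```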


# Lux.StatefulRecurrentCell (Type)
julia
StatefulRecurrentCell(cell)

Wraps a recurrent cell (like RNNCell, LSTMCell, GRUCell) and makes it stateful.

Tip

This is very similar to Flux.Recur

To avoid undefined behavior, once the processing of a single sequence of data is complete, update the state with Lux.update_state(st, :carry, nothing).

Arguments

  • cell: A recurrent cell. See RNNCell, LSTMCell, GRUCell, for how the inputs/outputs of a recurrent cell must be structured.

Inputs

  • Input to the cell.

Returns

  • Output of the cell for the entire sequence.

  • Updated state of the cell and updated carry.

Parameters

  • Same as cell.

States

  • NamedTuple containing:
    • cell: Same as cell.

    • carry: The carry state of the cell.
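
A sketch of stepping through a short sequence and then resetting the carry as recommended above. It assumes the standard `Lux.setup` convention and that the returned state is threaded into the next call; sizes are illustrative.

```julia
using Lux, Random

rng = Random.default_rng()
Random.seed!(rng, 0)

model = StatefulRecurrentCell(RNNCell(3 => 5))
ps, st = Lux.setup(rng, model)
xs = [rand(rng, Float32, 3, 4) for _ in 1:2]    # two time steps, batch of 4

y1, st = model(xs[1], ps, st)          # carry is stored in the returned state
y2, st = model(xs[2], ps, st)          # and picked up by the next call
carried = st.carry !== nothing

# Sequence finished: clear the carry before processing the next sequence
st = Lux.update_state(st, :carry, nothing)
println(size(y2), " ", carried)
```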


Linear Layers

# Lux.Bilinear (Type)
julia
Bilinear((in1_dims, in2_dims) => out, activation=identity; init_weight=glorot_uniform,
         init_bias=zeros32, use_bias::Bool=true, allow_fast_activation::Bool=true)
Bilinear(in12_dims => out, activation=identity; init_weight=glorot_uniform,
         init_bias=zeros32, use_bias::Bool=true, allow_fast_activation::Bool=true)

Creates a fully connected layer between two inputs and an output; otherwise it is similar to Dense. Given vectors x and y, its output is another vector z with, for all i in 1:out:

z[i] = activation(x' * W[i, :, :] * y + bias[i])

If x and y are matrices, then each column of the output z = B(x, y) is of this form, with B the Bilinear layer.
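
The formula above can be sketched in plain Julia without Lux. The parameter values and the helper name `bilinear` are hypothetical and chosen only for illustration; Lux creates the real parameters through init_weight and init_bias.

```julia
using LinearAlgebra  # three-argument dot, used only as a cross-check

in1_dims, in2_dims, out_dims = 3, 4, 2

# Hypothetical fixed parameters: weight of size (out_dims, in1_dims, in2_dims)
W = reshape(collect(1.0:(out_dims * in1_dims * in2_dims)), out_dims, in1_dims, in2_dims) ./ 10
bias = zeros(out_dims)

# z[i] = activation(x' * W[i, :, :] * y + bias[i]) for vectors x and y
bilinear(W, bias, x, y, activation=identity) =
    [activation(x' * W[i, :, :] * y + bias[i]) for i in axes(W, 1)]

x = [1.0, 0.0, -1.0]
y = [0.5, 1.0, 0.0, 2.0]
z = bilinear(W, bias, x, y, tanh)

# dot(x, A, y) computes x' * A * y, so both forms must agree
@assert z ≈ [tanh(dot(x, W[i, :, :], y)) for i in 1:out_dims]
```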

Arguments

  • in1_dims: number of input dimensions of x

  • in2_dims: number of input dimensions of y

  • in12_dims: If specified, then in1_dims = in2_dims = in12_dims

  • out: number of output dimensions

  • activation: activation function

Keyword Arguments

  • init_weight: initializer for the weight matrix (weight = init_weight(rng, out_dims, in1_dims, in2_dims))

  • init_bias: initializer for the bias vector (ignored if use_bias=false)

  • use_bias: Trainable bias can be disabled entirely by setting this to false

  • allow_fast_activation: If true, then certain activations can be approximated with a faster version. The new activation function will be given by NNlib.fast_act(activation)

Input

  • A 2-Tuple containing

    • x must be an AbstractArray with size(x, 1) == in1_dims

    • y must be an AbstractArray with size(y, 1) == in2_dims

  • If the input is an AbstractArray, then x = y

Returns

  • AbstractArray with dimensions (out_dims, size(x, 2))

  • Empty NamedTuple()

Parameters

  • weight: Weight Matrix of size (out_dims, in1_dims, in2_dims)

  • bias: Bias of size (out_dims, 1) (present if use_bias=true)

source


# Lux.DenseType.
julia
Dense(in_dims => out_dims, activation=identity; init_weight=glorot_uniform,
      init_bias=zeros32, use_bias::Bool=true, allow_fast_activation::Bool=true)

Create a traditional fully connected layer, whose forward pass is given by: y = activation.(weight * x .+ bias)

Arguments

  • in_dims: number of input dimensions

  • out_dims: number of output dimensions

  • activation: activation function

Keyword Arguments

  • init_weight: initializer for the weight matrix (weight = init_weight(rng, out_dims, in_dims))

  • init_bias: initializer for the bias vector (ignored if use_bias=false)

  • use_bias: Trainable bias can be disabled entirely by setting this to false

  • allow_fast_activation: If true, then certain activations can be approximated with a faster version. The new activation function will be given by NNlib.fast_act(activation)

Input

  • x must be an AbstractArray with size(x, 1) == in_dims

Returns

  • AbstractArray with dimensions (out_dims, ...) where ... are the dimensions of x

  • Empty NamedTuple()

Parameters

  • weight: Weight Matrix of size (out_dims, in_dims)

  • bias: Bias of size (out_dims, 1) (present if use_bias=true)
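
A minimal usage sketch, assuming Lux and Random are installed and that `Lux.setup(rng, model)` returns the parameter and state NamedTuples. The sizes are arbitrary. The default identity activation is used so the output can be compared directly against the formula above.

```julia
using Lux, Random

rng = Random.default_rng()
Random.seed!(rng, 0)

model = Dense(4 => 3)             # identity activation by default
ps, st = Lux.setup(rng, model)    # parameters and states live outside the layer

x = randn(rng, Float32, 4, 8)     # 4 input features, batch of 8
y, st_new = model(x, ps, st)      # a layer call returns (output, state)

# The output follows y = activation.(weight * x .+ bias) with activation = identity
@assert y ≈ ps.weight * x .+ ps.bias
```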

source


# Lux.EmbeddingType.
julia
Embedding(in_dims => out_dims; init_weight=randn32)

A lookup table that stores embeddings of dimension out_dims for a vocabulary of size in_dims.

This layer is often used to store word embeddings and retrieve them using indices.

Warning

Unlike Flux.Embedding, this layer does not support using OneHotArray as an input.

Arguments

  • in_dims: number of input dimensions

  • out_dims: number of output dimensions

Keyword Arguments

  • init_weight: initializer for the weight matrix (weight = init_weight(rng, out_dims, in_dims))

Input

  • Integer OR

  • Abstract Vector of Integers OR

  • Abstract Array of Integers

Returns

  • Returns the embedding corresponding to each index in the input. For an N dimensional input, an N + 1 dimensional output is returned.

  • Empty NamedTuple()
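
The lookup behaviour can be illustrated with plain array indexing. This is a sketch of the idea, not Lux's implementation; the table `W` below is a hypothetical stand-in for the layer's weight.

```julia
vocab_size, out_dims = 5, 3

# Hypothetical embedding table of size (out_dims, in_dims); column k embeds index k
W = reshape(collect(1.0:(out_dims * vocab_size)), out_dims, vocab_size)

e_single = W[:, 2]       # integer input: one embedding vector

idx = [1 5; 2 2]         # 2-dimensional array of indices
E = W[:, idx]            # N-dimensional input gives an (N + 1)-dimensional output
```

Here `E[:, i, j]` holds the embedding of `idx[i, j]`, which is why the output gains one leading dimension of length out_dims.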

source


# Lux.ScaleType.
julia
Scale(dims, activation=identity; init_weight=ones32, init_bias=zeros32, bias::Bool=true)

Create a Sparsely Connected Layer with a very specific structure (only Diagonal Elements are non-zero). The forward pass is given by: y = activation.(weight .* x .+ bias)
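
The forward pass can be sketched in plain Julia. The helper `scale` and the values are hypothetical; the sketch also shows why the layer corresponds to a dense layer whose weight matrix is diagonal.

```julia
using LinearAlgebra  # Diagonal

w = [2.0, 3.0, 4.0]
b = [0.5, 0.0, -0.5]

# y = activation.(weight .* x .+ bias)
scale(w, b, x, activation=identity) = activation.(w .* x .+ b)

x = [1.0, 1.0, 1.0]
y = scale(w, b, x)

# Same result as a dense layer with only diagonal entries non-zero
y_dense = Diagonal(w) * x .+ b

# Broadcasting extends over a trailing batch dimension
xb = ones(3, 5)
yb = scale(w, b, xb)
```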

Arguments

  • dims: size of the learnable scale and bias parameters.

  • activation: activation function

Keyword Arguments

  • init_weight: initializer for the weight array (weight = init_weight(rng, dims...))

  • init_bias: initializer for the bias vector (ignored if use_bias=false)

  • use_bias: Trainable bias can be disabled entirely by setting this to false

  • allow_fast_activation: If true, then certain activations can be approximated with a faster version. The new activation function will be given by NNlib.fast_act(activation)

Input

  • x must be an Array of size (dims..., B) or (dims...[0], ..., dims[k]) for k ≤ size(dims)

Returns

  • Array of size (dims..., B) or (dims...[0], ..., dims[k]) for k ≤ size(dims)

  • Empty NamedTuple()

Parameters

  • weight: Weight Array of size (dims...)

  • bias: Bias of size (dims...)

source


Misc. Helper Layers

# Lux.FlattenLayerType.
julia
FlattenLayer(N = nothing)

Flattens the passed array into a matrix.

Arguments

  • N: Flatten the first N dimensions of the input array. If nothing, then all dimensions (except the last) are flattened. Note that the batch dimension is never flattened.

Inputs

  • x: AbstractArray

Returns

  • AbstractMatrix of size (:, size(x, ndims(x)))

  • Empty NamedTuple()
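
For the default N = nothing, the returned size corresponds to a single reshape. A plain-Julia sketch, where the helper name `flatten` is hypothetical:

```julia
# Collapse every dimension except the last (batch) one
flatten(x::AbstractArray) = reshape(x, :, size(x, ndims(x)))

x = rand(Float32, 28, 28, 3, 16)   # e.g. a WHCN image batch
y = flatten(x)
```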

source


# Lux.MaxoutType.
julia
Maxout(layers...)
Maxout(; layers...)
Maxout(f::Function, n_alts::Int)

This contains a number of internal layers, each of which receives the same input. Its output is the elementwise maximum of the internal layers' outputs.

Maxout over linear dense layers satisfies the universal approximation theorem. See [1].

See also Parallel to reduce with other operators.
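
The reduction itself can be sketched in plain Julia, with ordinary functions standing in for the internal layers. The three functions and the helper `maxout` are hypothetical.

```julia
# Hypothetical stand-ins for the internal layers
layers = (x -> 2 .* x, x -> x .+ 1, x -> -x)

# Feed the same input to every layer, then reduce with an elementwise max
maxout(layers, x) = mapreduce(f -> f(x), (a, b) -> max.(a, b), layers)

x = [-2.0, 0.0, 3.0]
y = maxout(layers, x)
```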

Arguments

  • Layers can be specified in three formats:
    • A list of N Lux layers

    • Specified as N keyword arguments.

    • A no argument function f and an integer n_alts which specifies the number of layers.

Inputs

  • x: Input that is passed to each of the layers

Returns

  • Output is computed by taking elementwise max of the outputs of the individual layers.

  • Updated state of the layers

Parameters

  • Parameters of each layer wrapped in a NamedTuple with fields = layer_1, layer_2, ..., layer_N (naming changes if using the kwargs API)

States

  • States of each layer wrapped in a NamedTuple with fields = layer_1, layer_2, ..., layer_N (naming changes if using the kwargs API)

References

[1] Goodfellow, Warde-Farley, Mirza, Courville & Bengio "Maxout Networks" https://arxiv.org/abs/1302.4389

source


# Lux.NoOpLayerType.
julia
NoOpLayer()

As the name suggests, this layer does nothing except allow pretty printing of layers. Whatever input is passed is returned.

source


# Lux.ReshapeLayerType.
julia
ReshapeLayer(dims)

Reshapes the passed array to have a size of (dims..., :)

Arguments

  • dims: The new dimensions of the array (excluding the last dimension).

Inputs

  • x: AbstractArray of any shape which can be reshaped in (dims..., size(x, ndims(x)))

Returns

  • AbstractArray of size (dims..., size(x, ndims(x)))

  • Empty NamedTuple()
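
The size rule can be sketched in plain Julia. The helper `reshape_layer` is hypothetical: it keeps the last (batch) dimension and reshapes everything before it to dims.

```julia
reshape_layer(x, dims) = reshape(x, dims..., size(x, ndims(x)))

x = rand(Float32, 12, 5)        # 12 features, batch of 5
y = reshape_layer(x, (3, 4))    # 3 * 4 must equal 12
```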

source


# Lux.SelectDimType.
julia
SelectDim(dim, i)

Return a view of all the data of the input x where the index for dimension dim equals i. Equivalent to view(x,:,:,...,i,:,:,...) where i is in position dim.

Arguments

  • dim: Dimension for indexing

  • i: Index for dimension dim

Inputs

  • x: AbstractArray that can be indexed with view(x,:,:,...,i,:,:,...)

Returns

  • view(x,:,:,...,i,:,:,...) where i is in position dim

  • Empty NamedTuple()
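
The same indexing is available in Base Julia as selectdim, which makes the behaviour easy to check on a small array:

```julia
x = reshape(collect(1:24), 2, 3, 4)

# dim = 2, i = 3: equivalent to view(x, :, 3, :)
v = selectdim(x, 2, 3)
```

The selected dimension is dropped, so a (2, 3, 4) input yields a (2, 4) view.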

source


# Lux.WrappedFunctionType.
julia
WrappedFunction(f)

Wraps a stateless and parameterless function. Might be used when a function is added to Chain. For example, Chain(x -> relu.(x)) would not work and the right thing to do would be Chain((x, ps, st) -> (relu.(x), st)). An easier thing to do would be Chain(WrappedFunction(Base.Fix1(broadcast, relu))).
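
The two function forms mentioned above can be compared in plain Julia. The `relu` defined here is a local stand-in so that the sketch needs no packages.

```julia
relu(x) = max(zero(x), x)          # local stand-in for the usual relu

# Base.Fix1(broadcast, relu)(x) is broadcast(relu, x), i.e. relu.(x)
f = Base.Fix1(broadcast, relu)

# The explicit (x, ps, st) form: apply the function and pass the state through
g = (x, ps, st) -> (relu.(x), st)

x = [-1.0, 0.5, 2.0]
y, st = g(x, NamedTuple(), NamedTuple())
```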

Arguments

  • f::Function: A stateless and parameterless function

Inputs

  • x: such that hasmethod(f, (typeof(x),)) is true

Returns

  • Output of f(x)

  • Empty NamedTuple()

source


Normalization Layers

# Lux.BatchNormType.
julia
BatchNorm(chs::Integer, activation=identity; init_bias=zeros32, init_scale=ones32,
          affine=true, track_stats=true, epsilon=1f-5, momentum=0.1f0,
          allow_fast_activation::Bool=true)

Batch Normalization layer.

BatchNorm computes the mean and variance for each $D_1 \times \dots \times D_{N-2} \times 1 \times D_N$ input slice and normalises the input accordingly.
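
A plain-Julia sketch of the training-mode normalisation for a WHCN image batch, without the affine parameters or running statistics. It is an illustration under those assumptions, not Lux's implementation.

```julia
using Statistics

x = rand(Float32, 4, 4, 3, 8)     # (W, H, C, N) with C = 3 channels
N = ndims(x)

# Reduce over every dimension except the channel dimension N - 1
reduce_dims = Tuple(d for d in 1:N if d != N - 1)   # (1, 2, 4)

ϵ = 1f-5
μ = mean(x; dims=reduce_dims)                        # one mean per channel
σ2 = var(x; dims=reduce_dims, corrected=false)       # one variance per channel
y = (x .- μ) ./ sqrt.(σ2 .+ ϵ)
```

After this step each channel of `y` has approximately zero mean and unit variance across the batch and spatial dimensions.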

Arguments

  • chs: Size of the channel dimension in your data. Given an array with N dimensions, call the N-1th the channel dimension. For a batch of feature vectors this is just the data dimension, for WHCN images it's the usual channel dimension.

  • activation: After normalization, the elementwise activation function activation is applied.

Keyword Arguments

  • If track_stats=true, accumulates mean and variance statistics in training phase that will be used to renormalize the input in test phase.

  • epsilon: a value added to the denominator for numerical stability

  • momentum: the value used for the running_mean and running_var computation

  • allow_fast_activation: If true, then certain activations can be approximated with a faster version. The new activation function will be given by NNlib.fast_act(activation)

  • If affine=true, it also applies a shift and a rescale to the input through learnable per-channel bias and scale parameters.

    • init_bias: Controls how the bias is initialized

    • init_scale: Controls how the scale is initialized

Inputs

  • x: Array where size(x, N - 1) = chs and ndims(x) > 2

Returns

  • y: Normalized Array

  • Updated model state

Parameters

  • affine=true

    • bias: Bias of shape (chs,)

    • scale: Scale of shape (chs,)

  • affine=false - Empty NamedTuple()

States

  • Statistics if track_stats=true

    • running_mean: Running mean of shape (chs,)

    • running_var: Running variance of shape (chs,)

  • Statistics if track_stats=false

    • running_mean: nothing

    • running_var: nothing

  • training: Used to check if training/inference mode

Use Lux.testmode during inference.

Example

julia
julia> Chain(Dense(784 => 64), BatchNorm(64, relu), Dense(64 => 10), BatchNorm(10))
Chain(
    layer_1 = Dense(784 => 64),         # 50_240 parameters
    layer_2 = BatchNorm(64, relu, affine=true, track_stats=true),  # 128 parameters, plus 129
    layer_3 = Dense(64 => 10),          # 650 parameters
    layer_4 = BatchNorm(10, affine=true, track_stats=true),  # 20 parameters, plus 21
)         # Total: 51_038 parameters,
          #        plus 150 states.

Warning

Passing a batch size of 1 during training will result in NaNs.

See also GroupNorm, InstanceNorm, LayerNorm, WeightNorm

source


# Lux.GroupNormType.
julia
GroupNorm(chs::Integer, groups::Integer, activation=identity; init_bias=zeros32,
          init_scale=ones32, affine=true, epsilon=1f-5,
          allow_fast_activation::Bool=true)

Group Normalization layer.

Arguments

  • chs: Size of the channel dimension in your data. Given an array with N dimensions, call the N-1th the channel dimension. For a batch of feature vectors this is just the data dimension, for WHCN images it's the usual channel dimension.

  • groups is the number of groups along which the statistics are computed. The number of channels must be an integer multiple of the number of groups.

  • activation: After normalization, the elementwise activation function activation is applied.
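
The role of groups can be sketched in plain Julia by splitting the channel dimension into (channels per group, groups) and normalising each group separately. The contiguous grouping of channels is an assumption of this sketch, and the affine step is omitted.

```julia
using Statistics

W, H, C, B = 2, 2, 6, 3
G = 3                                   # number of groups; C must be a multiple of G
x = rand(Float32, W, H, C, B)

# Split channels into G groups of C ÷ G channels each
xg = reshape(x, W, H, C ÷ G, G, B)

# Statistics per group and per sample
μ = mean(xg; dims=(1, 2, 3))
σ2 = var(xg; dims=(1, 2, 3), corrected=false)

y = reshape((xg .- μ) ./ sqrt.(σ2 .+ 1f-5), W, H, C, B)
```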

Keyword Arguments

  • epsilon: a value added to the denominator for numerical stability

  • allow_fast_activation: If true, then certain activations can be approximated with a faster version. The new activation function will be given by NNlib.fast_act(activation)

  • If affine=true, it also applies a shift and a rescale to the input through learnable per-channel bias and scale parameters.

    • init_bias: Controls how the bias is initialized

    • init_scale: Controls how the scale is initialized

Inputs

  • x: Array where size(x, N - 1) = chs and ndims(x) > 2

Returns

  • y: Normalized Array

  • Updated model state

Parameters

  • affine=true

    • bias: Bias of shape (chs,)

    • scale: Scale of shape (chs,)

  • affine=false - Empty NamedTuple()

States

  • training: Used to check if training/inference mode

Use Lux.testmode during inference.

Example

julia
julia> Chain(Dense(784 => 64), GroupNorm(64, 4, relu), Dense(64 => 10), GroupNorm(10, 5))
Chain(
    layer_1 = Dense(784 => 64),         # 50_240 parameters
    layer_2 = GroupNorm(64, 4, relu, affine=true),  # 128 parameters
    layer_3 = Dense(64 => 10),          # 650 parameters
    layer_4 = GroupNorm(10, 5, affine=true),  # 20 parameters
)         # Total: 51_038 parameters,
          #        plus 0 states.

See also BatchNorm, InstanceNorm, LayerNorm, WeightNorm

source


# Lux.InstanceNormType.
julia
InstanceNorm(chs::Integer, activation=identity; init_bias=zeros32, init_scale=ones32,
             affine=true, epsilon=1f-5, allow_fast_activation::Bool=true)

Instance Normalization. For details see [1].

Instance Normalization computes the mean and variance for each $D_1 \times \dots \times D_{N-2} \times 1 \times 1$ input slice and normalises the input accordingly.
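
In plain Julia, the slice described above corresponds to reducing over the leading (spatial) dimensions only, so every channel of every sample gets its own statistics. A sketch without the affine parameters:

```julia
using Statistics

x = rand(Float32, 4, 4, 3, 2)           # (W, H, C, N)
spatial = Tuple(1:(ndims(x) - 2))       # (1, 2)

μ = mean(x; dims=spatial)               # size (1, 1, C, N)
σ2 = var(x; dims=spatial, corrected=false)
y = (x .- μ) ./ sqrt.(σ2 .+ 1f-5)
```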

Arguments

  • chs: Size of the channel dimension in your data. Given an array with N dimensions, call the N-1th the channel dimension. For a batch of feature vectors this is just the data dimension, for WHCN images it's the usual channel dimension.

  • activation: After normalization, the elementwise activation function activation is applied.

Keyword Arguments

  • epsilon: a value added to the denominator for numerical stability

  • allow_fast_activation: If true, then certain activations can be approximated with a faster version. The new activation function will be given by NNlib.fast_act(activation)

  • If affine=true, it also applies a shift and a rescale to the input through learnable per-channel bias and scale parameters.

    • init_bias: Controls how the bias is initialized

    • init_scale: Controls how the scale is initialized

Inputs

  • x: Array where size(x, N - 1) = chs and ndims(x) > 2

Returns

  • y: Normalized Array

  • Updated model state

Parameters

  • affine=true

    • bias: Bias of shape (chs,)

    • scale: Scale of shape (chs,)

  • affine=false - Empty NamedTuple()

States

  • training: Used to check if training/inference mode

Use Lux.testmode during inference.

Example

julia
julia> Chain(Dense(784 => 64), InstanceNorm(64, relu), Dense(64 => 10),
           InstanceNorm(10, relu))
Chain(
    layer_1 = Dense(784 => 64),         # 50_240 parameters
    layer_2 = InstanceNorm(64, relu, affine=true),  # 128 parameters, plus 1
    layer_3 = Dense(64 => 10),          # 650 parameters
    layer_4 = InstanceNorm(10, relu, affine=true),  # 20 parameters, plus 1
)         # Total: 51_038 parameters,
          #        plus 2 states.

References

[1] Ulyanov, Dmitry, Andrea Vedaldi, and Victor Lempitsky. "Instance normalization: The missing ingredient for fast stylization." arXiv preprint arXiv:1607.08022 (2016).

See also BatchNorm, GroupNorm, LayerNorm, WeightNorm

source


# Lux.LayerNormType.
julia
LayerNorm(shape::NTuple{N, Int}, activation=identity; epsilon=1f-5, dims=Colon(),
          affine::Bool=true, init_bias=zeros32, init_scale=ones32)

Computes mean and standard deviation over the whole input array, and uses these to normalize the whole array. Optionally applies an elementwise affine transformation afterwards.

Given an input array x, this layer computes

$$y = \frac{x - \mathbb{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta$$

where γ & β are trainable parameters if affine=true.
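
The computation can be sketched in plain Julia for a single array, taking the mean and variance over the whole array as described above. The helper `layernorm_sketch` is hypothetical and ignores the dims keyword and batch handling.

```julia
using Statistics

# y = (x - E[x]) / sqrt(Var[x] + ϵ) * γ + β, with statistics over the whole array
layernorm_sketch(x, γ, β; ϵ=1f-5) =
    (x .- mean(x)) ./ sqrt(var(x; corrected=false) + ϵ) .* γ .+ β

x = Float32[1 2; 3 4]
γ = ones(Float32, 2, 2)     # scale, the identity transformation here
β = zeros(Float32, 2, 2)    # bias

y = layernorm_sketch(x, γ, β)
```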

Warning

As of v0.5.0, the doc used to say affine::Bool=false, but the code actually had affine::Bool=true as the default. Now the doc reflects the code, so please check whether your assumptions about the default (if made) were invalid.

Arguments

  • shape: Broadcastable shape of input array excluding the batch dimension.

  • activation: After normalization, the elementwise activation function activation is applied.

Keyword Arguments

  • allow_fast_activation: If true, then certain activations can be approximated with a faster version. The new activation function will be given by NNlib.fast_act(activation)

  • epsilon: a value added to the denominator for numerical stability.

  • dims: Dimensions to normalize the array over.

  • If affine=true, it also applies a shift and a rescale to the input through learnable per-channel bias and scale parameters.

    • init_bias: Controls how the bias is initialized

    • init_scale: Controls how the scale is initialized

Inputs

  • x: AbstractArray

Returns

  • y: Normalized Array

  • Empty NamedTuple()

Parameters

  • affine=false: Empty NamedTuple()

  • affine=true

    • bias: Bias of shape (shape..., 1)

    • scale: Scale of shape (shape..., 1)

source


# Lux.WeightNormType.
julia
WeightNorm(layer::AbstractExplicitLayer, which_params::NTuple{N,Symbol},
           dims::Union{Tuple,Nothing}=nothing)

Applies weight normalization to a parameter in the given layer.

$$w = g \frac{v}{\|v\|}$$

Weight normalization is a reparameterization that decouples the magnitude of a weight tensor from its direction. This updates the parameters in which_params (e.g. weight) using two parameters: one specifying the magnitude (e.g. weight_g) and one specifying the direction (e.g. weight_v).
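
The reparameterization can be sketched in plain Julia. The values are hypothetical. The first form uses a norm over the entire array; the second uses one norm per row to illustrate what restricting the norm to a dimension means.

```julia
using LinearAlgebra

v = [3.0 0.0; 0.0 4.0]      # direction parameter
g = 2.0                     # magnitude parameter

# Norm over the entire array: w has norm g and the direction of v
w = g .* v ./ norm(v)

# Norm along dimension 2: one magnitude per row
g_rows = [1.0, 2.0]
row_norms = sqrt.(sum(abs2, v; dims=2))
w_rows = g_rows .* v ./ row_norms
```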

Arguments

  • layer whose parameters are being reparameterized

  • which_params: parameter names for the parameters being reparameterized

  • By default, a norm over the entire array is computed. Pass dims to modify the dimension.

Inputs

  • x: Should be of valid type for input to layer

Returns

  • Output from layer

  • Updated model state of layer

Parameters

  • normalized: Parameters of layer that are being normalized

  • unnormalized: Parameters of layer that are not being normalized

States

  • Same as that of layer

source


Upsampling

# Lux.PixelShuffleFunction.
julia
PixelShuffle(r::Int)

Pixel shuffling layer with upscale factor r. Usually used for generating higher resolution images while upscaling them.

See NNlib.pixel_shuffle for more details.

PixelShuffle is not a Layer, rather it returns a WrappedFunction with the function set to Base.Fix2(pixel_shuffle, r)

Arguments

  • r: Upscale factor

Inputs

  • x: For 4D-arrays representing N images, the operation converts input size(x) == (W, H, r² x C, N) to output of size (r x W, r x H, C, N). For D-dimensional data, it expects ndims(x) == D + 2 with channel and batch dimensions, and divides the number of channels by rᴰ.

Returns

  • Output of size (r x W, r x H, C, N) for 4D-arrays, and (r x W, r x H, ..., C, N) for D-dimensional data, where D = ndims(x) - 2
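
The size rule can be written as a small plain-Julia helper. The function `pixel_shuffle_size` is hypothetical and only computes shapes; it does not rearrange any data.

```julia
# Output size of a pixel shuffle with upscale factor r for D = N - 2 spatial dimensions
function pixel_shuffle_size(sz::NTuple{N, Int}, r::Int) where {N}
    D = N - 2
    spatial, C, B = sz[1:D], sz[N - 1], sz[N]
    @assert C % r^D == 0 "number of channels must be divisible by r^D"
    return ((r .* spatial)..., C ÷ r^D, B)
end

size_4d = pixel_shuffle_size((8, 8, 12, 4), 2)      # images: channels divided by r^2
size_5d = pixel_shuffle_size((4, 4, 4, 16, 1), 2)   # volumes: channels divided by r^3
```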

source


# Lux.UpsampleType.
julia
Upsample(mode = :nearest; [scale, size]) 
Upsample(scale, mode = :nearest)

Upsampling Layer.

Layer Construction

Option 1

  • mode: Set to :nearest, :linear, :bilinear or :trilinear

Exactly one of two keywords must be specified:

  • If scale is a number, this applies to all but the last two dimensions (channel and batch) of the input. It may also be a tuple, to control dimensions individually.

  • Alternatively, keyword size accepts a tuple, to directly specify the leading dimensions of the output.

Option 2

  • If scale is a number, this applies to all but the last two dimensions (channel and batch) of the input. It may also be a tuple, to control dimensions individually.

  • mode: Set to :nearest, :bilinear or :trilinear

The currently supported upsampling modes and the corresponding NNlib methods are:

  • :nearest -> NNlib.upsample_nearest

  • :bilinear -> NNlib.upsample_bilinear

  • :trilinear -> NNlib.upsample_trilinear

Inputs

  • x: For the input dimensions look into the documentation for the corresponding NNlib function
    • As a rule of thumb, :nearest should work with arrays of arbitrary dimensions

    • :bilinear works with 4D Arrays

    • :trilinear works with 5D Arrays

Returns

  • Upsampled Input of size size or of size (I_1 x scale[1], ..., I_N x scale[N], C, N)

  • Empty NamedTuple()
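
For integer scale factors, nearest-neighbour upsampling amounts to repeating each element along the spatial dimensions. The plain-Julia sketch below uses Base's repeat as a stand-in to show the output size; it is not NNlib's implementation.

```julia
x = reshape(Float32.(1:4), 2, 2, 1, 1)   # (W, H, C, N) = (2, 2, 1, 1)
scale = (2, 2)

# Repeat every element scale[d] times along spatial dimension d;
# channel and batch dimensions are left untouched
y = repeat(x; inner=(scale..., 1, 1))
```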

source