Boltz
Accelerate ⚡ your ML research using pre-built Deep Learning Models with Lux.
Index
Boltz.ClassTokens
Boltz.MultiHeadAttention
Boltz.ViPosEmbedding
Boltz._fast_chunk
Boltz._flatten_spatial
Boltz._seconddimmean
Boltz._vgg_block
Boltz._vgg_classifier_layers
Boltz._vgg_convolutional_layers
Boltz.transformer_encoder
Boltz.vgg
Computer Vision Models
Classification Models: Native Lux Models
| MODEL NAME | FUNCTION | NAME | PRETRAINED | TOP 1 ACCURACY (%) | TOP 5 ACCURACY (%) |
| --- | --- | --- | --- | --- | --- |
| VGG | `vgg` | `:vgg11` | ✅ | 67.35 | 87.91 |
| VGG | `vgg` | `:vgg13` | ✅ | 68.40 | 88.48 |
| VGG | `vgg` | `:vgg16` | ✅ | 70.24 | 89.80 |
| VGG | `vgg` | `:vgg19` | ✅ | 71.09 | 90.27 |
| VGG | `vgg` | `:vgg11_bn` | ✅ | 69.09 | 88.94 |
| VGG | `vgg` | `:vgg13_bn` | ✅ | 69.66 | 89.49 |
| VGG | `vgg` | `:vgg16_bn` | ✅ | 72.11 | 91.02 |
| VGG | `vgg` | `:vgg19_bn` | ✅ | 72.95 | 91.32 |
| Vision Transformer | `vision_transformer` | `:tiny` | 🚫 | | |
| Vision Transformer | `vision_transformer` | `:small` | 🚫 | | |
| Vision Transformer | `vision_transformer` | `:base` | 🚫 | | |
| Vision Transformer | `vision_transformer` | `:large` | 🚫 | | |
| Vision Transformer | `vision_transformer` | `:huge` | 🚫 | | |
| Vision Transformer | `vision_transformer` | `:giant` | 🚫 | | |
| Vision Transformer | `vision_transformer` | `:gigantic` | 🚫 | | |
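The native models above are created by calling the listed function with the listed name. A minimal sketch, assuming the usual Lux convention that the constructor returns the model together with its parameters and states (check the API reference for the exact return values):

```julia
using Boltz, Lux, Random

# Sketch: build a pretrained VGG16 from the table above. The three-value
# return (model, parameters, states) is an assumption based on the usual
# Lux setup; verify against the Boltz API reference.
model, ps, st = vgg(:vgg16; pretrained=true)

# Forward pass on a dummy 224×224 RGB image (WHCN layout, batch of 1).
x = randn(Random.default_rng(), Float32, 224, 224, 3, 1)
y, _ = Lux.apply(model, x, ps, st)
```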
Building Blocks
ClassTokens(dim; init=Lux.zeros32)
Appends class tokens to an input with embedding dimension dim for use in many vision transformer models.
MultiHeadAttention(in_planes::Int, number_heads::Int; qkv_bias::Bool=false,
attention_dropout_rate::T=0.0f0,
projection_dropout_rate::T=0.0f0) where {T}
Multi-head self-attention layer
ViPosEmbedding(embedsize, npatches;
init = (rng, dims...) -> randn(rng, Float32, dims...))
Positional embedding layer used by many vision transformer-like models.
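Assuming `ClassTokens` and `ViPosEmbedding` behave as ordinary Lux layers, the front end of a ViT-style model might combine them as follows. This is a sketch, not the exact Boltz internals, and the hyperparameter values are illustrative assumptions:

```julia
using Boltz, Lux

embedplanes = 768            # embedding dimension (ViT-Base-like, assumed)
npatches    = (224 ÷ 16)^2   # 196 patches for 16×16 patches of a 224×224 image

# Append a learnable class token, then add positional embeddings for the
# patches plus the class token.
frontend = Lux.Chain(
    Boltz.ClassTokens(embedplanes),
    Boltz.ViPosEmbedding(embedplanes, npatches + 1),
)
```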
transformer_encoder(in_planes, depth, number_heads; mlp_ratio = 4.0f0, dropout = 0.0f0)
Transformer as used in the base ViT architecture. (reference).
Arguments

- `in_planes`: number of input channels
- `depth`: number of attention blocks
- `number_heads`: number of attention heads
- `mlp_ratio`: ratio of MLP layers to the number of input channels
- `dropout_rate`: dropout rate
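With the documented arguments, a ViT-Base-like encoder stack might be sketched as (the specific sizes here are assumptions, chosen to match the base ViT configuration, not Boltz defaults):

```julia
using Boltz

# 768 input channels, 12 attention blocks, 12 heads per block, with the
# keyword defaults from the signature above.
encoder = Boltz.transformer_encoder(768, 12, 12; mlp_ratio=4.0f0, dropout=0.0f0)
```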
vgg(imsize; config, inchannels, batchnorm = false, nclasses, fcsize, dropout)
Create a VGG model (reference).
Arguments

- `imsize`: input image width and height as a tuple
- `config`: the configuration for the convolution layers
- `inchannels`: number of input channels
- `batchnorm`: set to `true` to use batch normalization after each convolution
- `nclasses`: number of output classes
- `fcsize`: intermediate fully connected layer size (see `Metalhead._vgg_classifier_layers`)
- `dropout`: dropout level between fully connected layers
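As a sketch of the keyword interface (the `config` value below mirrors a VGG11-style layout in the `(output_channels, num_convolutions)` format described under `_vgg_convolutional_layers`, and is an illustrative assumption):

```julia
using Boltz

model = vgg((224, 224);
            config     = [(64, 1), (128, 1), (256, 2), (512, 2), (512, 2)],
            inchannels = 3,      # RGB input
            batchnorm  = false,
            nclasses   = 1000,   # ImageNet classes (assumed)
            fcsize     = 4096,
            dropout    = 0.5f0)
```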
Non-Public API
_fast_chunk(x::AbstractArray, ::Val{n}, ::Val{dim})
Type-stable and faster version of `MLUtils.chunk`.
_flatten_spatial(x::AbstractArray{T, 4})
Flattens the first 2 dimensions of `x`, and permutes the remaining dimensions to `(2, 1, 3)`.
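The described behaviour can be reproduced with plain `reshape`/`permutedims`; a sketch of the equivalent operation on a WHCN array (not the internal implementation):

```julia
# Flatten the two spatial dimensions, then move channels in front of them.
x    = rand(Float32, 4, 4, 3, 2)               # (width, height, channels, batch)
flat = reshape(x, :, size(x, 3), size(x, 4))   # (16, 3, 2): spatial dims merged
out  = permutedims(flat, (2, 1, 3))            # (3, 16, 2): channels leading
```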
_vgg_block(input_filters, output_filters, depth, batchnorm)
A VGG block of convolution layers (reference).
Arguments

- `input_filters`: number of input feature maps
- `output_filters`: number of output feature maps
- `depth`: number of convolution/convolution + batch norm layers
- `batchnorm`: set to `true` to include batch normalization after each convolution
_vgg_classifier_layers(imsize, nclasses, fcsize, dropout)
Create VGG classifier (fully connected) layers (reference).
Arguments

- `imsize`: tuple `(width, height, channels)` indicating the size after the convolution layers (see `Metalhead._vgg_convolutional_layers`)
- `nclasses`: number of output classes
- `fcsize`: input and output size of the intermediate fully connected layer
- `dropout`: the dropout level between each fully connected layer
_vgg_convolutional_layers(config, batchnorm, inchannels)
Create VGG convolution layers (reference).
Arguments

- `config`: vector of tuples `(output_channels, num_convolutions)` for each block (see `Metalhead._vgg_block`)
- `batchnorm`: set to `true` to include batch normalization after each convolution
- `inchannels`: number of input channels
Classification Models: Imported from Metalhead.jl
Tip

You need to load `Flux` and `Metalhead` before using these models.
| MODEL NAME | FUNCTION | NAME | PRETRAINED | TOP 1 ACCURACY (%) | TOP 5 ACCURACY (%) |
| --- | --- | --- | --- | --- | --- |
| AlexNet | `alexnet` | `:alexnet` | ✅ | 54.48 | 77.72 |
| ResNet | `resnet` | `:resnet18` | 🚫 | 68.08 | 88.44 |
| ResNet | `resnet` | `:resnet34` | 🚫 | 72.13 | 90.91 |
| ResNet | `resnet` | `:resnet50` | 🚫 | 74.55 | 92.36 |
| ResNet | `resnet` | `:resnet101` | 🚫 | 74.81 | 92.36 |
| ResNet | `resnet` | `:resnet152` | 🚫 | 77.63 | 93.84 |
| ConvMixer | `convmixer` | `:small` | 🚫 | | |
| ConvMixer | `convmixer` | `:base` | 🚫 | | |
| ConvMixer | `convmixer` | `:large` | 🚫 | | |
| DenseNet | `densenet` | `:densenet121` | 🚫 | | |
| DenseNet | `densenet` | `:densenet161` | 🚫 | | |
| DenseNet | `densenet` | `:densenet169` | 🚫 | | |
| DenseNet | `densenet` | `:densenet201` | 🚫 | | |
| GoogLeNet | `googlenet` | `:googlenet` | 🚫 | | |
| MobileNet | `mobilenet` | `:mobilenet_v1` | 🚫 | | |
| MobileNet | `mobilenet` | `:mobilenet_v2` | 🚫 | | |
| MobileNet | `mobilenet` | `:mobilenet_v3_small` | 🚫 | | |
| MobileNet | `mobilenet` | `:mobilenet_v3_large` | 🚫 | | |
| ResNeXt | `resnext` | `:resnext50` | 🚫 | | |
| ResNeXt | `resnext` | `:resnext101` | 🚫 | | |
| ResNeXt | `resnext` | `:resnext152` | 🚫 | | |
These models can be created using `<FUNCTION>(<NAME>; pretrained = <PRETRAINED>)`.
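For example, following that convention for the pretrained AlexNet row from the table (the three-value return is an assumption based on the usual Lux convention; check the API reference):

```julia
using Flux, Metalhead   # must be loaded before the imported models are available
using Boltz

# Sketch: create the pretrained AlexNet listed above.
model, ps, st = alexnet(:alexnet; pretrained=true)
```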
Preprocessing
All the pretrained models require that the images be normalized with the parameters `mean = [0.485f0, 0.456f0, 0.406f0]` and `std = [0.229f0, 0.224f0, 0.225f0]`.
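A sketch of that normalization for a WHCN image batch, broadcasting the per-channel statistics over width, height, and batch:

```julia
# Per-channel ImageNet statistics from the text, shaped for WHCN broadcasting.
mean_ = reshape([0.485f0, 0.456f0, 0.406f0], 1, 1, 3, 1)
std_  = reshape([0.229f0, 0.224f0, 0.225f0], 1, 1, 3, 1)

x      = rand(Float32, 224, 224, 3, 1)   # image with values scaled to [0, 1]
x_norm = (x .- mean_) ./ std_
```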