Boltz
Accelerate ⚡ your ML research using pre-built Deep Learning Models with Lux.
Index
- `Boltz.ClassTokens`
- `Boltz.MultiHeadAttention`
- `Boltz.ViPosEmbedding`
- `Boltz._fast_chunk`
- `Boltz._flatten_spatial`
- `Boltz._seconddimmean`
- `Boltz._vgg_block`
- `Boltz._vgg_classifier_layers`
- `Boltz._vgg_convolutional_layers`
- `Boltz.transformer_encoder`
- `Boltz.vgg`
Computer Vision Models
Classification Models: Native Lux Models
| MODEL NAME | FUNCTION | NAME | PRETRAINED | TOP 1 ACCURACY (%) | TOP 5 ACCURACY (%) |
|---|---|---|---|---|---|
| VGG | vgg | :vgg11 | ✅ | 67.35 | 87.91 |
| VGG | vgg | :vgg13 | ✅ | 68.40 | 88.48 |
| VGG | vgg | :vgg16 | ✅ | 70.24 | 89.80 |
| VGG | vgg | :vgg19 | ✅ | 71.09 | 90.27 |
| VGG | vgg | :vgg11_bn | ✅ | 69.09 | 88.94 |
| VGG | vgg | :vgg13_bn | ✅ | 69.66 | 89.49 |
| VGG | vgg | :vgg16_bn | ✅ | 72.11 | 91.02 |
| VGG | vgg | :vgg19_bn | ✅ | 72.95 | 91.32 |
| Vision Transformer | vision_transformer | :tiny | 🚫 | | |
| Vision Transformer | vision_transformer | :small | 🚫 | | |
| Vision Transformer | vision_transformer | :base | 🚫 | | |
| Vision Transformer | vision_transformer | :large | 🚫 | | |
| Vision Transformer | vision_transformer | :huge | 🚫 | | |
| Vision Transformer | vision_transformer | :giant | 🚫 | | |
| Vision Transformer | vision_transformer | :gigantic | 🚫 | | |
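For example, a pretrained VGG can be instantiated straight from this table. This is a minimal sketch, assuming the `<FUNCTION>(<NAME>; pretrained = ...)` convention described below for the Metalhead imports also applies to the native models, and that it returns the usual Lux `(model, ps, st)` triple:

```julia
using Boltz, Lux, Random

# Hedged sketch: assumes `vgg(:vgg16; pretrained = true)` returns the
# Lux (model, parameters, states) triple.
model, ps, st = vgg(:vgg16; pretrained = true)

# Inference on a single 224×224 RGB image in WHCN layout.
rng = Random.default_rng()
x = randn(rng, Float32, 224, 224, 3, 1)
y, _ = Lux.apply(model, x, ps, st)
size(y)  # expected (1000, 1): ImageNet class scores
```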
Building Blocks
`ClassTokens(dim; init=Lux.zeros32)`

Appends class tokens to an input with embedding dimension `dim` for use in many vision transformer models.
`MultiHeadAttention(in_planes::Int, number_heads::Int; qkv_bias::Bool=false, attention_dropout_rate::T=0.0f0, projection_dropout_rate::T=0.0f0) where {T}`

Multi-head self-attention layer.
`ViPosEmbedding(embedsize, npatches; init = (rng, dims...) -> randn(rng, Float32, dims...))`

Positional embedding layer used by many vision-transformer-like models.
`transformer_encoder(in_planes, depth, number_heads; mlp_ratio = 4.0f0, dropout = 0.0f0)`

Transformer encoder as used in the base ViT architecture (reference).
Arguments
- `in_planes`: number of input channels
- `depth`: number of attention blocks
- `number_heads`: number of attention heads
- `mlp_ratio`: ratio of MLP layers to the number of input channels
- `dropout`: dropout rate
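As a concrete illustration, the building blocks above can be chained into a small ViT-style encoder. This is a hedged sketch, assuming each block follows the standard Lux layer interface and consumes `(embed_dim, num_tokens, batch)` arrays; the dimensions are purely illustrative:

```julia
using Boltz, Lux, Random

embed_dim, npatches = 64, 16
encoder = Lux.Chain(Boltz.ClassTokens(embed_dim),                  # prepend a learnable class token
                    Boltz.ViPosEmbedding(embed_dim, npatches + 1), # add positional embeddings
                    Boltz.transformer_encoder(embed_dim, 2, 4))    # depth = 2, number_heads = 4

rng = Random.default_rng()
ps, st = Lux.setup(rng, encoder)
x = randn(rng, Float32, embed_dim, npatches, 1)  # (dim, patches, batch)
y, _ = Lux.apply(encoder, x, ps, st)
size(y)  # expected (64, 17, 1): one extra token from ClassTokens
```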
`vgg(imsize; config, inchannels, batchnorm = false, nclasses, fcsize, dropout)`

Create a VGG model (reference).
Arguments
- `imsize`: input image width and height as a tuple
- `config`: the configuration for the convolution layers
- `inchannels`: number of input channels
- `batchnorm`: set to `true` to use batch normalization after each convolution
- `nclasses`: number of output classes
- `fcsize`: intermediate fully connected layer size (see `Boltz._vgg_classifier_layers`)
- `dropout`: dropout level between fully connected layers
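A hedged sketch of calling this constructor directly; the `config` below spells out the classic VGG-11 convolution layout in the documented `(output_channels, num_convolutions)` form, and the keyword values are illustrative:

```julia
using Boltz

# VGG-11's convolution layout written in the documented
# `(output_channels, num_convolutions)` form; keyword values are illustrative.
config = [(64, 1), (128, 1), (256, 2), (512, 2), (512, 2)]
model = Boltz.vgg((224, 224); config, inchannels = 3, batchnorm = true,
                  nclasses = 10, fcsize = 4096, dropout = 0.5f0)
```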
Non-Public API
`_fast_chunk(x::AbstractArray, ::Val{n}, ::Val{dim})`

Type-stable and faster version of `MLUtils.chunk`.
`_flatten_spatial(x::AbstractArray{T, 4})`

Flattens the first two dimensions of `x` and permutes the remaining dimensions to `(2, 1, 3)`.
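That description corresponds to the following shape transformation; this is an illustrative re-derivation in plain Julia, not the library's implementation:

```julia
# A (W, H, C, N) array becomes (C, W*H, N): flatten the spatial
# dimensions, then move channels first.
x = rand(Float32, 4, 4, 8, 2)                              # (W, H, C, N)
flat = permutedims(reshape(x, :, size(x, 3), size(x, 4)),  # (W*H, C, N)
                   (2, 1, 3))                              # (C, W*H, N)
size(flat)  # (8, 16, 2)
```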
`_vgg_block(input_filters, output_filters, depth, batchnorm)`

A VGG block of convolution layers (reference).
Arguments
- `input_filters`: number of input feature maps
- `output_filters`: number of output feature maps
- `depth`: number of convolution / convolution + batch norm layers
- `batchnorm`: set to `true` to include batch normalization after each convolution
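Per the argument list above, such a block could be written as follows. This is a hedged re-creation for illustration (3×3 kernels and ReLU are assumptions), not the library's actual implementation:

```julia
using Lux

# Hedged re-creation of the block described above.
function vgg_block_sketch(input_filters, output_filters, depth, batchnorm)
    layers = []
    for i in 1:depth
        in_ch = i == 1 ? input_filters : output_filters
        if batchnorm
            # convolution, then batch norm carrying the ReLU activation
            push!(layers, Conv((3, 3), in_ch => output_filters; pad = 1))
            push!(layers, BatchNorm(output_filters, relu))
        else
            push!(layers, Conv((3, 3), in_ch => output_filters, relu; pad = 1))
        end
    end
    return Chain(layers...)
end
```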
`_vgg_classifier_layers(imsize, nclasses, fcsize, dropout)`

Create VGG classifier (fully connected) layers (reference).
Arguments
- `imsize`: tuple `(width, height, channels)` indicating the size after the convolution layers (see `Boltz._vgg_convolutional_layers`)
- `nclasses`: number of output classes
- `fcsize`: input and output size of the intermediate fully connected layer
- `dropout`: the dropout level between each fully connected layer
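The description maps onto a flatten-then-dense stack. A hedged sketch follows; the number of dense layers and the ReLU activations are assumptions mirroring common VGG heads, not the documented implementation:

```julia
using Lux

# Hedged sketch of the classifier stack described above.
function vgg_classifier_sketch(imsize, nclasses, fcsize, dropout)
    return Chain(FlattenLayer(),
                 Dense(prod(imsize) => fcsize, relu), Dropout(dropout),
                 Dense(fcsize => fcsize, relu), Dropout(dropout),
                 Dense(fcsize => nclasses))
end
```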
`_vgg_convolutional_layers(config, batchnorm, inchannels)`

Create VGG convolution layers (reference).
Arguments
- `config`: vector of tuples `(output_channels, num_convolutions)` for each block (see `Boltz._vgg_block`)
- `batchnorm`: set to `true` to include batch normalization after each convolution
- `inchannels`: number of input channels
Classification Models: Imported from Metalhead.jl
Tip
You need to load Flux and Metalhead before using these models.
| MODEL NAME | FUNCTION | NAME | PRETRAINED | TOP 1 ACCURACY (%) | TOP 5 ACCURACY (%) |
|---|---|---|---|---|---|
| AlexNet | alexnet | :alexnet | ✅ | 54.48 | 77.72 |
| ResNet | resnet | :resnet18 | 🚫 | 68.08 | 88.44 |
| ResNet | resnet | :resnet34 | 🚫 | 72.13 | 90.91 |
| ResNet | resnet | :resnet50 | 🚫 | 74.55 | 92.36 |
| ResNet | resnet | :resnet101 | 🚫 | 74.81 | 92.36 |
| ResNet | resnet | :resnet152 | 🚫 | 77.63 | 93.84 |
| ConvMixer | convmixer | :small | 🚫 | | |
| ConvMixer | convmixer | :base | 🚫 | | |
| ConvMixer | convmixer | :large | 🚫 | | |
| DenseNet | densenet | :densenet121 | 🚫 | | |
| DenseNet | densenet | :densenet161 | 🚫 | | |
| DenseNet | densenet | :densenet169 | 🚫 | | |
| DenseNet | densenet | :densenet201 | 🚫 | | |
| GoogleNet | googlenet | :googlenet | 🚫 | | |
| MobileNet | mobilenet | :mobilenet_v1 | 🚫 | | |
| MobileNet | mobilenet | :mobilenet_v2 | 🚫 | | |
| MobileNet | mobilenet | :mobilenet_v3_small | 🚫 | | |
| MobileNet | mobilenet | :mobilenet_v3_large | 🚫 | | |
| ResNeXt | resnext | :resnext50 | 🚫 | | |
| ResNeXt | resnext | :resnext101 | 🚫 | | |
| ResNeXt | resnext | :resnext152 | 🚫 | | |
These models can be created using `<FUNCTION>(<NAME>; pretrained = <PRETRAINED>)`.
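For example, a minimal sketch assuming the pattern above and the Lux `(model, ps, st)` return convention:

```julia
using Flux, Metalhead  # load these first, per the tip above
using Boltz, Lux, Random

# AlexNet has pretrained weights available (✅ in the table).
model, ps, st = alexnet(:alexnet; pretrained = true)
x = randn(Random.default_rng(), Float32, 224, 224, 3, 1)
y, _ = Lux.apply(model, x, ps, st)
```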
Preprocessing
All the pretrained models require that the images be normalized with the parameters `mean = [0.485f0, 0.456f0, 0.406f0]` and `std = [0.229f0, 0.224f0, 0.225f0]`.
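A minimal sketch of that normalization for a batch of images in WHCN layout (the helper name is hypothetical):

```julia
# Channel-wise normalization for Float32 images in (W, H, C, N) layout,
# using the mean/std stated above.
const IMAGENET_MEAN = reshape([0.485f0, 0.456f0, 0.406f0], 1, 1, 3, 1)
const IMAGENET_STD  = reshape([0.229f0, 0.224f0, 0.225f0], 1, 1, 3, 1)

normalize_imagenet(x::AbstractArray{Float32, 4}) = (x .- IMAGENET_MEAN) ./ IMAGENET_STD
```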