Computer Vision Models (Vision API)
Native Lux Models
VGG(imsize; config, inchannels, batchnorm = false, nclasses, fcsize, dropout)
Create a VGG model [1].
Arguments
- `imsize`: input image width and height as a tuple
- `config`: the configuration for the convolution layers
- `inchannels`: number of input channels
- `batchnorm`: set to `true` to use batch normalization after each convolution
- `nclasses`: number of output classes
- `fcsize`: intermediate fully connected layer size
- `dropout`: dropout level between fully connected layers
References
[1] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
VGG(depth::Int; batchnorm=false, kwargs...)
Create a VGG model [1] with ImageNet Configuration.
Arguments
- `depth::Int`: the depth of the VGG model. Choices: {`11`, `13`, `16`, `19`}.
Keyword Arguments
- `batchnorm = false`: set to `true` to use batch normalization after each convolution.
- `pretrained::Bool=false`: If `true`, returns a pretrained model.
- `rng::Union{Nothing, AbstractRNG}=nothing`: Random number generator.
- `seed::Int=0`: Random seed.
- `initialized::Val{Bool}=Val(true)`: If `Val(true)`, returns `(model, parameters, states)`, otherwise just `model`.
References
[1] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
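As a minimal usage sketch of the depth-based constructor: the page does not name the package, so `using Boltz` and the `Vision.` prefix below are assumptions (suggested by the "Vision API" title), as are the 224 × 224 × 3 input shape and the `Lux.testmode` call. It only illustrates the two return forms described under `initialized`.

```julia
using Boltz, Lux, Random

# Assumption: the constructors are exposed as `Vision.VGG` by Boltz.jl.
# With the default `initialized = Val(true)` the call returns a 3-tuple.
model, ps, st = Vision.VGG(16; batchnorm=true)

# Lux models are called as model(input, parameters, states).
# Images are assumed to be laid out as width × height × channels × batch.
x = randn(Float32, 224, 224, 3, 1)
logits, _ = model(x, ps, Lux.testmode(st))

# With `initialized = Val(false)` only the model is returned,
# so parameters and states are created explicitly.
model_only = Vision.VGG(16; batchnorm=true, initialized=Val(false))
ps2, st2 = Lux.setup(Random.default_rng(), model_only)
```

The second form is convenient when the same architecture should be initialized several times with different random number generators.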
VisionTransformer(name::Symbol; kwargs...)
Creates a Vision Transformer model with the specified configuration.
Arguments
- `name::Symbol`: name of the Vision Transformer model to create. The following models are available:
Keyword Arguments
- `pretrained::Bool=false`: If `true`, returns a pretrained model.
- `rng::Union{Nothing, AbstractRNG}=nothing`: Random number generator.
- `seed::Int=0`: Random seed.
- `initialized::Val{Bool}=Val(true)`: If `Val(true)`, returns `(model, parameters, states)`, otherwise just `model`.
Imported from Metalhead.jl
Tip
You need to load `Flux` and `Metalhead` before using these models.
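A short sketch of the load order described in the tip. The `using Boltz` line and the `Vision.ResNet` path are assumptions, since the page does not state the package or module name; `Lux.parameterlength` is used only to show that parameters were returned.

```julia
using Boltz, Lux
using Flux, Metalhead  # load both before calling a Metalhead-backed constructor

# Assumption: the constructor is reachable as `Vision.ResNet`.
model, ps, st = Vision.ResNet(18)

# Total number of trainable parameters in the returned NamedTuple.
nparams = Lux.parameterlength(ps)
```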
AlexNet(; kwargs...)
Create an AlexNet model [1].
Keyword Arguments
- `pretrained::Bool=false`: If `true`, returns a pretrained model.
- `rng::Union{Nothing, AbstractRNG}=nothing`: Random number generator.
- `seed::Int=0`: Random seed.
- `initialized::Val{Bool}=Val(true)`: If `Val(true)`, returns `(model, parameters, states)`, otherwise just `model`.
References
[1] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems 25 (2012): 1097-1105.
ConvMixer(name::Symbol; kwargs...)
Create a ConvMixer model [1].
Arguments
- `name::Symbol`: The name of the ConvMixer model. Must be one of `:base`, `:small`, or `:large`.
Keyword Arguments
- `pretrained::Bool=false`: If `true`, returns a pretrained model.
- `rng::Union{Nothing, AbstractRNG}=nothing`: Random number generator.
- `seed::Int=0`: Random seed.
- `initialized::Val{Bool}=Val(true)`: If `Val(true)`, returns `(model, parameters, states)`, otherwise just `model`.
References
[1] Trockman, Asher, and J. Zico Kolter. "Patches are all you need?" arXiv preprint arXiv:2201.09792 (2022).
DenseNet(depth::Int; kwargs...)
Create a DenseNet model [1].
Arguments
- `depth::Int`: The depth of the DenseNet model. Must be one of 121, 161, 169, or 201.
Keyword Arguments
- `pretrained::Bool=false`: If `true`, returns a pretrained model.
- `rng::Union{Nothing, AbstractRNG}=nothing`: Random number generator.
- `seed::Int=0`: Random seed.
- `initialized::Val{Bool}=Val(true)`: If `Val(true)`, returns `(model, parameters, states)`, otherwise just `model`.
References
[1] Huang, Gao, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. "Densely connected convolutional networks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
GoogLeNet(; kwargs...)
Create a GoogLeNet model [1].
Keyword Arguments
- `pretrained::Bool=false`: If `true`, returns a pretrained model.
- `rng::Union{Nothing, AbstractRNG}=nothing`: Random number generator.
- `seed::Int=0`: Random seed.
- `initialized::Val{Bool}=Val(true)`: If `Val(true)`, returns `(model, parameters, states)`, otherwise just `model`.
References
[1] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. "Going deeper with convolutions." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
MobileNet(name::Symbol; kwargs...)
Create a MobileNet model [1, 2, 3].
Arguments
- `name::Symbol`: The name of the MobileNet model. Must be one of `:v1`, `:v2`, `:v3_small`, or `:v3_large`.
Keyword Arguments
- `pretrained::Bool=false`: If `true`, returns a pretrained model.
- `rng::Union{Nothing, AbstractRNG}=nothing`: Random number generator.
- `seed::Int=0`: Random seed.
- `initialized::Val{Bool}=Val(true)`: If `Val(true)`, returns `(model, parameters, states)`, otherwise just `model`.
References
[1] Howard, Andrew G., et al. "Mobilenets: Efficient convolutional neural networks for mobile vision applications." arXiv preprint arXiv:1704.04861 (2017).
[2] Sandler, Mark, et al. "Mobilenetv2: Inverted residuals and linear bottlenecks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
[3] Howard, Andrew G., et al. "Searching for MobileNetV3." arXiv preprint arXiv:1905.02244 (2019).
ResNet(depth::Int; kwargs...)
Create a ResNet model [1].
Arguments
- `depth::Int`: The depth of the ResNet model. Must be one of 18, 34, 50, 101, or 152.
Keyword Arguments
- `pretrained::Bool=false`: If `true`, returns a pretrained model.
- `rng::Union{Nothing, AbstractRNG}=nothing`: Random number generator.
- `seed::Int=0`: Random seed.
- `initialized::Val{Bool}=Val(true)`: If `Val(true)`, returns `(model, parameters, states)`, otherwise just `model`.
References
[1] He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
ResNeXt(depth::Int; kwargs...)
Create a ResNeXt model [1].
Arguments
- `depth::Int`: The depth of the ResNeXt model. Must be one of 50, 101, or 152.
Keyword Arguments
- `pretrained::Bool=false`: If `true`, returns a pretrained model.
- `rng::Union{Nothing, AbstractRNG}=nothing`: Random number generator.
- `seed::Int=0`: Random seed.
- `initialized::Val{Bool}=Val(true)`: If `Val(true)`, returns `(model, parameters, states)`, otherwise just `model`.
References
[1] Xie, Saining, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He. "Aggregated residual transformations for deep neural networks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
Pretrained Models
Tip
Pass `pretrained=true` to the model constructor to load the pretrained weights.
MODEL | TOP 1 ACCURACY (%) | TOP 5 ACCURACY (%) |
---|---|---|
AlexNet() | 54.48 | 77.72 |
VGG(11) | 67.35 | 87.91 |
VGG(13) | 68.40 | 88.48 |
VGG(16) | 70.24 | 89.80 |
VGG(19) | 71.09 | 90.27 |
VGG(11; batchnorm=true) | 69.09 | 88.94 |
VGG(13; batchnorm=true) | 69.66 | 89.49 |
VGG(16; batchnorm=true) | 72.11 | 91.02 |
VGG(19; batchnorm=true) | 72.95 | 91.32 |
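
For readers unfamiliar with the two metrics in the table, the sketch below shows one common way to compute top-k accuracy in plain Julia. The function name `topk_accuracy`, the `nclasses × nsamples` score layout, and the toy data are illustrative assumptions; this is not the evaluation code used to produce the numbers above.

```julia
# Percentage of samples whose true label is among the k highest-scoring classes.
# `scores` is nclasses × nsamples; `labels` holds 1-based class indices.
function topk_accuracy(scores::AbstractMatrix, labels::AbstractVector{<:Integer}, k::Integer)
    hits = 0
    for (j, label) in enumerate(labels)
        # Indices of the k largest scores for sample j.
        topk = partialsortperm(view(scores, :, j), 1:k; rev=true)
        hits += label in topk
    end
    return 100 * hits / length(labels)
end

# Toy example: 3 classes × 3 samples.
scores = Float32[0.1 0.7 0.2;
                 0.6 0.2 0.3;
                 0.3 0.1 0.5]
labels = [2, 1, 2]

top1 = topk_accuracy(scores, labels, 1)  # third sample is missed at k = 1
top2 = topk_accuracy(scores, labels, 2)  # all three labels fall in the top 2
```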
Preprocessing
All the pretrained models require input images to be normalized per channel with `mean = [0.485f0, 0.456f0, 0.406f0]` and `std = [0.229f0, 0.224f0, 0.225f0]`.
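
A minimal sketch of this normalization in plain Julia follows. It assumes a width × height × channels × batch array with three RGB channels and pixel values already scaled to [0, 1]; none of these is stated on this page. The helper name `normalize_batch` is hypothetical.

```julia
# Subtract the per-channel mean and divide by the per-channel std.
# Assumes `x` is W × H × 3 × N with values in [0, 1] (RGB channel order).
function normalize_batch(x::AbstractArray{<:Real,4};
                         mean=Float32[0.485, 0.456, 0.406],
                         std=Float32[0.229, 0.224, 0.225])
    # Reshape so the statistics broadcast along the channel dimension.
    m = reshape(mean, 1, 1, 3, 1)
    s = reshape(std, 1, 1, 3, 1)
    return (Float32.(x) .- m) ./ s
end

x = rand(Float32, 224, 224, 3, 2)  # stand-in for a batch of two images
xn = normalize_batch(x)
```

The result has the same shape as the input and can be passed to a pretrained model in place of the raw batch.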