# Updating to Lux v1
Lux v1 is a Major Release, mostly to signify the stability of the API. On this page, we list the concrete set of changes that need to be made to your code to update to Lux v1, along with some exciting new features that were added as part of this release.
## LuxLib.jl

### Breaking Changes

- The old deprecated API with keyword arguments has been removed. See the new docs in LuxLib API for more details.
- The default `dims` for `layernorm` has been changed to exclude the batch dimension.
### New Major Features

- Dense layers now support the CUDA backend for Enzyme (starting v1.1). Wider support for other operations with Enzyme + CUDA is being actively worked on.
## LuxCore.jl

### Breaking Changes

- `AbstractExplicitLayer` has been renamed to `AbstractLuxLayer` (see the migration sketch after this list).
- `AbstractExplicitContainerLayer` behaviour:
  - This has been renamed to `AbstractLuxContainerLayer`.
  - Previously, `AbstractExplicitContainerLayer{(:a,)}` (i.e. singleton containers) would produce default initial parameters and states without wrapping them in a `NamedTuple{(:a,)}`. This was inconsistent with non-singleton containers and was a source of confusion. With v1 we return `(; a = <parameters>)` and `(; a = <states>)` by default. See `AbstractLuxWrapperLayer` for a replacement of this functionality.
- `inputsize` has been removed since it was ambiguous and not used anywhere.
- Changes to `outputsize`:
  - The single-argument version has been removed. See LuxCore.jl Pull Request 43 for more details on the rationale behind this change.
  - The fallback implementation has been moved to `Lux.jl` (i.e. users using Lux shouldn't see a difference, but if `Lux.jl` isn't loaded, this function will throw an error).
    - Internally this uses a `NilArray` that is able to compute sizes without actually running the computation.
- `Functors` and `Setfield` have been made into optional dependencies. Certain `LuxCore` functionality that relies on them will throw an error if these packages are not loaded.
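For illustration, here is a minimal custom layer written against the v1 names. `MyScale` is a hypothetical layer used only for this sketch, not part of LuxCore:

```julia
using LuxCore, Random

# Before v1 this would have subtyped `AbstractExplicitLayer`.
struct MyScale <: LuxCore.AbstractLuxLayer
    dims::Int
end

# Parameters/states are created externally, never stored in the layer struct.
LuxCore.initialparameters(rng::AbstractRNG, l::MyScale) = (; scale=randn(rng, Float32, l.dims))
LuxCore.initialstates(::AbstractRNG, ::MyScale) = NamedTuple()

# Layers are called as `layer(x, ps, st)` and return `(output, new_state)`.
(l::MyScale)(x, ps, st) = ps.scale .* x, st

rng = Random.default_rng()
layer = MyScale(4)
ps, st = LuxCore.setup(rng, layer)
y, _ = layer(ones(Float32, 4), ps, st)
```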
### New Major Features

- Introduction of `AbstractLuxWrapperLayer`. This behaves exactly like the old singleton container. For example, the old `AbstractExplicitContainerLayer{(:a,)}` is equivalent to `AbstractLuxWrapperLayer{:a}`.
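A minimal sketch of the new wrapper type; `TimesTwo` is hypothetical, and Lux.jl is assumed to be loaded for `Dense`:

```julia
using Lux, LuxCore, Random

# `:layer` names the field that holds the wrapped layer.
struct TimesTwo{L} <: LuxCore.AbstractLuxWrapperLayer{:layer}
    layer::L
end

function (t::TimesTwo)(x, ps, st)
    y, st = t.layer(x, ps, st)
    return 2 .* y, st
end

rng = Random.default_rng()
model = TimesTwo(Dense(2 => 3))
ps, st = LuxCore.setup(rng, model)
# `ps` exposes `:weight`/`:bias` directly, i.e. no extra `(; layer = ...)` level,
# matching the old singleton-container behaviour.
```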
## WeightInitializers.jl

This was a major release to signify the stability of the API. There were no breaking changes. We now support a wider range of RNG types; see Supported RNG Types for more details.
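For example (a small sketch, not an exhaustive list of supported RNG types), initializers accept the RNG as their first argument:

```julia
using WeightInitializers, Random

w = kaiming_normal(Xoshiro(42), Float32, 3, 4)  # 3×4 Float32 matrix
b = zeros32(Xoshiro(42), 8)                     # length-8 Float32 vector of zeros
```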
## MLDataDevices.jl

This is the most aggressive change that was made. We renamed the `LuxDeviceUtils.jl` package to `MLDataDevices.jl` to allow non-Lux packages to use this shared device-management abstraction.

### Deprecation of LuxDeviceUtils.jl

This also marks the deprecation of the `LuxDeviceUtils.jl` package. We won't be making any updates to that package, including bug fixes. All users should switch to `MLDataDevices.jl` instead.
### Breaking Changes

- `Lux(___)Device` objects have been renamed to `(___)Device`. For example, `LuxCUDADevice` has been renamed to `CUDADevice`.
- `Lux(___)Adaptor` objects have been removed. The corresponding `Device` objects should be used directly instead (see the sketch after this list).
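A small before/after sketch with the renamed device objects:

```julia
using MLDataDevices

cdev = cpu_device()  # returns a CPUDevice (previously LuxCPUDevice)
gdev = gpu_device()  # returns e.g. a CUDADevice when a functional GPU backend is loaded

x = rand(Float32, 3, 4)
x_dev  = gdev(x)     # previously done via the (now removed) adaptor objects
x_host = cdev(x_dev)
```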
### New Major Features

- `DeviceIterator` provides a generalization of `CUDA.CuIterator` and works for all backends and more data types (using `Functors.jl`).
- `MLUtils.DataLoader |> gdev` now returns a `DeviceIterator` instead of being a no-op (see the sketch after this list).
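A sketch of the new `DataLoader` behaviour (array shapes and batch size are arbitrary):

```julia
using MLDataDevices, MLUtils

gdev = gpu_device()
X, Y = rand(Float32, 4, 128), rand(Float32, 1, 128)
loader = DataLoader((X, Y); batchsize=16)

# `gdev(loader)` (equivalently `loader |> gdev`) now returns a `DeviceIterator`
# that moves one batch at a time to the device instead of being a no-op.
for (x, y) in gdev(loader)
    # `x` and `y` live on the selected device here
end
```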
## Lux.jl

### Breaking Changes (Removed Functionality)

- The direct reexport of `NNlib` has been removed. We reexport selected functionality from `NNlib`; directly load `NNlib` if you need the other functions.
- Flattening of `Chain` layers has been removed, along with the corresponding `disable_optimizations` kwarg.
- Some layers overloaded `Base.keys`; these overloads have been removed. They were mostly undocumented and weren't supposed to be used outside of the `Lux.jl` package.
- `Training.TrainState` construction with an `rng` has been removed (see the sketch after this list).
- Older versions of Preferences have been removed.
- `disable_stacktrace_truncation!` has been removed. From Julia 1.9 onwards, stacktrace truncation is enabled by default.
- Certain Experimental features were present outside the `Lux.Experimental` module. These have been removed; use them via `Lux.Experimental` instead. Run Julia with `depwarn` set to `error` and Lux `v0.5` to see the deprecations.
- `Lux.Experimental.@layer_map` is no longer needed and has been removed. The name of the variable prevented writing generic functions and is no longer prepended to the `KeyPath`. See the docstring of `Lux.Experimental.layer_map` for more details.
- The `allow_fast_activation` kwarg has been removed completely. Pass an anonymous function as the activation to prevent internal modifications to the activation function.
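A short sketch covering the `TrainState` and `allow_fast_activation` items above; it assumes Optimisers.jl is available for the optimizer:

```julia
using Lux, NNlib, Optimisers, Random  # load NNlib directly for functions Lux no longer reexports

rng = Random.default_rng()

# An anonymous activation opts out of internal activation rewriting,
# replacing the removed `allow_fast_activation=false`.
model = Dense(2 => 3, x -> tanh(x))
ps, st = Lux.setup(rng, model)

# `TrainState` no longer takes an `rng`; construct it from the already set-up `ps`/`st`.
tstate = Lux.Training.TrainState(model, ps, st, Optimisers.Adam(0.001f0))
```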
### Breaking Changes (Moved Functionality)

- `Lux.Experimental.Training` has been moved to `Lux.Training`. We guarantee SemVer on this new module.
- `Lux.cpu` and `Lux.gpu` have been removed. Use `cpu_device` and `gpu_device` instead.
- `Experimental.@compact` can be used directly via `@compact` now.
- `Experimental.StatefulLuxLayer` has been moved to `Lux.StatefulLuxLayer`.
- The `st_fixed_path` kwarg has been removed from `Lux.StatefulLuxLayer`; instead use it as `StatefulLuxLayer{st_fixed_path}(...)`.
- Strings as inputs to `Lux.Experimental.layer_map` and `Lux.Experimental.@debug_mode` have been removed; use `Functors.KeyPath` instead.
- `CrossCor` has been removed. Use `Conv(args...; kwargs..., cross_correlation=true)` instead (see the sketch after this list).
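A sketch of the replacements mentioned above (layer sizes are arbitrary):

```julia
using Lux, Random

rng = Random.default_rng()

# `Lux.cpu` / `Lux.gpu` are gone; use the device functions instead.
cdev, gdev = cpu_device(), gpu_device()

# `CrossCor` is gone; `Conv` with `cross_correlation=true` is the replacement.
layer = Conv((3, 3), 3 => 16, relu; cross_correlation=true)

# `StatefulLuxLayer` now lives in `Lux`, with the old `st_fixed_path` kwarg
# expressed as a type parameter.
model = Dense(2 => 3)
ps, st = Lux.setup(rng, model)
smodel = StatefulLuxLayer{true}(model, ps, st)
y = smodel(rand(rng, Float32, 2, 4))
```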
### Breaking Changes (Changes in Defaults)

- `Conv` and `ConvTranspose` use an initialization based on the activation function, taken from PyTorch. PyTorch assumes the activation function is `leakyrelu` when computing the gain; we instead compute the gain based on the activation function passed in to the layer.
- `Upsample` now has an `align_corners` keyword argument, which defaults to `false`. Previously this was always `true` (see the sketch after this list for restoring the old defaults).
- `Dense` and `Bilinear` have updated default initializations to align with the defaults from PyTorch. See the documentation for more details.
- `InstanceNorm` now defaults to `affine=false` instead of `affine=true`.
- `Embedding` now defaults to `init_weight=rand32` instead of `init_weight=randn32`.
- Recurrent cells (`RNNCell`, `LSTMCell`, and `GRUCell`) now have different default initializations. See the documentation for more details.
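Where the pre-v1 behaviour is wanted, it can be requested explicitly. A sketch using the keyword names listed above:

```julia
using Lux

inorm = InstanceNorm(8; affine=true)                      # v1 default: affine=false
embed = Embedding(100 => 16; init_weight=randn32)         # v1 default: init_weight=rand32
up    = Upsample(:bilinear; scale=2, align_corners=true)  # v1 default: align_corners=false
```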
### New Features

- `InstanceNorm` now supports tracking statistics.
- `RNNCell` and `LSTMCell` add `bias_ih` and `bias_hh` to the parameters to align with PyTorch. Both are controlled using `init_bias` and `use_bias`.
- `ConvTranspose` allows `flipkernel=true` via `cross_correlation=true`. This makes it efficient for MIOpen.
- `ConvTranspose` now has an `outpad` keyword argument, which is used to increase the size of the output in the desired dimensions (see the sketch after this list).
- Pooling layers based on the lp-norm have been added: `LPPool`, `GlobalLPPool`, and `AdaptiveLPPool`.
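A sketch of the new `ConvTranspose` keywords; the `track_stats` keyword name for `InstanceNorm` is an assumption based on `BatchNorm`:

```julia
using Lux

# `outpad` enlarges the output along the chosen dimensions, and
# `cross_correlation=true` flips the kernel (efficient for MIOpen).
ct = ConvTranspose((3, 3), 16 => 8; stride=2, outpad=1, cross_correlation=true)

# Statistics tracking for `InstanceNorm` (keyword name assumed to mirror `BatchNorm`).
inorm = InstanceNorm(8; affine=true, track_stats=true)
```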