# Updating to Lux v1
Lux v1 is a major release, made primarily to signify the stability of the API. This page lists the concrete changes needed to update your code to Lux v1, along with some exciting new features added as part of this release.
## LuxLib.jl

### Breaking Changes
- The old deprecated API with keyword arguments has been removed. See the new docs in the LuxLib API for more details.
- The default `dims` for `layernorm` has been changed to exclude the batch dimension (see the sketch after this list).
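A minimal sketch of the change, assuming `nothing` for `scale` and `bias` and the positional `σ, dims` form of the v1 API (check the exact argument order against the LuxLib API docs for your version):

```julia
using LuxLib

x = randn(Float32, 4, 6, 8)  # features × … × batch

# v1 default: normalize over all dimensions except the last (batch) one.
y = layernorm(x, nothing, nothing)

# To keep normalizing over every dimension (the old default), pass `dims`
# explicitly. This assumes the positional `σ, dims` arguments of the v1 API.
y_old = layernorm(x, nothing, nothing, identity, 1:ndims(x))
```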
### New Major Features
- Dense layers now support the CUDA backend for Enzyme (starting `v1.1`). Wider support for other operations with Enzyme + CUDA is being actively worked on.
## LuxCore.jl

### Breaking Changes
- `AbstractExplicitLayer` has been renamed to `AbstractLuxLayer`.
- `AbstractExplicitContainerLayer` behaviour:
  - This has been renamed to `AbstractLuxContainerLayer`.
  - Previously, `AbstractExplicitContainerLayer{(:a,)}` (i.e. singleton containers) would produce default initial parameters and states without wrapping them in a `NamedTuple{(:a,)}`. This was inconsistent with non-singleton containers and was a source of confusion. With v1 we return `(; a = <parameters>)` and `(; a = <states>)` by default. See `AbstractLuxWrapperLayer` for a replacement of this functionality.
- `inputsize` has been removed since it was ambiguous and not used anywhere.
- Changes to `outputsize`:
  - The single-argument version has been removed. See LuxCore.jl Pull Request 43 for more details on the rationale behind this change.
  - The fallback implementation has been moved to `Lux.jl` (i.e. users using Lux shouldn't see a difference, but if `Lux.jl` isn't loaded, this function will throw an error).
    - Internally this uses a `NilArray` that is able to compute sizes without actually running the computation.
- `Functors` and `Setfield` have been made optional dependencies. Certain `LuxCore` functionality that relies on these packages will throw an error if they are not loaded.
### New Major Features
- Introduction of `AbstractLuxWrapperLayer`. This behaves exactly like the old singleton container. For example, the old `AbstractExplicitContainerLayer{(:a,)}` is equivalent to `AbstractLuxWrapperLayer{:a}`. See the sketch below.
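A minimal sketch of migrating a singleton container (the `Wrapped` type here is illustrative, not part of the API):

```julia
using Lux, Random

# Pre-v1: struct Wrapped{L} <: AbstractExplicitContainerLayer{(:layer,)}
struct Wrapped{L} <: AbstractLuxWrapperLayer{:layer}
    layer::L
end

rng = Random.default_rng()
model = Wrapped(Dense(2 => 3))
ps, st = Lux.setup(rng, model)
# `ps` is the inner Dense layer's parameters directly, i.e. (; weight, bias),
# with no extra (; layer = ...) wrapper, matching the old singleton behaviour.
```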
## WeightInitializers.jl
This was a major release to signify the stability of the API. There were no breaking changes. We now support a wider range of RNG types; see Supported RNG Types for more details.
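For example (a small sketch; the specific RNG types shown are just common standard-library choices):

```julia
using WeightInitializers, Random

# Initializers accept the RNG as the first argument:
w1 = kaiming_normal(Xoshiro(42), Float32, 3, 4)
w2 = glorot_uniform(Random.TaskLocalRNG(), 16, 8)
```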
## MLDataDevices.jl
This is the most aggressive change that was made. We renamed the LuxDeviceUtils.jl package to MLDataDevices.jl to allow non-Lux packages to use this shared device-management abstraction.
### Deprecation of LuxDeviceUtils.jl
This also marks the deprecation of the LuxDeviceUtils.jl package. We won't be making any updates to that package, including fixing any bugs. All users should switch to MLDataDevices.jl instead.
### Breaking Changes
- `Lux(___)Device` objects have been renamed to `(___)Device`. For example, `LuxCUDADevice` has been renamed to `CUDADevice`.
- `Lux(___)Adaptor` objects have been removed. The corresponding `Device` objects should be used directly instead, as sketched below.
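A minimal migration sketch (assuming a functional GPU backend package is loaded; `gpu_device()` warns and falls back to the CPU otherwise):

```julia
using MLDataDevices

# Old: dev = LuxCUDADevice(); plus a LuxCUDAAdaptor to move data.
dev = gpu_device()           # e.g. CUDADevice() when CUDA.jl is loaded and functional
x = rand(Float32, 3, 4)
x_dev = dev(x)               # Device objects are callable; no Adaptor needed
x_cpu = cpu_device()(x_dev)  # and CPUDevice() moves data back
```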
### New Major Features
- `DeviceIterator` provides a generalization of `CUDA.CuIterator` and works for all backends and more data types (using `Functors.jl`). `MLUtils.DataLoader |> gdev` now returns a `DeviceIterator` instead of being a no-op; see the sketch below.
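A minimal sketch of the new behaviour (the data here is random and purely illustrative):

```julia
using MLDataDevices, MLUtils

gdev = gpu_device()
loader = DataLoader((randn(Float32, 2, 128), randn(Float32, 1, 128)); batchsize=32)

# Previously `loader |> gdev` was a no-op; it now returns a DeviceIterator
# that moves one batch at a time to the device.
for (x, y) in gdev(loader)
    # `x` and `y` are already on the device here.
end
```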
## Lux.jl

### Breaking Changes (Removed Functionality)
- Direct reexport of `NNlib` has been removed. We reexport selected functionality from `NNlib`. Directly load `NNlib` if you need to use the other functions.
- Flattening of `Chain` layers has been removed, along with the corresponding `disable_optimizations` kwarg.
- Some layers overloaded `Base.keys`; these overloads have been removed. They were mostly undocumented and weren't supposed to be used outside of the `Lux.jl` package.
- `Training.TrainState` construction with `rng` has been removed.
- Older versions of Preferences have been removed.
- `disable_stacktrace_truncation!` has been removed. From Julia 1.9 onwards, stacktrace truncation is enabled by default.
- Certain Experimental features were present outside the `Lux.Experimental` module. These have been removed; use them via `Lux.Experimental` instead. Run Julia with `depwarn` as `error` on Lux `v0.5` to see the deprecations.
- `Lux.Experimental.@layer_map` is no longer needed and has been removed. The name of the variable prevents writing generic functions and is no longer prepended to the `KeyPath`. See the docstring of `Lux.Experimental.layer_map` for more details.
- The `allow_fast_activation` kwarg has been removed completely. Pass an anonymous function as the activation to prevent internal modifications to the activation function (see the sketch after this list).
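A short sketch of the `allow_fast_activation` migration:

```julia
using Lux

# Previously: Dense(2 => 3, relu; allow_fast_activation=false)
# v1: an anonymous function opts out of internal activation rewriting.
layer = Dense(2 => 3, x -> relu(x))
```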
### Breaking Changes (Moved Functionality)
- `Lux.Experimental.Training` has been moved to `Lux.Training`. We guarantee SemVer on this new module.
- `Lux.cpu` and `Lux.gpu` have been removed. Use `cpu_device` and `gpu_device` instead.
- `Experimental.@compact` can now be used directly via `@compact`.
- `Experimental.StatefulLuxLayer` has been moved to `Lux.StatefulLuxLayer`.
- The `st_fixed_path` kwarg has been removed from `Lux.StatefulLuxLayer`; instead use it as `StatefulLuxLayer{st_fixed_path}(...)`.
- Strings as inputs to `Lux.Experimental.layer_map` and `Lux.Experimental.@debug_mode` have been removed; use `Functors.KeyPath` instead.
- `CrossCor` has been removed. Use `Conv(args...; kwargs..., cross_correlation=true)` instead. A combined migration sketch follows this list.
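A combined sketch of the moves above (layer sizes are illustrative):

```julia
using Lux, Random

# Lux.cpu / Lux.gpu  ->  cpu_device() / gpu_device()
cdev = cpu_device()
gdev = gpu_device()

# Experimental.@compact  ->  @compact (note the @return inside the block)
model = @compact(dense = Dense(2 => 3)) do x
    @return dense(x)
end

# CrossCor(...)  ->  Conv(...; cross_correlation=true)
layer = Conv((3, 3), 4 => 8, relu; cross_correlation=true)
```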
### Breaking Changes (Changes in Defaults)
- `Conv` and `ConvTranspose` use an initialization based on the activation function, taken from Pytorch. Pytorch assumes the activation function is `leakyrelu` when computing the gain; we instead compute the gain based on the activation function passed to the layer.
- `Upsample` now has an `align_corners` keyword argument, which defaults to `false`. Previously this was always `true` (see the sketch after this list for pinning the old behaviour).
- `Dense` and `Bilinear` have updated default initializations to align with the defaults from Pytorch. See the documentation for more details.
- `InstanceNorm` now defaults to `affine=false` instead of `affine=true`.
- `Embedding` now defaults to `init_weight=rand32` instead of `init_weight=randn32`.
- Recurrent cells: `RNNCell`, `LSTMCell`, and `GRUCell` now have different default initializations. See the documentation for more details.
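Where your code relied on the previous defaults, pin them explicitly. A sketch (layer sizes are illustrative):

```julia
using Lux

inorm = InstanceNorm(8; affine=true)                    # v1 default: affine=false
up = Upsample(:bilinear; scale=2, align_corners=true)   # v1 default: align_corners=false
emb = Embedding(1000 => 16; init_weight=randn32)        # v1 default: init_weight=rand32
```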
### New Features
- `InstanceNorm` now supports tracking statistics.
- `RNNCell` and `LSTMCell` add `bias_ih` and `bias_hh` to the parameters to align with Pytorch. Both are controlled using `init_bias` and `use_bias`.
- `ConvTranspose` allows `flipkernel=true` via `cross_correlation=true`. This makes it efficient for MIOpen.
- `ConvTranspose` now has an `outpad` keyword argument, which is used to increase the size of the output in the desired dimensions (see the sketch after this list).
- Pooling layers based on the lp norm have been added: `LPPool`, `GlobalLPPool`, and `AdaptiveLPPool`.
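A short sketch of `outpad` and the new pooling layers (the `p` keyword for `LPPool` is an assumption; check the pooling API docs):

```julia
using Lux, Random

rng = Random.default_rng()

# `outpad` enlarges the ConvTranspose output in the chosen dimensions:
ct = ConvTranspose((3, 3), 4 => 8; stride=2, outpad=1)
ps, st = Lux.setup(rng, ct)
x = randn(Float32, 15, 15, 4, 2)
y, _ = ct(x, ps, st)  # one extra element per spatial dim vs. outpad=0

# lp-norm pooling:
pool = LPPool((2, 2); p=2)
```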