# Automatic Differentiation
Lux is not an AD package, but it composes well with most of the AD packages available in the Julia ecosystem. This document lists the current level of support for various AD packages in Lux. Additionally, we provide some convenience functions for working with AD.
## Overview
| AD Package | Mode | CPU | GPU | TPU | Nested 2nd Order AD | Support Class | 
|---|---|---|---|---|---|---|
| Reactant.jl[1] +Enzyme.jl | Reverse / Forward | ✔️ | ✔️ | ✔️ | ✔️ | Tier I | 
| ChainRules.jl[2] | Reverse | ✔️ | ✔️ | ❌ | ✔️ | Tier I | 
| Enzyme.jl | Reverse / Forward | ✔️ | ❓[3] | ❌ | ❓[3] | Tier I[4] |
| Zygote.jl | Reverse | ✔️ | ✔️ | ❌ | ✔️ | Tier I | 
| ForwardDiff.jl | Forward | ✔️ | ✔️ | ❌ | ✔️ | Tier I | 
| Mooncake.jl | Reverse | ❓[3] | ❌ | ❌ | ❌ | Tier III |
| ReverseDiff.jl | Reverse | ✔️ | ❌ | ❌ | ❌ | Tier IV | 
| Tracker.jl | Reverse | ✔️ | ✔️ | ❌ | ❌ | Tier IV | 
| Diffractor.jl | Forward | ❓[3] | ❓[3] | ❌ | ❓[3] | Tier IV |
## Recommendations
- For CPU use cases:
  - Use `Reactant.jl` + `Enzyme.jl` for the best performance as well as mutation support. When available, this is the most reliable and fastest option.
  - Use `Zygote.jl` for the best performance without `Reactant.jl`. This is the most reliable and fastest option for CPU for the time being. (We are working on faster Enzyme support for CPU.)
  - Use `Enzyme.jl` if there are mutations in the code and/or `Zygote.jl` fails.
  - If `Enzyme.jl` fails for some reason, (open an issue and) try `ReverseDiff.jl` (possibly with compiled mode).
- For GPU use cases:
  - Use `Reactant.jl` + `Enzyme.jl` for the best performance. This is the most reliable and fastest option, but it presently supports only NVIDIA GPUs; AMD GPUs are currently not supported.
  - Use `Zygote.jl` for the best performance on non-NVIDIA GPUs. This is the most reliable and fastest non-`Reactant.jl` option for GPU for the time being. We are working on supporting `Enzyme.jl` without `Reactant.jl` for GPU as well.
- For TPU use cases:
  - Use `Reactant.jl`. This is the only supported (and fastest) option.
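To make the `Zygote.jl` recommendation concrete, here is a minimal sketch of taking gradients of a small Lux model on CPU. The model architecture, input shape, and loss function below are illustrative assumptions, not something prescribed by this page:

```julia
using Lux, Random, Zygote

# A small illustrative model; any Lux layer composes the same way.
model = Chain(Dense(2 => 4, relu), Dense(4 => 1))

rng = Random.default_rng()
ps, st = Lux.setup(rng, model)  # parameters and layer states

x = randn(rng, Float32, 2, 8)  # a batch of 8 two-dimensional inputs

# Lux models are called as model(x, ps, st) and return (output, new_state),
# so the loss closes over x and st and differentiates w.r.t. the parameters.
loss(ps) = sum(abs2, first(model(x, ps, st)))

grads = only(Zygote.gradient(loss, ps))
```

The same closure works with the other reverse-mode backends listed above; only the call that produces `grads` changes (e.g. `Enzyme.gradient` or a `ReverseDiff` tape).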
## Support Class
- Tier I: These packages are fully supported and extensively tested. They often have special rules to enhance performance. Issues for these backends take the highest priority.
- Tier II: These packages are supported and extensively tested, but they often don't have the best performance. Issues against these backends are less critical, and we fix them when possible. (Some specific edge cases, especially with AMDGPU, are known to fail here.)
- Tier III: We don't know whether these packages currently work with Lux. We'd love to add tests for these backends, but they are not currently a priority.
- Tier IV: These frameworks may have been supported at some point, but they are not maintained at the moment. Whatever code exists for them in Lux may be removed in the future (in a breaking release). Using these frameworks with Lux is not recommended.
## Footnotes
1. Note that `Reactant.jl` is not really an AD package, but a tool for compiling functions, including the use of EnzymeMLIR for AD via `Enzyme.jl`. We have first-class support for using `Reactant.jl` for inference and training when `Enzyme.jl` is used for differentiation.
2. Note that `ChainRules.jl` is not really an AD package, but we have first-class support for packages that use `rrule`s.
3. This feature is supported downstream, but we don't extensively test it to ensure that it works with Lux.
4. Currently, Enzyme outperforms the other AD packages in terms of CPU performance. However, there are some edge cases where it might not work with Lux when not using Reactant. We are working on improving the compatibility. Please report any issues you encounter, and try Reactant if something fails.