We updated the API to remove the places where a default device was chosen implicitly, which could cause bugs from device mismatches. The API is now more explicit about where the device must be specified, which aligns well with the Rust philosophy.
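To make the design choice concrete, here is a minimal, dependency-free sketch of what "explicit device" means in practice. The `Device` and `Tensor` types below are hypothetical stand-ins written for this illustration; they are not Burn's actual types or signatures. The point is only that every constructor takes a device argument, so a mismatch becomes visible instead of being hidden behind a default.

```rust
// Hypothetical types for illustration only; this is not Burn's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Device {
    Cpu,
    Gpu(usize),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Tensor {
    data: Vec<f32>,
    device: Device,
}

impl Tensor {
    /// The device is a required argument: there is no implicit default.
    pub fn zeros(len: usize, device: &Device) -> Self {
        Tensor { data: vec![0.0; len], device: *device }
    }

    /// Builds a tensor from existing values on the given device.
    pub fn from_data(data: Vec<f32>, device: &Device) -> Self {
        Tensor { data, device: *device }
    }

    /// Returns the device this tensor lives on.
    pub fn device(&self) -> Device {
        self.device
    }

    /// Element-wise addition that rejects operands on different devices.
    pub fn add(&self, other: &Tensor) -> Result<Tensor, String> {
        if self.device != other.device {
            return Err(format!(
                "device mismatch: {:?} vs {:?}",
                self.device, other.device
            ));
        }
        if self.data.len() != other.data.len() {
            return Err("length mismatch".to_string());
        }
        let data = self
            .data
            .iter()
            .zip(&other.data)
            .map(|(a, b)| a + b)
            .collect();
        Ok(Tensor { data, device: self.device })
    }
}
```

With this shape, creating two tensors with `Tensor::zeros(2, &Device::Cpu)` and `Tensor::zeros(2, &Device::Gpu(0))` and adding them returns an error rather than silently mixing devices.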
The book has seen various updates, including a new section on dataset manipulation requested by the community. We also plan to create a contributor guide, to help new contributors get familiar with the internals of the project.
A lot of work has gone into our JIT compiler, which can fuse WebGPU tensor operations into a single kernel for a significant performance improvement. We added automatic vectorization of element-wise operations, as well as integration with autotune. Additionally, kernels created on the fly can now be executed in place to reduce memory usage. We now support running multiple optimization streams independently, which helps when metric updates and training run on the same device but on different threads. This feature isn't enabled by default yet, but you can enable it with a backend decorator. Future releases will add more optimizations to the compiler, and we will probably ship it by default. We also plan to add other compilation targets in addition to WebGPU, namely Vulkan and CUDA.
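The idea behind fusing element-wise operations can be sketched on the CPU in plain Rust. This is a toy model, not the JIT compiler itself: the `ElemOp` enum and both functions are names made up for this example. The unfused version makes one full pass and one allocation per operation, while the fused version applies the whole chain to each element in a single pass and writes the result in place.

```rust
/// Element-wise operations supported by this toy example (hypothetical).
#[derive(Debug, Clone, Copy)]
pub enum ElemOp {
    AddScalar(f32),
    MulScalar(f32),
    Relu,
}

impl ElemOp {
    /// Applies the operation to a single element.
    fn apply(self, x: f32) -> f32 {
        match self {
            ElemOp::AddScalar(s) => x + s,
            ElemOp::MulScalar(s) => x * s,
            ElemOp::Relu => x.max(0.0),
        }
    }
}

/// Unfused baseline: each operation makes a full pass over the data
/// and allocates a new intermediate buffer.
pub fn run_unfused(input: &[f32], ops: &[ElemOp]) -> Vec<f32> {
    let mut current = input.to_vec();
    for op in ops {
        current = current.iter().map(|&x| op.apply(x)).collect();
    }
    current
}

/// Fused version: a single pass applies the whole chain of operations
/// to each element and overwrites the input buffer in place.
pub fn run_fused_in_place(buffer: &mut [f32], ops: &[ElemOp]) {
    for x in buffer.iter_mut() {
        *x = ops.iter().fold(*x, |acc, op| op.apply(acc));
    }
}
```

Both functions compute the same values; the fused one simply avoids the intermediate buffers and the repeated traversals, which is the kind of saving a fused GPU kernel aims for.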
One of the major quality-of-life improvements is the new PyTorch recorder, which allows loading PyTorch weights into Burn modules. It also supports regular expressions for dynamically mapping the weights to your Burn model when its structure differs from the PyTorch implementation.
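To illustrate what remapping weight names involves, here is a simplified, standard-library-only sketch. It is not the recorder's API: `PrefixRule` and `remap_keys` are hypothetical names, the example keys are invented, and it uses literal prefix replacement as a stand-in for regular expressions so that it needs no external crate. The principle is the same: rewrite the keys of the loaded weights so that they line up with the field names of the target model.

```rust
use std::collections::HashMap;

/// Hypothetical rule: keys starting with `from` get that prefix replaced by `to`.
/// A literal prefix is used here in place of a regular expression.
pub struct PrefixRule {
    pub from: &'static str,
    pub to: &'static str,
}

/// Applies the first matching rule to each key.
/// Keys that match no rule are kept unchanged.
pub fn remap_keys(
    weights: HashMap<String, Vec<f32>>,
    rules: &[PrefixRule],
) -> HashMap<String, Vec<f32>> {
    weights
        .into_iter()
        .map(|(key, value)| {
            let renamed = rules.iter().find_map(|rule| {
                key.strip_prefix(rule.from)
                    .map(|rest| format!("{}{}", rule.to, rest))
            });
            (renamed.unwrap_or(key), value)
        })
        .collect()
}
```

For example, with a rule mapping the prefix `features.0.` to `conv1.`, the key `features.0.weight` becomes `conv1.weight`, while an unrelated key such as `fc.bias` passes through untouched.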
With this new release, we spent a lot of time solidifying our infrastructure and testing our framework on additional operating systems (Windows and macOS). Overall, our CI is more mature and makes it easier to ensure the quality and correctness of every code change across backends and operating systems.
Release Notes: https://github.com/tracel-ai/burn/releases/tag/v0.12.0
Burn Book: https://burn.dev/book/