Optimal Performance Without Static Graphs by Fusing Tensor Operation Streams (burn.dev)

5 points by nathanielsimard 2 years ago | 1 comment

Happy to share what we have been working on lately. The blog post explores Burn's tensor operation stream strategy, which optimizes models written against an eager API by generating custom kernels with fused operations. Our custom GELU experiment shows a speedup of up to 78 times on our WGPU backend.
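For readers unfamiliar with operation fusion, here is a minimal sketch of the idea in plain Rust on CPU slices. It is not Burn's API and not how Burn generates kernels. The function names are hypothetical, and it uses the common tanh approximation of GELU as an assumption, which may differ from the formulation in the blog post. The unfused version makes one pass and one intermediate buffer per primitive operation, roughly like launching one kernel per tensor op. The fused version does the same arithmetic in a single pass.

```rust
// Illustrative only: plain Rust on CPU slices, not Burn's API.
// Assumes the tanh approximation of GELU:
//   gelu(x) ~= 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
const SQRT_2_OVER_PI: f32 = 0.797_884_6;
const COEFF: f32 = 0.044_715;

/// Unfused: every primitive op walks the data and allocates its own
/// intermediate buffer, similar to one kernel launch per tensor operation.
pub fn gelu_unfused(x: &[f32]) -> Vec<f32> {
    let cubed: Vec<f32> = x.iter().map(|&v| v * v * v).collect();
    let inner: Vec<f32> = x
        .iter()
        .zip(&cubed)
        .map(|(&v, &c)| SQRT_2_OVER_PI * (v + COEFF * c))
        .collect();
    let tanh: Vec<f32> = inner.iter().map(|&v| v.tanh()).collect();
    x.iter()
        .zip(&tanh)
        .map(|(&v, &t)| 0.5 * v * (1.0 + t))
        .collect()
}

/// Fused: the same arithmetic in one pass with no intermediate buffers,
/// which is what a single fused kernel would compute per element.
pub fn gelu_fused(x: &[f32]) -> Vec<f32> {
    x.iter()
        .map(|&v| {
            let cubed = v * v * v;
            let inner = SQRT_2_OVER_PI * (v + COEFF * cubed);
            0.5 * v * (1.0 + inner.tanh())
        })
        .collect()
}
```

Calling `gelu_unfused(&input)` and `gelu_fused(&input)` on the same slice should give matching results. Only the number of passes over memory differs, and on a GPU that memory traffic is typically where the cost of an unfused eager API comes from.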
