I'm testing it on a 3-layer perceptron, so memory is less of an issue, but `__slots__` seems to speed up training by 5%! I pushed the implementation to a branch: https://github.com/noway/yagrad/blob/slots/train.py
Unfortunately it extends the line count past 100 lines, so I'll keep it separate from `main`.
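For anyone who hasn't used `__slots__`, here is a rough sketch of the idea. The `Value` class and its attribute names are hypothetical and only for illustration; this is not the code on the branch.

```python
class Value:
    """Hypothetical scalar autograd node (illustrative names, not from the branch)."""

    # Declaring __slots__ removes the per-instance __dict__, so attribute
    # access goes through fixed slots and each node uses less memory.
    __slots__ = ("data", "grad", "_prev", "_backward")

    def __init__(self, data, prev=()):
        self.data = data
        self.grad = 0.0
        self._prev = prev
        self._backward = lambda: None

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # Product rule: each input receives the other input's value
            # times the gradient flowing into the output.
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out


a, b = Value(2.0), Value(3.0)
c = a * b
c.grad = 1.0
c._backward()
```

The trade-off is that you can no longer attach arbitrary attributes to an instance, which is usually fine for a small autograd node. Any speedup will depend on the Python version and the workload.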
I have my email address on my website (which is in my bio) - don't hesitate to reach out. Cheers!
The added benefit is that all the variables become complex-valued. As long as your loss is real-valued, you should be able to backprop through your net and update the parameters.
PyTorch docs mention that complex variables may be used "in audio and other fields": https://pytorch.org/docs/stable/notes/autograd.html#how-is-w...
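To make the "real-valued loss" point concrete, here is a tiny sketch in plain Python. It uses a hand-derived gradient for a single complex weight rather than any autograd engine, and the function name, learning rate, and step count are made up for illustration.

```python
def fit_complex_weight(x, y, lr=0.05, steps=50):
    """Fit a single complex weight w so that w * x is close to y (illustrative only)."""
    w = 0j
    for _ in range(steps):
        r = w * x - y  # complex residual
        # The loss abs(r) ** 2 is real even though w, x and y are complex.
        # Its gradient with respect to (Re w, Im w), packed into one complex
        # number, works out to 2 * r * conj(x) (derived by hand here).
        grad = 2 * r * x.conjugate()
        w -= lr * grad  # ordinary gradient descent step
    return w


w_true = 0.5 - 1j
x = 1 + 2j
w = fit_complex_weight(x, w_true * x)
```

Because the loss is a real number, "downhill" is well defined and the usual update rule carries over; with a complex-valued loss there would be no single quantity to minimize.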