
More Efficient Multi-Head Attention Implementations

Summary

The figures below summarize the performance benchmarks (lower is better).

 

Forward pass only

 

Forward and backward pass

 

Forward and backward pass after compilation