Technical Report: pdf (1576K).
Note: the version at http://i.stanford.edu/pub/cstr/reports/cs/tr/70/157/CS-TR-70-157.pdf has scanning errors.
The normal algorithm requires n3 + O(n2) multiplications and about the same number of additions. Winograd's algorithm almost halves the number of multiplications at the expense of more additions. Strassen's algorithm reduces the total number of operations to O(n2.82) by recursively multiplying 2n × 2n matrices using seven n × n matrix multiplications.
Floating-point error bounds are obtained for both Strassen's and Winograd's methods. It is shown that Strassen's method satisfies a certain numerical stability property (albeit weaker than that satisfied by the normal method); and that scaling is essential for numerical accuracy using Winograd's method.
In practical cases, for moderate n, Winograd's method appears to be slightly faster than the other two methods, but the gain is, at most, about 20 percent. Strassen's method should be faster for sufficiently large n, and this will be important in the future as memory sizes increase.
An attempt to generalize Strassen's method is described.
The error analysis of Strassen's algorithm shows that, although it is not as stable componentwise as the normal algorithm, it is sufficiently stable to be usable in implementations of Level 3 Basic Linear Algebra Subroutines (BLAS). (This was not pointed out in the report since Level 3 BLAS had not been invented at the time.) For more recent results along similar lines, see J. Demmel, I. Dumitriu and O. Holtz, "Fast linear algebra is stable", Numer. Math. 108 (2007), 59-91.
For an error analysis of the application of Winograd's identity to LU factorization, see Brent .
The conjecture (on page 42) that 3x3 matrix multiplication can be performed with 23 multiplications was proved by Julian D. Laderman [Bull. Amer. Math. Soc. 82, 1, Jan. 1976, 126-128].
Recently, a paper by Fawzi et al [Nature Vol 610, 6 Oct 2022, pg. 47], discusses some new algorithms, notably one with 47 multiplications for 4x4 matrix multiplication (improving on Strassen's 49). However, it should be noted that the new algorithms only seem to work over the finite field GF(2) with two elements, and not over the rational, real, or complex fields. Thus, their application to scientific computation is not clear.
For other developments see Chapters 14-15 of Burgisser, Clausen and Shokrollahi, Algebraic Complexity Theory, Springer, 1997; also recent papers by Chris Umans et al on an interesting approach via group algebras.
Go to next publication
Return to Richard Brent's index page