Algorithm 898: efficient multiplication of dense matrices over GF(2). We describe an efficient implementation of a hierarchy of algorithms for multiplication of dense matrices over the field with two elements (𝔽₂). In particular, we present our implementation, in the M4RI library, of Strassen-Winograd matrix multiplication and the "Method of the Four Russians for Multiplication" (M4RM), and compare it against other available implementations. Good performance is demonstrated on AMD's Opteron processor and particularly good performance on Intel's Core 2 Duo processor. The open-source M4RI library is available as a stand-alone package as well as part of the Sage mathematics system. In machine terms, addition in 𝔽₂ is logical XOR and multiplication is logical AND, so a 64-bit machine word allows one to operate on 64 elements of 𝔽₂ in parallel: at most one CPU cycle for 64 parallel additions or multiplications. As such, element-wise operations over 𝔽₂ are relatively cheap. In fact, in this paper we conclude that the actual bottlenecks are memory reads and writes and issues of data locality. We present our empirical findings on minimizing these costs and give an analysis thereof.
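
The word-level parallelism described above can be illustrated with a minimal C sketch. It assumes vectors over 𝔽₂ are packed 64 coefficients per `uint64_t`; the function names (`gf2_vec_add`, `gf2_vec_mul`, `gf2_dot`) are hypothetical and chosen for illustration only. They are not part of the M4RI API, and the sketch does not implement M4RM or Strassen-Winograd.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not the M4RI API.
 * Each uint64_t holds 64 coefficients of a vector over GF(2). */

/* Addition in GF(2) is XOR: one instruction adds 64 coefficients. */
static void gf2_vec_add(uint64_t *dst, const uint64_t *a,
                        const uint64_t *b, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        dst[i] = a[i] ^ b[i];
}

/* Multiplication in GF(2) is AND: one instruction multiplies
 * 64 coefficient pairs element-wise. */
static void gf2_vec_mul(uint64_t *dst, const uint64_t *a,
                        const uint64_t *b, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        dst[i] = a[i] & b[i];
}

/* Dot product over GF(2): AND the words, XOR-accumulate them,
 * then reduce the accumulator to the parity of its set bits. */
static unsigned gf2_dot(const uint64_t *a, const uint64_t *b, size_t nwords)
{
    uint64_t acc = 0;
    for (size_t i = 0; i < nwords; i++)
        acc ^= a[i] & b[i];
    /* Fold 64 bits down to 1; the low bit ends up as the parity. */
    acc ^= acc >> 32;
    acc ^= acc >> 16;
    acc ^= acc >> 8;
    acc ^= acc >> 4;
    acc ^= acc >> 2;
    acc ^= acc >> 1;
    return (unsigned)(acc & 1u);
}
```

Each loop iteration performs one load per operand and a single logical instruction, which is consistent with the abstract's observation that memory traffic, rather than arithmetic, tends to dominate the cost.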
