Fast and accurate algorithms for matrix multiplication using fused multiply-add and their rounding error analysis
Abstract
We propose numerical algorithms for accurate matrix multiplication. When standard floating-point arithmetic fails to deliver sufficient accuracy, high-precision techniques such as pair arithmetic and double-word arithmetic are typically employed, at considerable computational cost. The proposed algorithms attain high accuracy by efficiently exploiting fused multiply-add (FMA) operations, which are less computationally expensive than conventional high-precision methods. Numerical experiments confirm the effectiveness of the proposed algorithms, particularly in environments where dedicated floating-point adder units are available. Although their accuracy is slightly inferior to that of pair arithmetic or double-word arithmetic, the experiments show that they run about two to three times faster. Comparative evaluations against accurate GEMM-based algorithms are conducted on both CPUs and GPUs, and rounding error analyses of the proposed algorithms are also provided.
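As background for the techniques named in the abstract, the sketch below shows how a single FMA recovers the exact rounding error of a floating-point product, and how a pair-arithmetic style dot product can use that error term. It is not the algorithm proposed in the paper, whose details the abstract does not give. The function names `two_prod_fma`, `two_sum`, and `dot_compensated` are hypothetical, and the `Fraction`-based fallback is only an assumed emulation of FMA for Python versions older than 3.13.

```python
from fractions import Fraction

try:
    from math import fma  # available in Python 3.13 and later
except ImportError:
    def fma(a, b, c):
        # Assumed emulation: exact rational a*b + c, rounded once to a double.
        return float(Fraction(a) * Fraction(b) + Fraction(c))


def two_prod_fma(a, b):
    """Return (p, e) with p = fl(a*b) and p + e == a*b exactly (barring underflow)."""
    p = a * b
    e = fma(a, b, -p)  # the FMA computes a*b - p with a single rounding
    return p, e


def two_sum(a, b):
    """Return (s, e) with s = fl(a+b) and s + e == a + b exactly (Knuth's TwoSum)."""
    s = a + b
    t = s - a
    e = (a - (s - t)) + (b - t)
    return s, e


def dot_compensated(x, y):
    """Dot product that accumulates the rounding errors of every product and sum."""
    s = 0.0  # running high-order part
    c = 0.0  # running sum of captured rounding errors
    for xi, yi in zip(x, y):
        p, ep = two_prod_fma(xi, yi)
        s, es = two_sum(s, p)
        c += ep + es
    return s + c
```

For `x = [1e16, 1.0, -1e16]` and `y = [1.0, 1.0, 1.0]`, a plain left-to-right accumulation in binary64 returns `0.0`, whereas `dot_compensated(x, y)` returns `1.0`. The extra operations per element are the cost that the abstract refers to when it describes pair arithmetic as computationally expensive.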