\documentstyle[12pt,a4]{article}
\include{/home/tiaya/latex/yanglic/defs}

\title{Cache Management for the IQMR Method on Massively Parallel Computers}

\author{\em Tianruo Yang $\dagger$, Hai-Xiang Lin $\ddagger$ \\
\em $\dagger$ Department of Computer and Information Science \\
\em Link\"oping University, S-581 83, Link\"oping, Sweden \\
\em $\ddagger$ Department of Technical Mathematics and Computer Science \\
\em TU Delft, P.O. Box 356, 2600 GA Delft, The Netherlands}

%\date{\today}
\date{}

\begin{document}

\maketitle

\begin{abstract}
For the solution of linear systems of equations with sparse unsymmetric
coefficient matrices, we have proposed an improved version of the
quasi-minimal residual method (IQMR), obtained by rescheduling the
algorithm without changing its numerical stability. The algorithm is
derived such that all inner products and matrix-vector multiplications
of a single iteration step are independent, and the communication time
required for the inner products can be overlapped efficiently with
computation time. The cost of global communication on parallel computers
can therefore be significantly reduced. In this paper we investigate an
efficient parallelization technique based on a cache management
strategy, involving a cache mirror, for use in matrix-vector
multiplications. The cache mirror allows our parallel IQMR
implementation to avoid cache conflicts, and we show that using
noncached loads further avoids cache conflicts when accessing nonreused
data. The cache mirror can also be used to good effect for vector
updates and inner products. Experimental results demonstrate much better
parallel performance than conventional Fortran or C implementations on
RISC-based MIMD massively parallel computers.
\end{abstract}

\end{document}