\documentstyle[12pt,psfig]{article}
%
\newcommand{\Frac}[2] {\frac{\textstyle #1} {\textstyle #2}}
\newcommand{\Int }    {\displaystyle \int}
\newcommand{\Oint}    {\displaystyle \oint}
\newcommand{\Sum }    {\displaystyle \sum}
\newcommand{\dt}[1]{\Frac{\partial #1}{\partial t}}
\newcommand{\dx}[1]{\Frac{\partial #1}{\partial x}}
\newcommand{\dxx}[1]{\Frac{\partial^2 #1}{\partial x^{2}}}
\newcommand{\dxy}[1]{\Frac{\partial^2 #1}{\partial x\partial y}}
\newcommand{\dy}[1]{\Frac{\partial #1}{\partial y}}
\newcommand{\dyy}[1]{\Frac{\partial^2 #1}{\partial y^{2}}}
\newcommand{\dxi}[1]{\Frac{\partial^2 #1}{\partial x_i^2}}
\newcommand{\pmb}[1]  {{\underline{#1}}}
\newcommand{\Half}    {\frac{1}{2}}
\newcommand{\half} {\frac{1}{2}}
%
%\renewcommand{\rmdefault}{ptm}
%\renewcommand{\sldefault}{phv}
%\renewcommand{\ttdefault}{pcr}
%
\begin{document}
\pagestyle{empty}
\begin{center}
{\large \bf MULTILEVEL PARALLEL COMPUTATIONS AND DOMAIN DECOMPOSITION}
\hfill\\
\hfill\\
{\large O. Goyon}
\hfill\\
{Olivier.Goyon@umist.ac.uk}
\hfill\\
\hfill\\
{UMIST\\
        Department of Mechanical Engineering\\
        PO Box 88, Manchester, M60 1QD United Kingdom}
\end{center}
%

A computational framework for solving the 2D unsteady vorticity-velocity
formulation of the Navier-Stokes equations
on parallel computers using domain decomposition and message-passing
is introduced. The numerical scheme adopted is almost explicit in time
owing to cyclic-like reductions, but has less stringent
time-step restrictions than classical explicit methods (Goyon [1]).
We first develop a simple model of the balance between computational work
and communication for our schemes in the parallel context.
This model is then validated by numerical tests.
Finally, particular attention is paid to the speedup and parallel performance
achieved on UMIST's SGI ORIGIN multi-processor machine running PVM
(Beguelin et al.\ [2]),
using a generic 2D transport equation.
Parallelization is performed
in the spirit of
the Single Program Multiple Data (SPMD) model, in which
each processor sends to the others only the
information they need to compute their part of the solution at each
half-time step.
Each processor of the parallel machine
updates a group of rows in the first half-time step and a group of
columns in the second half-time step.
The penalty is the management of the sub-domains associated with the different
processors and of the related interfaces, but communication between processors
is kept to a minimum.
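
As a rough illustration of the kind of work/communication balance meant
here, and not the model developed in this work, one may assume an
$N\times N$ grid split evenly among $p$ processors, a cost $t_c$ per
grid-point update and a cost $t_m$ per exchanged value (the symbols $N$,
$p$, $t_c$, $t_m$ and $V$ are assumptions introduced only for this sketch).
If each processor updates $N^2/p$ points per half-time step and exchanges
$V(N,p)$ values, with $V$ left unspecified, then
\[
T_p \approx \frac{N^2}{p}\,t_c + V(N,p)\,t_m, \qquad
S_p = \frac{T_1}{T_p}, \qquad
E_p = \frac{S_p}{p}
    = \frac{1}{1 + \frac{p\,V(N,p)\,t_m}{N^2\,t_c}},
\]
where $T_1 = N^2 t_c$ is the single-processor time, $S_p$ the speedup and
$E_p$ the efficiency. Under these assumptions the efficiency stays close
to one only while the communication term remains small compared with the
local work.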

Very encouraging first results, obtained with the generic transport equation,
are reported in terms of total efficiency. The extension of the method
to the full set of equations is rather straightforward.\\
\hfill\\
{\bf References}
\hfill\\
$[1]$ O. Goyon,
{\em Int. J. for Num. Meth. in Fluids} {\bf 22}, pp.~937--959 (1996).
\hfill\\
$[2]$ A. Beguelin, J. Dongarra, A. Geist, W. Jiang, R. Manchek and
V. Sunderam,
{ORNL/TM-12187}, Oak Ridge National Laboratory (1993).

\end{document}