GENERAL INFORMATION:

The OCEAN program simulates large-scale ocean movements based on eddy and
boundary currents, and is an enhanced version of the SPLASH Ocean code.
A description of the functionality of this code can be found in the
original SPLASH report. The implementations contained in SPLASH-2
differ from the original SPLASH implementation in the following ways:

(1) The SPLASH-2 implementations are written in C rather than
    FORTRAN.
(2) Grids are partitioned into square-like subgrids rather than
    groups of columns to improve the communication-to-computation
    ratio.
(3) The SOR solver in the SPLASH Ocean code has been replaced with a
    restricted Red-Black Gauss-Seidel Multigrid solver based on that
    presented in:

    Brandt, A. Multi-Level Adaptive Solutions to Boundary-Value Problems.
    Mathematics of Computation, 31(138):333-390, April 1977.

    The solver is restricted so that each processor has at least two
    grid points in each dimension in each grid subpartition.

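
The benefit of the square-like subgrids mentioned in item (2) can be
illustrated with a rough count. The sketch below is not taken from the
SPLASH-2 sources: the function names are hypothetical, and it assumes an
n x n grid, an exchange of one layer of boundary points with each
neighbor, and a partition that has neighbors on all of its sides. Each
function returns the number of boundary points exchanged per point owned.

```c
#include <assert.h>

/* Column strips: each of p processors owns n rows by n/p columns
 * and exchanges one full column with its left and right neighbors. */
double strip_ratio(int n, int p) {
    assert(n > 0 && p > 0 && n % p == 0);
    double owned = (double)n * (n / p);
    double exchanged = 2.0 * n;
    return exchanged / owned;
}

/* Rectangular blocks: px * py processors, each owning an
 * (n/px) x (n/py) block and exchanging one edge with each of
 * its four neighbors (corner points are ignored). */
double block_ratio(int n, int px, int py) {
    assert(n > 0 && px > 0 && py > 0 && n % px == 0 && n % py == 0);
    int w = n / px;
    int h = n / py;
    double owned = (double)w * h;
    double exchanged = 2.0 * (w + h);
    return exchanged / owned;
}
```

With n = 256 and 16 processors, strip_ratio(256, 16) gives 0.125 while
block_ratio(256, 4, 4) gives 0.0625, so under these assumptions the square
partitions exchange half as much data per point computed.
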
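
As a reminder of what a red-black Gauss-Seidel sweep looks like, here is a
minimal sketch for a generic model problem: the five-point discretization
of Laplacian(u) = f on a small square grid with fixed boundary values. It
is not the OCEAN solver. The equation, the grid size N, and the function
names are assumptions chosen for illustration, and the multigrid hierarchy
is omitted entirely. Points of one color depend only on points of the
other color, so all points of a color can be updated independently.

```c
#include <assert.h>

#define N 6   /* points per dimension, including the boundary (illustrative) */

/* Update every interior point of one color in place.
 * color 0 ("red") has (i + j) even, color 1 ("black") has (i + j) odd. */
static void sweep_color(double u[N][N], double f[N][N], double h, int color) {
    for (int i = 1; i < N - 1; i++) {
        /* First interior column j >= 1 with (i + j) % 2 == color. */
        for (int j = 1 + ((i + 1 + color) % 2); j < N - 1; j += 2) {
            u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] +
                              u[i][j - 1] + u[i][j + 1] -
                              h * h * f[i][j]);
        }
    }
}

/* One full iteration: all red points first, then all black points. */
void red_black_iteration(double u[N][N], double f[N][N], double h) {
    assert(h > 0.0);
    sweep_color(u, f, h, 0);
    sweep_color(u, f, h, 1);
}
```

Calling red_black_iteration repeatedly with f set to zero and the boundary
held at a constant value drives the interior toward that same constant.
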

Two implementations are provided in the SPLASH-2 distribution:

(1) Non-contiguous partition allocation

    This implementation (contained in the non_contiguous_partitions
    subdirectory) represents the grids to be operated on as
    two-dimensional arrays. This data structure prevents partitions
    from being allocated contiguously, but leads to a conceptually
    simple programming implementation.

(2) Contiguous partition allocation

    This implementation (contained in the contiguous_partitions
    subdirectory) represents the grids to be operated on as
    three-dimensional arrays. The first dimension specifies the processor
    that owns the partition, and the second and third dimensions
    specify the x and y offsets within a partition. This data structure
    allows partitions to be allocated contiguously and entirely in the
    local memory of the processors that "own" them, thus enhancing data
    locality.

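
The processor-first indexing described in (2) can be sketched as follows.
This is an illustration only, not the layout code from the
contiguous_partitions subdirectory: the PartGrid type and the pg_*
functions are hypothetical, border points are ignored, and the sketch
makes no attempt to place each block in a particular processor's local
memory. It assumes px * py processors arranged row by row, each owning
one contiguous block of bw x bh points.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int px, py;      /* processors across and down the grid */
    int bw, bh;      /* points per partition in x and y */
    double **part;   /* part[proc] is one contiguous block of bw * bh doubles */
} PartGrid;

/* Allocate one zero-filled contiguous block per processor.
 * Returns 0 on success, -1 if memory runs out. */
int pg_init(PartGrid *g, int px, int py, int bw, int bh) {
    assert(px > 0 && py > 0 && bw > 0 && bh > 0);
    g->px = px; g->py = py; g->bw = bw; g->bh = bh;
    g->part = malloc((size_t)px * (size_t)py * sizeof *g->part);
    if (g->part == NULL) return -1;
    for (int p = 0; p < px * py; p++) {
        g->part[p] = calloc((size_t)bw * (size_t)bh, sizeof **g->part);
        if (g->part[p] == NULL) {
            while (p-- > 0) free(g->part[p]);
            free(g->part);
            g->part = NULL;
            return -1;
        }
    }
    return 0;
}

/* Translate global coordinates into (owner, x offset, y offset). */
double *pg_at(PartGrid *g, int gx, int gy) {
    assert(gx >= 0 && gx < g->px * g->bw);
    assert(gy >= 0 && gy < g->py * g->bh);
    int proc = (gy / g->bh) * g->px + (gx / g->bw);
    int lx = gx % g->bw;
    int ly = gy % g->bh;
    return &g->part[proc][lx * g->bh + ly];
}

void pg_free(PartGrid *g) {
    for (int p = 0; p < g->px * g->py; p++) free(g->part[p]);
    free(g->part);
    g->part = NULL;
}
```

With a plain two-dimensional array, by contrast, the rows of a partition
are interleaved in memory with the rows of its horizontal neighbors, which
is why such partitions cannot be allocated contiguously.
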

The contiguous partition allocation implementation is described in:

Woo, S. C., Singh, J. P., and Hennessy, J. L. The Performance Advantages
of Integrating Message Passing in Cache-Coherent Multiprocessors.
Technical Report CSL-TR-93-593, Stanford University, December 1993.

A detailed description of both versions will appear in the SPLASH-2 report.
The non-contiguous partition allocation implementation is conceptually
similar, except for the use of statically allocated two-dimensional arrays.

These programs work under both the Unix FORK and SPROC models.

RUNNING THE PROGRAM:

To see how to run the program, please see the comment at the top of the
file main.C, or run the application with the "-h" command line option.
Five command line parameters can be specified. The ones that would
normally be changed are the number of grid points in each dimension
and the number of processors. The number of grid points in each dimension
must be a power of 2, plus 2 (e.g. 130, 258, etc.). The number of
processors must be a power of 2. Timing information is printed out at
the end of the program. The first timestep is considered part of the
initialization phase of the program, and hence is not included in the
"Total time without initialization."

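
The two size constraints above can be checked with a few lines of C. The
sketch below is not part of the OCEAN sources and the function names are
hypothetical. It treats 1 as a power of 2 (2 to the power 0), which is an
assumption about how the constraint is meant to be read.

```c
#include <assert.h>
#include <stdbool.h>

/* True when v has exactly one bit set. */
static bool is_power_of_two(unsigned v) {
    return v != 0 && (v & (v - 1)) == 0;
}

/* Grid points per dimension must be (a power of 2) + 2, e.g. 130 or 258. */
bool valid_grid_size(unsigned n) {
    return n > 2 && is_power_of_two(n - 2);
}

/* The number of processors must be a power of 2. */
bool valid_processor_count(unsigned p) {
    return is_power_of_two(p);
}

/* For a valid grid size n, return k such that n == 2^k + 2. */
unsigned grid_exponent(unsigned n) {
    assert(valid_grid_size(n));
    unsigned k = 0;
    for (unsigned v = n - 2; v > 1; v >>= 1) k++;
    return k;
}
```

For instance, valid_grid_size(258) holds because 258 = 2^8 + 2, while
valid_grid_size(256) does not.
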

BASE PROBLEM SIZE:

The base problem size for a machine with up to 64 processors is a 258x258
grid. The default values should be used for the other parameters (except
the number of processors, which can be varied). In addition, sample output
for the default parameters for each version of the code is contained in
the file correct.out in each subdirectory.

DATA DISTRIBUTION:

Our "POSSIBLE ENHANCEMENT" comments in the source code indicate where and
how one might want to distribute data. Data distribution has an impact
on performance on the Stanford DASH multiprocessor.