POOMA: A C++ Toolkit for High-Performance Parallel Scientific Computing | ||
---|---|---|
Prev | Chapter 3. A Tutorial Introduction | Next |
Many scientific computations are localized, computing an array's value by using neighboring values. Encapsulating this local computation in a stencil can yield faster code because the compiler can determine that all array accesses use the same array. Each stencil consists of a function object and an indication of which neighbors participate in the function's computation.
Example 3-4. Stencil Array Implementation of Doof2d
#include <iostream> // has std::cout, ... #include <stdlib.h> // has EXIT_SUCCESS #include "Pooma/Arrays.h" // has POOMA's Array declarations // Doof2d: POOMA Arrays, stencil implementation // Define a stencil class performing computation.class DoofNinePt { public: // Initialize the constant average weighting. DoofNinePt() : weight(1.0/9.0) {} // This stencil operator is applied to each // interior domain position (i,j). The "C" // template parameter permits use of this // stencil operator with both Arrays & Fields.
template <class C> inline typename C::Element_t operator()(const C& c, int i, int j) const { return weight * (c.read(i+1,j+1)+c.read(i+1,j)+c.read(i+1,j-1)+ c.read(i ,j+1)+c.read(i ,j)+c.read(i ,j-1)+ c.read(i-1,j+1)+c.read(i-1,j)+c.read(i-1,j-1)); } inline int lowerExtent(int) const { return 1; }
inline int upperExtent(int) const { return 1; } private: // In the average, weight elements with this value. const double weight; }; int main(int argc, char *argv[]) { // Prepare the POOMA library for execution. Pooma::initialize(argc,argv); // Ask the user for the number of averagings. long nuAveragings, nuIterations; std::cout < < "Please enter the number of averagings: "; std::cin > > nuAveragings; nuIterations = (nuAveragings+1)/2; // Each iteration performs two averagings. // Ask the user for the number n of values along one // dimension of the grid. long n; std::cout < < "Please enter the array size: "; std::cin > > n; // Specify the arrays' domains [0,n) x [0,n). Interval<1> N(0, n-1); Interval<2> vertDomain(N, N); // Set up interior domains [1,n-1) x [1,n-1) for // computation. Interval<1> I(1,n-2); Interval<2> interiorDomain(I,I); // Create the arrays. // The Array template parameters indicate // 2 dimensions, a 'double' value // type, and ordinary 'Brick' storage. Array<2, double, Brick> a(vertDomain); Array<2, double, Brick> b(vertDomain); // Set up the initial conditions. // All grid values should be zero except for the // central value. a = b = 0.0; // Ensure all data-parallel computation finishes // before accessing a value. Pooma::blockAndEvaluate(); b(n/2,n/2) = 1000.0; // Create a stencil performing the computation.
Stencil<DoofNinePt> stencil; // Perform the simulation. for (int k = 0; k < nuIterations; ++k) { // Read from b. Write to a.
a(interiorDomain) = stencil(b, interiorDomain); // Read from a. Write to b. b(interiorDomain) = stencil(a, interiorDomain); } // Print out the final central value. Pooma::blockAndEvaluate(); // Ensure all computation has finished. std::cout < < (nuAveragings % 2 ? a(n/2,n/2) : b(n/2,n/2)) < < std::endl; // The arrays are automatically deallocated. // Tell the POOMA library execution has finished. Pooma::finalize(); return EXIT_SUCCESS; }
Before we describe how to create a stencil, we describe how to apply a stencil to an array, yielding computed values. To compute the value associated with index position (1,3), the stencil's center is placed at (1,3). The stencil's upperExtent and lowerExtent functions indicate which Array values the stencil's function will use. See Figure 3-3. Applying the stencil's function call operator() yields the computed value. To compute multiple Array values, apply a stencil to the array and a domain object: stencil(b, interiorDomain). This applies the stencil to each position in the domain. The user must ensure that applying the stencil does not access nonexistent Array values.
Figure 3-3. Applying a Stencil to an Array
To compute the value associated with index position (1,3) of an array, place the stencil's center, indicated with dashed lines, at the position (1,3). The computation involves the array values covered by the array and delineated by upperExtent and lowerExtent.
To create a stencil object, apply the Stencil type to a function object class. For example, Stencil<DoofNinePt> stencil declares the stencil object. The function object class must define a function call operator() with a container parameter and index parameters. The number of index parameters, indicating the stencil's center, must equal the container's dimension. For example, DoofNinePt defines operator()(const C& c, int i, int j). We templated the container type C although this is not strictly necessary. The two index parameters i and j ensure the stencil works with two-dimensional containers. The lowerExtent function indicates how far to the left (or below) the stencil extends beyond its center. Its parameter indicates a particular dimension. Index parameters i and j are in dimension 0 and 1. upperExtent serves an analogous purpose. The POOMA Toolkit uses these functions when distributing computation among various processors, but it does not use these functions to ensure nonexistent Array values are not accessed. Caveat stencil user!