POOMA: A C++ Toolkit for High-Performance Parallel Scientific Computing
Chapter 2. Programming with Templates
POOMA uses C++ templates to support type polymorphism without incurring any run-time cost. All template operations are performed by the compiler at compile time.
Prior to the introduction of templates, almost all of a program's interesting computation occurred when it was executed. At programming time, the programmer would specify which statements and expressions would occur and which types to use. At compile time, the compiler would convert the program's source code into an executable program. Even though the compiler used the types to produce the executable, no interesting computation would occur. At run time, the resulting executable program would actually perform the operations.
The introduction of templates permits interesting computation to occur while the compiler produces the executable. Most interesting is template instantiation, which produces a type at compile time. For example, the Array "type" definition requires template parameters Dim, T, and EngineTag, specifying its dimension, the type of its values, and its Engine type. To use this, a programmer specifies values for the template parameters: Array<2,double,Brick> specifies a dimension of 2, a value type of double, and the Brick Engine type. At compile time, the compiler creates a type definition by substituting the values for the template parameters in the templatized type definition. The substitution is analogous to the run-time application of a function to specific values.
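The substitution described above can be sketched with a simplified stand-in. This is only an illustration, assuming C++11 or later: `ToyArray`, `BrickTag`, `Element_t`, `Engine_t`, and `Matrix` are hypothetical names invented here, and the class is not POOMA's actual Array definition.

```cpp
#include <type_traits>

// Hypothetical engine tag; it stands in for an Engine type such as Brick.
struct BrickTag {};

// Hypothetical, simplified stand-in for a container template with the three
// parameters named in the text. This is NOT POOMA's Array definition.
template <int Dim, class T, class EngineTag>
struct ToyArray {
    static const int dimensions = Dim;  // dimension, fixed at compile time
    typedef T Element_t;                // type of the stored values
    typedef EngineTag Engine_t;         // tag selecting the engine
};

// Instantiation: the compiler substitutes 2, double, and BrickTag for the
// template parameters and produces a concrete type.
typedef ToyArray<2, double, BrickTag> Matrix;

// Both facts below are checked by the compiler; no code runs to verify them.
static_assert(Matrix::dimensions == 2,
              "dimension is known at compile time");
static_assert(std::is_same<Matrix::Element_t, double>::value,
              "value type is double");
```

Each distinct set of template arguments yields a distinct type, so `ToyArray<2, double, BrickTag>` and `ToyArray<3, double, BrickTag>` are unrelated types as far as the compiler is concerned.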
Any computation not involving run-time input or output can occur at programming time, compile time, or run time, whichever is most convenient. At programming time, a programmer can perform computations by hand rather than writing code to perform them. C++ templates are Turing-complete, so they can compute anything computable. Unfortunately, the syntax for compile-time computation is more difficult than for run-time computation. Also, current compilers evaluate template computations less efficiently than hardware executes compiled code. Run-time C++ constructs are also Turing-complete, so using templates is never strictly necessary. Thus, we can shift each computation to the time that best trades off ease of expression against the speed of computation by programmer, compiler, or computer chip. For example, POOMA uses expression-template technology to speed run-time execution of data-parallel statements. The POOMA developers decided to shift some of the computation from run time to compile time using template computations. The resulting code runs more quickly, but compiling it takes longer. Programming time for the POOMA developers also increased significantly, but most users, who are usually most concerned about decreasing run times, benefited.
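The trade-off between compile-time and run-time computation can be illustrated with a small example that is unrelated to POOMA's own code. The sketch below, assuming C++11 or later, computes a factorial twice: once through template instantiation and once with an ordinary function. The names `Factorial` and `factorial` are chosen here for illustration.

```cpp
// Compile-time version: the compiler evaluates the recursion while
// instantiating Factorial<N>, Factorial<N - 1>, ..., Factorial<0>.
template <unsigned N>
struct Factorial {
    static const unsigned long value = N * Factorial<N - 1>::value;
};

// Explicit specialization that ends the recursion.
template <>
struct Factorial<0> {
    static const unsigned long value = 1;
};

// Run-time version: simpler syntax, but the loop runs each time the
// function is called.
unsigned long factorial(unsigned n) {
    unsigned long result = 1;
    for (unsigned i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}

// The template result is a compile-time constant, so the compiler itself
// can check it.
static_assert(Factorial<5>::value == 120, "5! computed by the compiler");
```

The template version needs a recursive definition plus a specialization for the base case, while the run-time version is a plain loop. In exchange, `Factorial<5>::value` costs nothing when the program executes.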