FFT Selection at Runtime
The new generic FFT implementation (GFFT) depends on a constant parameter P, specified before compilation. What should you do, if length of the data will be known at runtime at first? Write a big switch? No. It would not really solve the problem. To overcome compile-runtime parameter definition, I apply object factory, as described in Alexandrescu's Modern C++ Design
[1], Chapter 8. The source code of the template class Factory
in header Factory.h is part of the Loki library supplementing the book as well as added to the full source code of this article.
The idea is quite simple: All the template classes with varying template parameters are inherited from a single base class, such as AbstractFFT
in Listing Seven, where necessary member functions are declared as virtual. The base class is passed to GFFT as an additional template parameter FactoryPolicy
. It allows you also to substitute an empty base class EmptyFFT
without virtual function declaration and thus avoid the virtual function call penalty. This is the default case, if GFFT is used without object factory. GFFT receives also a unique identification number id=P and a static object generation function Create()
to conform to the object factory requirements.
Listing Seven
template<typename T> class AbstractFFT { public: virtual void fft(T*) = 0; }; class EmptyFFT { }; template<unsigned P, typename T=double, class FactoryPolicy=EmptyFFT> class GFFT:public FactoryPolicy { // ... public: enum { id = P }; static FactoryPolicy* Create() { return new GFFT<P,T,FactoryPolicy>(); } // ... };
Each GFFT template class that should be available at runtime is registered in the object factory by its identification number using the FactoryInit
template metaprogram from Listing Eight. The initializer receives template classes in the form of a Typelist
([1], Chapter 3). Template metaprogram GFFTList
(Listing Eight) constructs such a Typelist
, receiving an FFT template class, e.g. GFFT, as well as the first and the last value of P. Thus, the creation of the FFT object factory in a program looks like this:
Loki::Factory<AbstractFFT<double>,unsigned int> gfft_factory;
The new factory bears information about the base class, but is still empty. Metaprograms GFFTList
and FactoryInit
help to add the necessary GFFT template classes into the factory writing a single line:
FactoryInit<GFFTList<GFFT,10,27> ::Result>::apply(gfft_factory);
The factory contains now all the GFFT implementations from P=10 to P=27. To create a needed object instance at runtime on demand and to run the FFT, type:
AbstractFFT<double>* gfft = gfft_factory.CreateObject(P); gfft->fft(data);
Listing Eight
template<class TList> struct FactoryInit; template<class H, class T> struct FactoryInit<Loki::Typelist<H,T> > { template<class Fact> static void apply(Fact& f) { f.Register(H::id,H::Create); FactoryInit<T>::apply(f); } }; template<> struct FactoryInit<Loki::NullType> { template<class Fact> static void apply(Fact&) { } }; template< template<unsigned,class,class> class FFT, unsigned Begin, unsigned End, typename T=double, class FactoryPolicy=AbstractFFT<T> > struct GFFTList { typedef Loki::Typelist<FFT<Begin,T,FactoryPolicy>, typename GFFTList<FFT,Begin+1,End,T, FactoryPolicy>::Result> Result; }; template< template<unsigned,class,class> class FFT, unsigned End, typename T, class FactoryPolicy> struct GFFTList<FFT,End,End,T,FactoryPolicy> { typedef Loki::NullType Result; };
As a result, the GFFT obtains a kind of policy-driven design, which makes the implementation very flexible concerning base class. The same procedure can be applied to other independent FFT components, for example, to the element reindexing (function scramble) and the Danielson-Lanczos section.
The element reindexing could also be implemented using recursive template classes. But to preserve only one template class definition per recursion level, the handled indexes (original and bit-reverse) must be non-constant and passed as parameters to some member function. Tests of such implementation showed the same performance and there is no reason to load compiler with another template recursion and to waste the compile-time.