Abstract #2630

nuFFTW: A Parallel Auto-Tuning Library for Performance Optimization of the NuFFT

Mark Murphy¹, Michal Zarrouk², Kurt Keutzer², Michael Lustig²

¹Google, Mountain View, CA, United States; ²EECS, UC Berkeley, Berkeley, CA, United States


We present a fast, auto-tuned, gridding-based non-uniform FFT (nuFFT) library with parallel implementations on CPUs and GPUs for reconstructing images from non-Cartesian data. The effect of the nuFFT implementation and its parameter selection on the resulting runtime is non-trivial. Our auto-tuning approach empirically selects an optimal implementation per trajectory by searching over algorithms and parameters, and saves it for future reconstructions (e.g., parallel imaging). We show that the optimal implementation also depends on the target platform and on the sampling pattern itself. We also present a heuristic for near-optimal selection when exhaustive search is prohibitively expensive.
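
The empirical search described above can be illustrated with a minimal Python sketch. It is not the nuFFTW API: the names `autotune`, `trajectory_key`, `toy_gridding`, the parameters `kernel_width` and `oversampling`, and the JSON cache file are all assumptions made for illustration. The sketch times every candidate parameter set for a given trajectory, keeps the fastest, and stores it under a hash of the trajectory so that later reconstructions with the same trajectory skip the search.

```python
import hashlib
import itertools
import json
import time
from pathlib import Path


def trajectory_key(samples):
    """Hash the sample coordinates so a tuned plan can be looked up per trajectory."""
    payload = json.dumps([round(float(x), 9) for x in samples]).encode()
    return hashlib.sha256(payload).hexdigest()


def time_once(run, samples, params):
    """Measure the wall-clock time of one run with the given parameters."""
    start = time.perf_counter()
    run(samples, **params)
    return time.perf_counter() - start


def autotune(run, samples, search_space, cache_path, measure=time_once):
    """Return the cheapest parameter set for `samples`; search only on a cache miss."""
    cache_path = Path(cache_path)
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    key = trajectory_key(samples)
    if key in cache:
        return cache[key]  # plan saved by an earlier search

    names = sorted(search_space)
    best_params, best_cost = None, float("inf")
    # Exhaustive search over the Cartesian product of candidate values.
    for values in itertools.product(*(search_space[n] for n in names)):
        params = dict(zip(names, values))
        cost = measure(run, samples, params)
        if cost < best_cost:
            best_params, best_cost = params, cost

    cache[key] = best_params
    cache_path.write_text(json.dumps(cache))
    return best_params


def toy_gridding(samples, kernel_width, oversampling):
    """Hypothetical stand-in workload: spread 1-D samples in [0, 1) onto a grid."""
    grid = [0.0] * int(len(samples) * oversampling)
    for x in samples:
        center = int(x * (len(grid) - 1))
        for offset in range(-kernel_width, kernel_width + 1):
            grid[(center + offset) % len(grid)] += 1.0
    return grid
```

Calling `autotune(toy_gridding, samples, {"kernel_width": [2, 3, 4], "oversampling": [1.25, 1.5, 2.0]}, "plans.json")` would run the search once and reuse the stored result afterwards. A real tuner would presumably also key the cache on the target platform, since the abstract reports that the optimum depends on it, and would replace the exhaustive loop with a heuristic when the search space is too large.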