Modern approaches to iterative imaging, such as model-based reconstruction, require efficient implementations of the Non-Uniform Fast Fourier Transform (NUFFT) to reach feasible reconstruction times. In addition, low-rank subspace projection is often used to reduce the computational burden. While many NUFFT implementations currently exist, they are not optimized for this kind of problem. Here, we propose a fast and memory-efficient NUFFT operator with embedded low-rank subspace projection. We demonstrate an order-of-magnitude speed-up, with comparable image quality, relative to other high-level implementations.
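
To illustrate why folding the subspace projection into the transform can save memory, the following is a minimal sketch and not the operator proposed here. It assumes a 1D problem, a dense non-uniform DFT matrix in place of a fast NUFFT, and an orthonormal temporal basis of rank K; all function names (`nudft_matrix`, `adjoint_then_project`, `fused_adjoint`) and sizes are hypothetical. It compares applying the adjoint to every frame and projecting afterwards with accumulating the K coefficient images directly.

```python
import numpy as np

def nudft_matrix(coords, n):
    """Dense 1D non-uniform DFT matrix (slow reference, not a fast NUFFT)."""
    grid = np.arange(n) - n // 2
    return np.exp(-2j * np.pi * np.outer(coords, grid))

def adjoint_then_project(data, coords, basis, n):
    """Adjoint per frame, then project the T frames onto K coefficients."""
    frames = np.stack([nudft_matrix(c, n).conj().T @ d
                       for c, d in zip(coords, data)])  # (T, n)
    return basis.conj().T @ frames                      # (K, n)

def fused_adjoint(data, coords, basis, n):
    """Project while accumulating, so the T full frames are never stored."""
    out = np.zeros((basis.shape[1], n), dtype=complex)
    for t, (c, d) in enumerate(zip(coords, data)):
        frame = nudft_matrix(c, n).conj().T @ d         # (n,)
        out += np.outer(basis[t].conj(), frame)         # (K, n)
    return out

# Hypothetical sizes: T frames, rank K, M samples per frame, N grid points.
rng = np.random.default_rng(0)
T, K, M, N = 8, 2, 16, 32
coords = rng.uniform(-0.5, 0.5, size=(T, M))
data = rng.standard_normal((T, M)) + 1j * rng.standard_normal((T, M))
basis, _ = np.linalg.qr(rng.standard_normal((T, K)))    # orthonormal (T, K)

a = adjoint_then_project(data, coords, basis, N)
b = fused_adjoint(data, coords, basis, N)
```

Both routes return the same K coefficient images because the projection is linear. The fused version holds only K images at a time rather than T, which is the kind of saving an embedded projection is meant to provide when K is much smaller than T.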