Deep unrolled models have recently shown state-of-the-art performance for reconstruction of dynamic MR images. However, training these networks via backpropagation is limited by the intensive memory and compute required to calculate end-to-end gradients and store intermediate activations at every layer. Motivated by these challenges, we propose an alternative training method that greedily relaxes the training objective. Our approach splits the end-to-end network into decoupled network modules and optimizes each module separately, avoiding the need to compute end-to-end gradients. We demonstrate that our method outperforms end-to-end backpropagation by 3.3 dB in PSNR and 0.025 in SSIM at the same memory footprint.
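The decoupled scheme described above can be illustrated with a minimal sketch (not the paper's implementation; the linear modules, dimensions, and learning rate below are hypothetical stand-ins for the unrolled network stages). Each module minimizes its own reconstruction loss against the target, and its input is treated as a constant, so no gradients or activations ever propagate across module boundaries:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: recover x_true from a noisy measurement y.
x_true = rng.normal(size=16)
y = x_true + 0.5 * rng.normal(size=16)

# Two "unrolled" modules, each a simple linear map (hypothetical stand-ins).
W1 = np.eye(16) + 0.01 * rng.normal(size=(16, 16))
W2 = np.eye(16) + 0.01 * rng.normal(size=(16, 16))

lr = 0.01
for _ in range(200):
    # Module 1: optimized against the target with only its own local loss.
    h1 = W1 @ y
    grad_W1 = np.outer(2 * (h1 - x_true), y)  # d/dW1 ||W1 y - x_true||^2
    W1 -= lr * grad_W1

    # Module 2: its input is "detached" (treated as a constant), so no
    # gradient flows back through module 1 and no end-to-end graph is stored.
    h1_detached = W1 @ y
    h2 = W2 @ h1_detached
    grad_W2 = np.outer(2 * (h2 - x_true), h1_detached)
    W2 -= lr * grad_W2

out = W2 @ (W1 @ y)
err_in = np.linalg.norm(y - x_true)
err_out = np.linalg.norm(out - x_true)
print(err_in, err_out)
```

In a framework such as PyTorch, the same decoupling is typically achieved by detaching each module's output before feeding it to the next module, so the peak memory cost is that of a single module rather than the full unrolled depth.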