Convolutional neural networks behave unpredictably when test images differ from the training images, for example when different sequences or acquisition parameters are used. We trained models to generate synthetic CT images and tested them on both in-distribution and out-of-distribution input sequences to determine the magnitude of the performance loss. Additionally, we evaluated whether uncertainty estimates obtained with dropout-based variational inference could detect spatial regions of failure. Networks tested on out-of-distribution images failed to generate accurate synthetic CT images. Uncertainty estimates identified spatial regions of failure and increased with the difference between the training and testing sets.
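
As a rough illustration of dropout-based uncertainty estimation, the sketch below keeps dropout active at test time, repeats the forward pass many times, and takes the per-pixel standard deviation across passes as the uncertainty map. It is a toy under stated assumptions, not the networks used in this work: the per-pixel linear "model", the function names `dropout_forward` and `mc_dropout`, the weights, the dropout probability, and the number of samples are all hypothetical and chosen only to show the sampling mechanics.

```python
import random
import statistics


def dropout_forward(pixels, weights, drop_prob, rng):
    """One stochastic pass of a toy per-pixel linear model (hypothetical).

    Each weight is dropped independently with probability drop_prob, and the
    surviving terms are rescaled by 1 / (1 - drop_prob) (inverted dropout).
    """
    keep = 1.0 - drop_prob
    outputs = []
    for x in pixels:
        total = 0.0
        for w in weights:
            if rng.random() < keep:  # this unit survives the dropout mask
                total += (w * x) / keep
        outputs.append(total)
    return outputs


def mc_dropout(pixels, weights, drop_prob=0.5, n_samples=200, seed=0):
    """Return per-pixel mean prediction and standard deviation over passes."""
    rng = random.Random(seed)
    samples = [
        dropout_forward(pixels, weights, drop_prob, rng)
        for _ in range(n_samples)
    ]
    per_pixel = list(zip(*samples))  # regroup: one tuple of samples per pixel
    mean = [statistics.mean(p) for p in per_pixel]
    std = [statistics.pstdev(p) for p in per_pixel]  # the uncertainty map
    return mean, std


if __name__ == "__main__":
    # Three illustrative "pixel" intensities and three illustrative weights.
    mean, std = mc_dropout([0.0, 1.0, 5.0], [0.2, 0.3, 0.5])
    for m, s in zip(mean, std):
        print(f"prediction={m:.3f}  uncertainty={s:.3f}")
```

In this toy the spread across passes simply scales with input magnitude; it shows how a spatial uncertainty map is computed, and makes no claim about how a trained network responds to out-of-distribution inputs.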