The performance of supervised machine learning tools is only as good as the data used to train them. In this work, we investigate how the distribution of training data affects tissue microstructure estimates in the human brain. We focus on two strategies: sampling uniformly from the entire parameter space, and sampling from parameter combinations previously observed with traditional model fitting. We demonstrate that training on previously observed combinations may be advantageous for detecting small variations in healthy tissue. However, for detecting atypical tissue abnormalities, our results favour uniform sampling of the training data, in which all plausible parameter combinations are represented.
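As a rough illustration of the two sampling strategies, the sketch below draws training parameter sets either uniformly within fixed bounds or by resampling a pool of previously fitted combinations. It is a minimal sketch under stated assumptions, not the procedure used in this work: the parameter names (`fraction`, `diffusivity`), their bounds, and the synthetic `fitted` pool standing in for traditional model-fitting output are all hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical (low, high) bounds for two illustrative parameters.
BOUNDS = {"fraction": (0.0, 1.0), "diffusivity": (0.0, 3.0)}


def sample_uniform(n, bounds, rng):
    """Draw n parameter sets uniformly from the box defined by bounds."""
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        for _ in range(n)
    ]


def sample_observed(n, fitted, rng):
    """Draw n parameter sets by resampling previously fitted combinations."""
    return [dict(rng.choice(fitted)) for _ in range(n)]


def clipped_gauss(mu, sigma, lo, hi, rng):
    """Gaussian draw clipped to [lo, hi]; used only to fake fitted values."""
    return min(max(rng.gauss(mu, sigma), lo), hi)


# Synthetic stand-in for estimates from traditional model fitting
# (assumed values, not real data): a narrow cluster inside the bounds.
fitted = [
    {
        "fraction": clipped_gauss(0.5, 0.05, *BOUNDS["fraction"], rng),
        "diffusivity": clipped_gauss(1.0, 0.1, *BOUNDS["diffusivity"], rng),
    }
    for _ in range(500)
]

train_uniform = sample_uniform(1000, BOUNDS, rng)
train_observed = sample_observed(1000, fitted, rng)
```

In this toy setting, `train_uniform` covers the whole box, including combinations that never occur in `fitted`, whereas `train_observed` concentrates on the combinations that were actually seen.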