“Gold standard” for CNV detection methods?

Is there a generally accepted "gold standard" for testing the performance of CNV detection methods? I'm interested both in learning about existing datasets that may serve as gold standards for CNV detection and in methods for producing such datasets in the first place.

(In case it matters, I'm primarily interested in the evaluation of CNV detection methods in laboratory mice.)