It would be powerful to see why the differences occur
It would be quite powerful if you could not only run comparisons quietly alongside production, but also see why the differences occurred. It would also be nice to have a way to collect and curate failure cases afterward and replay them yourself, plus a sidekick-style view that summarizes the results side by side.