Katherine Wood



Raw data are often unavailable, and all that may remain of a data set are its summary statistics. When the data are integers on a fixed scale, such as Likert-style ratings, and the mean, standard deviation, and sample size are known, every raw distribution that gives rise to those summary statistics can be reconstructed by solving a system of Diophantine equations. We have developed the program CORVIDS (COmplete Reconstruction of Values In Diophantine Systems) to deterministically reconstruct raw data from summary statistics using this technique. The set of solutions generated by the program is provably complete. Here we describe the implementation, provide examples and use cases, and prove the correctness of the underlying mathematics. CORVIDS is open-source and available as source code or as stand-alone, user-friendly applications for macOS and Windows.
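
To make the idea concrete, the sketch below shows the constraints in miniature. It is not the CORVIDS algorithm: it is a brute-force enumeration, practical only for small samples, and it assumes the mean and sample standard deviation are given exactly rather than rounded. The function names `summary_targets` and `reconstruct` are hypothetical and chosen for illustration. The mean and sample size fix the integer sum of the values; the standard deviation then fixes the integer sum of squares; every multiset of scale values that matches both is a solution.

```python
from itertools import combinations_with_replacement
import statistics


def summary_targets(mean, sd, n):
    """Convert mean and sample SD into the integer sum and sum of squares."""
    total = round(mean * n)
    # Sample variance: sd^2 = (sumsq - total^2 / n) / (n - 1)
    sumsq = round(sd ** 2 * (n - 1) + total ** 2 / n)
    return total, sumsq


def reconstruct(mean, sd, n, lo, hi):
    """Return every sorted tuple of n integers in [lo, hi] matching the stats.

    Brute force for illustration only; assumes unrounded mean and SD.
    """
    total, sumsq = summary_targets(mean, sd, n)
    return [
        values
        for values in combinations_with_replacement(range(lo, hi + 1), n)
        if sum(values) == total and sum(v * v for v in values) == sumsq
    ]


# Example: five ratings on a 1-5 scale.
data = [1, 2, 2, 3, 5]
solutions = reconstruct(
    statistics.mean(data), statistics.stdev(data), len(data), 1, 5
)
print(solutions)
```

For this example, two distributions share the same mean, standard deviation, and sample size: the original ratings and (1, 1, 3, 4, 4). This illustrates why a complete enumeration of solutions matters, since a single reconstruction need not be the true data.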

The paper is available here.