Monday, March 06, 2006

Replication studies and Captain Ed's blogswarm 

Ed Morrissey and I were getting ready for the NARN broadcast Saturday and I read his post on the Gitmo detainees. In it he refers to a study by two lawyers, one a professor of law at Seton Hall, which analyzes the cases against 517 detainees (which Ed thinks is the vast majority of those still there; I'll take his word for that). The general conclusion of the study is that over half of the detainees are not found to have committed hostile acts, suggesting that they are improperly detained.

Since it was fresh Saturday news we took it on the air, and as is my wont I was talking and reading the study at the same time. When Ed said the data they used were now posted at the DoD site, I suggested that the best way to determine whether the study is right is to replicate it. Ed has since called for a blogswarm to do just that, a project you can still volunteer for.

Professional journals like these (forgive the parochial nature of my selections -- these are the journals I know) require replicable results, and this has been a "best practice" in my view for many years. Gary King suggests a model policy:
Authors submitting quantitative papers to this journal for review must address the issue of data availability and replication in their first footnote. Authors are ordinarily expected to indicate in this footnote in which public archive they will deposit the information necessary to replicate their numerical results, and the date when it will be submitted. The information deposited should include items such as original data, specialized computer programs, lists of computer program recodes, extracts of existing data files, and an explanatory file that describes what is included and explains how to reproduce the exact numerical results in the published work.

Obviously the paper in question hasn't been published yet, and I don't expect papers in progress to post their data, since they still need to get the benefit of publication first. However, in this case we have public data, and the paper has some notes that would allow one to get a grip on how they should be coding the data.

In essence, Ed's blogswarm is a call for refereeing the paper by the blogosphere, which is an interesting experiment. Ed's provision of coding forms and making them public is helpful, and what should happen next is a response from the Denbeauxs to verify or refute the codings.
