What tools enable not just openness but also reproducibility and replication?

+7 votes
174 views
asked Aug 9, 2015 in Open Science by Rex Kerr (95 points)

Closed science, especially in the biological sciences, has been having trouble with reproducibility (and/or replication, if the distinction is important to you). Usually this is not just because the data were analyzed incorrectly, so open data doesn't help much. And if you're trying to reproduce something that's published, you presumably already have the methods etc., regardless of whether the paper was open access or not.

When labs get different results, the discrepancies are often resolved once they agree to visit each other and look in detail at protocols and so on.

The difference is in how much information is available when visiting in person compared to the distilled version in the methods paper. With a sufficiently open approach, this information ought to be available to everyone. But writing methods sections is already hassle enough--making much more detail available isn't likely to become widespread unless there are tools that make it easy (or easier to do than not!).

What kind of tools are there for making public not just data and results but the protocols and process that go into generating the data? Are there electronic lab notebooks that have features that enable easy and ongoing exporting of results in some comprehensible fashion? Have any groups who are trying to work openly published or described a set of best practices for making available the gory details of how their research is being conducted?



This post has been migrated from the Open Science private beta at StackExchange (A51.SE)
commented Aug 18, 2015 by Alexander Konovalov (135 points)
If you think that this thread should be migrated to Academia or another SE site because the OpenScience beta is closing, please edit the list of questions shortlisted for the migration [here](http://meta.openscience.stackexchange.com/questions/73/).

3 Answers

+6 votes
answered Aug 9, 2015 by Thomas (915 points)
 
Best answer

I won't get into the semantics here but you're probably talking about "replication" (others recreating results of a study anew) rather than "reproducibility" (others following along with your original data and code). Your question already highlights two solutions and I can think of at least one other:

  1. Open lab notebooks. Here you would publish everything about your research in a potentially very rough form that discloses everything you did from the beginning to the end of the research. Lab notebooks are common in some (sub)disciplines and not others, so this either involves sharing something you already write or writing something new with the intention to share. Carl Boettiger's notebook is probably the best example of how to achieve this.
  2. Shared protocols. Depending on your field, there are some resources for sharing wet lab protocols (such as SpringerProtocols or Nature's Protocol Exchange). These are actual pseudo-publications that fully document a research procedure (independent of its particular output). There are some analogous publication forms in fields like psychometrics where journals might publish an article that simply describes a psychological measurement technique and scaling procedure, but protocol sharing is generally rare in the social sciences.

  3. A third option that you don't mention is preregistration. Here you would document your full protocol and analysis procedure in advance of conducting research. Then, the research process is simply a matter of following the recipe. Anyone else who wanted to replicate your work could simply follow your preregistration plan.



commented Aug 18, 2015 by Carlisle Rainey (70 points)
Some examples of publicly version controlled projects on GitHub include those from [Zach Jones](https://github.com/zmjones/eeesr), [Brenton Kenkel](https://github.com/brentonk/crpn), and [myself](https://github.com/carlislerainey/priors-for-separation).

commented Aug 18, 2015 by Carlisle Rainey (70 points)
I think of public version control (e.g., via Git/GitHub) as a compromise between more time-consuming open notebook practices (e.g., those of [Carl Boettiger](https://github.com/cboettig/labnotebook)) and a final-code-only approach. Zach Jones provides a wonderful justification in this [essay](http://zmjones.com/static/papers/git.pdf). I have some thoughts in this [crude guide](https://github.com/carlislerainey/git-for-political-science/blob/master/git.md).

commented Aug 18, 2015 by Rex Kerr (95 points)
I'm hoping that it would be more practical to blur the line between the two. And I appreciate the third possibility, but I was hoping for more details on _how_ to accomplish (1) or (2) (or (3)) in a feasible way. For instance, how is opening your lab notebook not too much work to bother with, especially in an ongoing study?

commented Aug 18, 2015 by Thomas (915 points)
@RexKerr Well, some would hope that (3) becomes a requirement for publication, so the effort part - in that scenario - would be kind of irrelevant. Like many aspects of open science, opening a notebook or protocol is often best seen as a service to yourself first and others second. By being transparent and detailed, you have a complete record of everything you've done in the event that you need it later (to answer someone's question, repeat some aspect of a study, or just try to understand your own work after it's been a while).

+4 votes
answered Aug 10, 2015 by Neil Chue Hong (155 points)

Literate programming approaches, like electronic lab notebooks, are another way to aid reproducibility: they combine narrative with the analysis (or even simulation) code.
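To make the idea concrete, here is a minimal sketch in plain Python, not tied to any particular literate programming system. The measurements and the `build_report` helper are invented for illustration; the point is only that the numbers in the narrative are computed from the data rather than typed by hand.

```python
"""A tiny literate-style analysis: narrative and code live in one file."""
import statistics

# Hypothetical measurements, invented for this illustration only.
measurements = [4.1, 3.9, 4.3, 4.0, 4.2]


def build_report(values):
    """Return a narrative report whose numbers are computed from the data."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    lines = [
        "Analysis notes",
        f"We recorded {len(values)} measurements.",
        f"The mean was {mean:.2f} and the sample standard deviation was {sd:.2f}.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_report(measurements))
```

If the data change, rerunning the script regenerates the text, so the write-up cannot silently drift away from the analysis.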

Two systems that we've used at the Software Sustainability Institute are:



+1 vote
answered Aug 10, 2015 by Michela Vignoli (10 points)

More lab notebook tools:

More useful tools are listed in this huge Open Science tools list.



commented Aug 18, 2015 by Rex Kerr (95 points)
Although this is valuable information (thank you!), this is also not a complete answer to the question. Perhaps you will consider fleshing it out a bit more?

commented Aug 18, 2015 by Michela Vignoli (10 points)
Unfortunately I don't know the functionality of the tools myself as I am not working in a lab - I'm mainly doing desk research :) But I agree completely that the availability of tools which make it somewhat easier to publish data & protocols etc. openly is the key to involving more researchers in doing so. You need to try out the tools yourself to see whether they satisfy this requirement or not...
