Open Science

Open science is a broad term for various efforts to make both the process and products of scientific research accessible to society at large. This encompasses both "open access" -- lowering the economic barriers to accessing scientific publications and results -- and "open research" -- exposing the process of research to view, not just its traditional products. In the latter category, "open notebook science" aims to place notes, calculations, protocols, and evaluations of interim results into public view, allowing scientists and the community at large to evaluate not just conclusions, but every step of the process. "Reproducible research" covers efforts to bundle publication of results with the raw data, software, algorithms, and calculations needed to reconstruct the published results.

My commitment to open science began with open access, and attempts to ensure that my written output was -- to the extent possible -- available online in a freely downloadable format. This is always a work in progress, because older publications are often unavailable owing to paywalls or commercial licenses imposed by academic publishers. To the extent possible, I will always make versions of publications available online, and I will attempt to choose journals with permissive preprint/postprint policies. I am slowly attempting to reconstruct PDF versions of older conference papers, many of which I have only as printed copies that will need to be scanned.

But by far the more important aspects of open science are an open process and reproducible results. To that end, I have been exploring the use of wikis and blogs to record interim thinking on research topics, and this is the second iteration of an online "lab notebook" that goes beyond occasional blog posting. My first digital lab notebook was a local installation of the Instiki wiki, synchronized with Dropbox. This was useful for doing my own work wherever I happened to be, but was not truly "open" in the sense of public access. I have been migrating some of those reading notes and topical notebooks to this current iteration, and that process is ongoing. My first "online open notebook" was hosted by Wikispaces, but I found the lack of offline access difficult, given travel and limited internet access where I live and work.

This current iteration began when I stumbled onto Jekyll and GitHub Pages, and then learned of Carl Boettiger's sophisticated efforts at open notebook science using these components. My own notebook and reproducible notes are not nearly as advanced as Carl's workflow, but he continues to provide the paradigm toward which I believe many of us are striving.

The community seems to be developing a taxonomy of "open notebook science" efforts, which allows readers to understand what they can expect from an online lab notebook. ONSclaims has two dimensions to its claims classification: completeness and immediacy. "All Content Immediate" indicates a lab notebook, for example, in which the scientist has the entirety of their notes, calculations, and data available immediately as generated. Such a state indicates that "if it isn't in the notebook, others can assume you haven't done it." This is a laudable goal, but since my process and the site are still evolving, I'm claiming a lesser classification, indicated by the icon here: Selected Content Immediate. Some of my manuscripts (including my dissertation text) are outside the online notebook format, and not all of my analyses are yet pipelined in such a way as to make them easy to post, but I'm evolving towards that.

Unless otherwise noted (e.g., on a draft manuscript), notes posted here are made available under the Creative Commons Attribution-NonCommercial-ShareAlike license. This means you are free to use and adapt them for any non-commercial purpose, as long as you acknowledge the source and share any derivative work under the same license. Journal manuscripts under development here are often NOT covered by this Creative Commons license, because they will eventually be subject to whatever license the target journal requires. Thus, drafts are readable in their posted form, but all rights are reserved beyond viewing (and, of course, having your own ideas with respect to the material).

Software and tools I write for generating scientific results will always have a free, open-source version available for use by scholars, students, and the community. I'm not saying that I don't write commercial software, or that I won't take research results and find ways to create products. I am saying, however, that if I work on a piece of research, and communicate those results to the community, members of the community need a way to see what I've done, replicate it if desired, refute my claims if I turn out to be wrong, and use those tools and software in their own work to do something better.