This is a quick, spontaneously written post inspired, or rather forced out, by the release of some 70 000 OKCupid users’ data (which I learned about yesterday). The data was released by a Danish self-proclaimed researcher and scientist, actually just a graduate student (I say ‘just’ not because I consider graduate students to be lesser human beings, but because he obviously was neither aware of nor acting according to the professional codes of the research community, i.e. wasn’t a fully trained scientist). The data has since been removed from the Open Science Framework repository.
What I’m about to say next is a general remark on the current culture within the scientific community, rather than an analysis of the individual case that is OKCupid-gate.
In a way this feels like an accident waiting to happen. The discussions concerning research integrity and ethics have been lagging far behind the progress of Big Data, Science 2.0 and Open Science. Both Science 2.0 and Open Science have so far mostly been playgrounds of natural scientists. Yes, there are the emerging fields of digital humanities and computational social sciences, but despite the buzz they remain marginal. Most of the human scientists applying computational methods and digital sources to human science research questions have to go pretty DIY on their workflows, in terms of both practical and theoretical methods. It is not my intention to pit natural and human scientists against each other by saying that one is more ethically responsible than the other. What I am saying is that human scientists have a different, and research-wise deeper, understanding of all things human and social. It’s their job, after all. They are better equipped to understand the 50 shades of open in social media and to see the potential harm that personal information “that is already openly available” can do if it’s released as open data.
All these discussions, about computational methods, Open Science, Science 2.0, Web 2.0, research integrity, natural sciences, human sciences etc., are going on in their separate bubbles. It is a terribly slow and wasteful way to proceed. I say it’s about time to start bursting these bubbles. First of all, we need to stop referring only to natural sciences as “Science” (yes, Anglo-Saxon world, I’m looking at you) and make the concept include human sciences as well. This would help us acknowledge that there are certain skills and lessons that every researcher, or scientist, needs to learn, no matter whether they are studying the Big Bang the historical event or The Big Bang Theory the sitcom. Human scientists have to start acquiring basic computational and research data science skills, and natural scientists need to better understand how their work relates to societal issues.
We should also break the Open Science bubble and make openness (as in accessibility and transparency) a prerequisite for good science. This might finally rid us of the weird idea some people seem to have (among them both advocates and opponents of Open Science) that openness equals vomiting content onto the web, without giving a second thought to issues such as quality (metadata, licensing) or privacy (the OKCupid case).