Dr. Mark Musen posts on "Tragedy of the (Data) Commons"

November 6, 2017

"We need a comprehensive approach to the authoring and management of metadata, as the success of a lot of commons projects is at stake."

In a blog post for the National Cancer Informatics Program, Dr. Mark Musen writes, "To make experimental datasets FAIR, they must be accompanied by metadata" that can explain them, and many online datasets fail to meet this need. He says that this is "the challenge for existing online data repositories — and the challenge for all the data-commons initiatives."

Dr. Musen, the Principal Investigator of the CEDAR project, argues that many tools and resources, such as the CEDAR Workbench, can be harnessed in the service of better metadata. The solution is not simply technical, he suggests. "Without community-based interventions, we should expect analogous degradation in the value of a data commons, as the repository becomes filled with increasing numbers of datasets that have metadata that are confusing, conflicting, or incomplete … when investigators act in their own self-interest, taking short cuts to generate metadata as quickly as possible, we should expect that the overall utility of the resource will decline."

Read the rest of Dr. Musen's post at the NCIP web site.

