CEDAR Research Products

CEDAR aims to accelerate biomedical research by improving its metadata. CEDAR plans not only to make biomedical metadata better, but also to make creating it easier and faster. Better metadata will improve our ability to understand and replicate studies, improve discovery of relevant studies, and improve interoperability of study data across repositories and analytical systems.

The resulting collection of metadata, which CEDAR has aligned using its study models and specifications, will also create direct opportunities for biomedical research. The collection will make it faster and easier to explore simple questions and hypotheses, and will enable users to discover studies using a common model for metadata access. Here we describe how the metadata pipeline leads to better, and newly possible, research products.

Applicable Research Products

We provide below a general research scenario that CEDAR will be able to target. Because CEDAR is not yet fully built, we cannot yet describe research that has resulted from it; instead, we offer examples of past research products that CEDAR could have accelerated.

Last Updated: Jan 31 2016 - 10:10pm
A research lab wants to find data sets across a large number of different repositories that relate to a particular condition. For example, if the lab is studying influenza infection, how can it find all data sets that relate to that concept in all the relevant repositories? In short, this scenario calls for finding enough quality data sets to support integrated...
Although immune system suppression therapies have improved the acceptance period of transplanted organs to some degree, many organs are still rejected over longer periods. To minimize the rejection rate, one strategy in a recent paper in The Journal of Experimental Medicine studied genetic markers from different kinds of transplants, looking for...
Sepsis is a syndrome of systemic inflammation in response to infection. It kills about 750,000 people in the United States every year (1), and is also the single most expensive condition treated in the United States, costing the healthcare system more than $20 billion annually. Prompt diagnosis and treatment are essential to save lives, but there...
Today many pharmaceutical drugs have been developed, often to treat a particular disease. Because licensing a drug requires such expensive and lengthy testing, it is difficult to create a new drug and get it approved, so existing disease treatment options may be few and unsatisfactory. However, we know that many drugs can be effective for...

The Challenges

Existing challenges in this effort include dealing with the number and range of different repositories, with all their different interfaces, metadata models, and terminologies; finding data sets in repositories whose metadata is too poorly structured to allow effective search; finding data sets that have been described with terms that are not well defined, either because the terms are not sufficiently unique to be used confidently or because they are not commonly used with the intended meaning; and weeding out data sets that have similarly expressed terms but are not in fact about the same thing.
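As a rough illustration of two of these challenges (differing repository models and 'false match' terms), the sketch below converts records from two hypothetical repositories into one common shape, then compares a search by term identifier with a search by free text. The field names, record contents, and example.org IRIs are all invented for illustration; they are assumptions, not CEDAR's actual models or terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonRecord:
    """Hypothetical common model: a title plus one disease term IRI."""
    repository: str
    title: str
    disease_iri: str


# Hypothetical IRIs; real records would use terms from a published ontology.
FLU = "http://example.org/terms/influenza"
HIB = "http://example.org/terms/haemophilus-influenzae-infection"

# Two hypothetical repositories that store the same idea in different fields.
repo_a = [
    {"name": "Influenza challenge study", "condition_id": FLU},
    {"name": "Haemophilus influenzae carriage", "condition_id": HIB},
]
repo_b = [
    {"study_title": "Seasonal flu vaccine response", "disease": {"iri": FLU}},
]


def from_repo_a(raw: dict) -> CommonRecord:
    """Map repository A's field names onto the common model."""
    return CommonRecord("A", raw["name"], raw["condition_id"])


def from_repo_b(raw: dict) -> CommonRecord:
    """Map repository B's nested structure onto the common model."""
    return CommonRecord("B", raw["study_title"], raw["disease"]["iri"])


def find_by_term(records, iri):
    """Exact match on the term identifier."""
    return [r for r in records if r.disease_iri == iri]


def find_by_text(records, text):
    """Naive case-insensitive substring match on the title."""
    return [r for r in records if text.lower() in r.title.lower()]


records = [from_repo_a(r) for r in repo_a] + [from_repo_b(r) for r in repo_b]

by_term = find_by_term(records, FLU)
by_text = find_by_text(records, "influenza")

print([r.title for r in by_term])
print([r.title for r in by_text])
```

In this toy data, the identifier search returns both influenza studies, while the text search misses the study titled with "flu" and picks up the unrelated Haemophilus influenzae record.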

CEDAR’s Contribution

Each of the challenges above is addressed by one or more CEDAR features or strategies. We briefly outline those CEDAR responses here; some are straightforward, and others require long-term or challenging work and community engagement. We encourage you to discuss any questions with the CEDAR team, for example by contacting us through this site.

Challenge
    CEDAR Response

Multiple repositories
    Be able to publish metadata records to the most common and critical repositories
    Providing repository-centric features

Differing repository interfaces
    Effective interface development
    Buy-in and support from repositories

Differing repository models and terminologies
    Effective mappings from CEDAR entities
    Templates defined to match repository needs

Finding data sets given poorly structured metadata
    Improve rigor of metadata definition
    Improve mapping of metadata content

Finding data sets given poorly defined terms
    Thorough integration with well-defined terminologies, in defining and using templates
    Validation of defined metadata against required vocabularies
    Mapping of poorly defined terms to more rigorous terms

Avoiding data sets with ‘false match’ terms
    Encourage use of precisely specified terms (IRIs)
    Identify deceptive terms (through analytics) and recommend improvements to their holders
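
To make two of the responses above more concrete (validation against required vocabularies, and mapping of poorly defined terms to more rigorous ones), here is a minimal sketch. The vocabulary, the synonym table, the example.org IRIs, and the function names are hypothetical and chosen only for illustration; this is not CEDAR's implementation or API.

```python
# Hypothetical controlled vocabulary: term IRI -> preferred label.
VOCABULARY = {
    "http://example.org/terms/influenza": "influenza",
    "http://example.org/terms/sepsis": "sepsis",
}

# Hypothetical table mapping loosely used terms to vocabulary IRIs.
SYNONYMS = {
    "influenza": "http://example.org/terms/influenza",
    "flu": "http://example.org/terms/influenza",
    "sepsis": "http://example.org/terms/sepsis",
}


def map_term(value):
    """Return a vocabulary IRI for value, or None if it cannot be mapped."""
    if value in VOCABULARY:
        return value  # already a precisely specified term
    return SYNONYMS.get(value.strip().lower())


def validate(record, required_fields):
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for field in required_fields:
        value = record.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif map_term(value) is None:
            problems.append(f"{field}: '{value}' is not in the vocabulary")
    return problems


def normalize(record, term_fields):
    """Copy the record, replacing mappable terms with their IRIs."""
    result = dict(record)
    for field in term_fields:
        iri = map_term(record.get(field) or "")
        if iri is not None:
            result[field] = iri
    return result


record = {"title": "Vaccine response", "disease": "Flu"}
print(validate(record, ["disease"]))
print(normalize(record, ["disease"]))
print(validate({"title": "Untitled", "disease": "bad cold"}, ["disease"]))
```

A real system would draw the vocabulary from published ontologies rather than a hard-coded dictionary; the sketch only shows where validation and mapping fit relative to a metadata record.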