What’s going on related to CEDAR?

CEDAR Offers Support for CDEs from caDSR

We are pleased to report that CEDAR template creators can now import from over 60,000 of NCI’s caDSR Common Data Elements (CDEs) to build new Fields in CEDAR. Using CEDAR’s search, browsing, and viewing services, template builders can easily build a form based partly or entirely on CDEs from caDSR.

Over the last 18 months, CEDAR developers have collaborated with the NCI to adapt CEDAR capabilities to the unique characteristics of CDEs. By representing these CDEs as CEDAR Fields, we have made them fully accessible to CEDAR users. CEDAR already handled many of the specialized features that are found in the caDSR templates, and the CEDAR team added some features to support particular CDE workflows.

Most attributes from the data elements in the CDE browser can be represented directly in CEDAR, especially the attributes used in creating templates.

User Applications

CEDAR’s easy-to-use system for working with metadata templates and reusable components works with the imported CDEs in several ways. A CEDAR Template creator may import CDE representations into a Template using CEDAR’s Template Designer import process, in this way building a metadata form based partly or entirely on CDE content from caDSR.

All the resource discovery and lookup features in CEDAR also work for CDEs. So you can use CEDAR’s search, browsing, and viewing services to find and review CDE content in the system. And CEDAR’s REST APIs also work for the CDEs in CEDAR, which means users can remotely discover and download CDE content.
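As a rough illustration of remote discovery, the sketch below prepares (but does not send) a search request limited to Fields. The endpoint URL, the `q`, `resource_types`, and `limit` parameters, and the `apiKey` authorization scheme are assumptions that do not come from this article and should be checked against CEDAR's REST API documentation. `build_cde_search_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URL of the CEDAR resource server; verify it in the CEDAR API docs.
CEDAR_SEARCH_URL = "https://resource.metadatacenter.org/search"


def build_cde_search_request(query: str, api_key: str, limit: int = 10) -> Request:
    """Prepare (without sending) a search request restricted to Fields."""
    # Parameter names are assumptions, not confirmed by the article.
    params = urlencode({
        "q": query,
        "resource_types": "field",
        "limit": limit,
    })
    return Request(
        f"{CEDAR_SEARCH_URL}?{params}",
        headers={
            # Assumed authorization scheme: "apiKey <your key>".
            "Authorization": f"apiKey {api_key}",
            "Accept": "application/json",
        },
    )


request = build_cde_search_request("tumor grade", api_key="YOUR-API-KEY")
print(request.full_url)
```

To actually run the search, the prepared request could be passed to `urllib.request.urlopen` and the JSON response decoded. Keeping construction separate from sending makes the request easy to inspect first.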

CEDAR users can even build new Fields by making a copy of any of the CDE-based Fields that are already in CEDAR. The copy, which at this point is a generic CEDAR Field, inherits the original’s Field values such as the label, description, and help tip, and the user can modify it freely. (Eventually CEDAR might support re-submission of CDEs to a CDE repository, but this is not offered at this time.)

About Common Data Elements

CDEs offer precise specifications of questions, including the set of allowable answers to each question. Generally following ISO 11179 data standards, CDEs are described in great detail, including information about their development history. CDEs are increasingly being adopted to help improve standardization and interoperability, but while CDEs can provide a strong conceptual foundation for interoperation, there are no widely recognized serialization or interchange formats to describe and exchange their definitions.

CDE registries can help standardize the way CDEs are collected, stored, transferred, and reported. One of the largest CDE registries has been developed by the U.S. National Cancer Institute (NCI) with the goal of facilitating multidisciplinary, multi-institutional cancer research. This registry is called the Cancer Data Standards Registry and Repository (caDSR), and it contains over 60,000 CDEs that cover many aspects of cancer research. The U.S. National Institutes of Health (NIH) are also developing a multi-discipline registry that aims to unify the range of biomedical CDEs that have been produced by a variety of NIH and other organizations (https://cde.nlm.nih.gov).

How CEDAR Adopts CDEs from caDSR

To make existing CDEs more readily accessible to form builders, we extended our CEDAR Web-based metadata management platform to provide a core representation of CDEs suitable for specifying questions in a metadata acquisition system. We do not manage the entire CDE specification—that contains a comprehensive implementation of the ISO/IEC 11179 standard—but focus instead on core functionality that specifies the questions and the values used to answer those questions.

By importing the XML-defined CDEs from the caDSR system into JSON Schema-defined fields in CEDAR, we made these specifications available to any CEDAR user. We run the conversion process automatically to keep the CEDAR CDEs up-to-date with respect to the source content.
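To give a feel for this kind of conversion, here is a minimal sketch that maps a CDE-like XML record onto a JSON Schema-style field. It is not CEDAR's converter: the XML element names, the sample record, the `sourceId` and `sourceVersion` keys, and the function name `cde_xml_to_field` are all hypothetical, and both the real caDSR XML and the real CEDAR field model are considerably richer.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified CDE record; real caDSR XML differs.
SAMPLE_CDE_XML = """
<DataElement>
  <PublicId>0000001</PublicId>
  <Version>1.0</Version>
  <LongName>Example Specimen Type</LongName>
  <Definition>The kind of specimen that was collected.</Definition>
  <PermissibleValues>
    <Value>Blood</Value>
    <Value>Tissue</Value>
  </PermissibleValues>
</DataElement>
"""


def cde_xml_to_field(xml_text: str) -> dict:
    """Map the question and its allowed answers onto a JSON Schema-style dict."""
    root = ET.fromstring(xml_text)
    values = [v.text.strip() for v in root.findall("./PermissibleValues/Value")]
    field = {
        "type": "object",
        "title": root.findtext("LongName").strip(),
        "description": root.findtext("Definition").strip(),
        # Illustrative keys for tracing the field back to its source CDE.
        "sourceId": root.findtext("PublicId").strip(),
        "sourceVersion": root.findtext("Version").strip(),
    }
    if values:
        # Values are inlined as an enum only to keep the sketch short.
        field["properties"] = {"value": {"type": "string", "enum": values}}
    return field


print(json.dumps(cde_xml_to_field(SAMPLE_CDE_XML), indent=2))
```

The sketch keeps only the question text and its permissible answers, mirroring the decision to focus on core functionality rather than the full CDE specification.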

CEDAR captures the CDE’s field information (top), and puts the value set information into BioPortal, a repository of vocabularies and ontologies. The field specification of the uploaded CEDAR caDSR CDE (top right) references the versioned value set in BioPortal (bottom right).

Additional Information

You can find more information about CEDAR and its use of CDEs in the following resources.

CEDAR in the GO FAIR Funder Study

FAIR Funder Implementation Study: life cycle with founding members

After providing contributions to the GO FAIR project over the last 18 months, CEDAR will be a significant participant in GO FAIR’s FAIR Funder Implementation Study.

This collaborative project will demonstrate a new level of integrated and FAIR metadata, making data projects funded by research agencies demonstrably more Findable, Accessible, Interoperable, and Reusable. As one of the founding collaborators, CEDAR has played a significant role in defining, describing, and implementing services that will improve metadata collection for funded research.

CEDAR’s Role

The CEDAR project provides a way for funders to specify what metadata they want to collect as part of the research life cycle. This can include not just logistical metadata the applicants may provide to describe their proposed project (title, investigators, summary, costs, duration), but metadata describing how they will manage and document their research products—their Data Management Plan or Data Stewardship Plan—and pointers to those products when they have been released. Grantees will be able to specify this metadata in simple forms with clear instructions throughout their execution of the grant, so that funders and other potential users can find and reuse the described data products.

Furthermore, in the FAIR Funder Implementation Study, the supplied metadata can be evaluated to see whether it meets criteria for FAIRness, for example having persistent unique identifiers that can be resolved. The grantees and funders can rely on automated evaluation systems to obtain the metadata, perform assessments of it, and issue reports to the grantees and funders of the described projects. This enables the grantees to easily provide provably FAIR metadata and data, while community members can see, understand, and reuse the best practices the metadata represents.

Coming Soon

OpenView of FAIR Funder template in outline form

In earlier workshops to work on funder metadata, the CEDAR team helped funders describe a basic set of metadata fields describing products throughout the funded life cycle. In coming Metadata for Machines (M4M) workshops, this simple example will be enhanced and customized to align it with the needs of the funders who are early adopters of the GO FAIR methods. The FAIR Funder Implementation Study will demonstrate the CEDAR template’s application in creating metadata throughout the life cycle, including evaluating the resulting metadata for FAIRness with external evaluation software.

Going beyond the CEDAR demonstrations, other founding systems like the Data Stewardship Wizard and Castor will demonstrate their own ability to perform metadata capture and reuse within the Implementation Study, and will demonstrate interoperation with CEDAR using common specifications to exchange templates and metadata. Meanwhile, templates and components that are useful for others will be registered in FAIRsharing so that they can be easily found and evaluated for reuse.

CEDAR Release 2.4

We released version 2.4 of the CEDAR Workbench on September 6, providing more user features and enhancements.

OpenView offers public option for CEDAR artifacts

OpenView of metadata instance

Did you ever want to show your template or metadata values to a colleague, without logging in? Do you want to view all your metadata on the web? Or maybe you’d like an IRI that anyone can use to see your work?

Now you can make your CEDAR artifact—metadata instance, template, element, or field—visible on the web. CEDAR’s OpenView service presents the CEDAR artifact as a publicly visible web page, with pop-up metadata descriptions and access to JSON and RDF views of the content. To make public your template, element, or field, simply enable OpenView from the workspace menu for the artifact. For now, if you want to make your metadata public, the template it’s based on must also be public—we can help you with this.

Instructions for CEDAR’s OpenView feature may be found at its CEDAR manual page.

Find field names in templates, elements, and fields

Adding to CEDAR’s ability to search for field names in CEDAR instances, you can now search for field names in CEDAR templates, elements, and fields.

Just like searching for field names in metadata instances, a colon after the string indicates a search within field names. The syntax ‘namestring:’ in CEDAR’s search bar will find the templates containing ‘namestring’ in the title. In the CEDAR search syntax, an asterisk matches any string and a question mark matches any character:

  • title:*, or simply title: Search for fields containing "Entity Identifier"
  • Publish*:
  • "Contact Email":
  • to?ic:
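The wildcard behavior described above (an asterisk for any string, a question mark for any single character) can be illustrated with Python's standard `fnmatch` module, which uses the same two wildcards. This is only a sketch of the matching semantics, not CEDAR's search implementation: the field names are made up, `matching_field_names` is a hypothetical helper, and case-sensitive matching is an assumption.

```python
from fnmatch import fnmatchcase

# Made-up field names standing in for the fields of some CEDAR artifacts.
FIELD_NAMES = ["title", "Publisher", "Publication Date", "Contact Email", "topic", "tonic"]


def matching_field_names(query: str, names: list[str]) -> list[str]:
    """Return the names matched by the field-name part of a 'name:' query."""
    # Everything before the first colon is the field-name pattern;
    # surrounding double quotes only group words and are dropped.
    pattern = query.partition(":")[0].strip('"')
    # fnmatchcase: * matches any string, ? matches exactly one character.
    # Case-sensitive matching is an assumption about the search behavior.
    return [name for name in names if fnmatchcase(name, pattern)]


for query in ["title:*", "Publish*:", '"Contact Email":', "to?ic:"]:
    print(query, "->", matching_field_names(query, FIELD_NAMES))
```

Note that `fnmatch` also treats square brackets as special characters, which the search syntax described here does not mention, so the analogy holds only for `*` and `?`.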

This functionality is documented in more detail here.

AIRR Community NCBI Pipeline

We improved the CAIRR pipeline for submitting MiAIRR data to NCBI. The AIRR community has documented CEDAR-driven MiAIRR submissions to NCBI in the MiAIRR-to-NCBI Submission Manual, the primary user documentation for submitting AIRR metadata. In the SRA section, the pipeline now checks user-entered file names against the names of files actually submitted and alerts the user if they do not match. File type options have also been updated to reflect NCBI expectations. Finally, members of the AIRR community have validated that submissions appear appropriately in NCBI repositories.
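The file name check can be pictured as a simple set comparison. The sketch below is an assumption about how such a check might work, not the pipeline's actual code. `find_file_name_mismatches` and the sample file names are hypothetical, and exact, case-sensitive comparison is assumed.

```python
def find_file_name_mismatches(declared: list[str], uploaded: list[str]) -> dict[str, list[str]]:
    """Compare file names entered in the metadata with the files actually uploaded."""
    declared_set, uploaded_set = set(declared), set(uploaded)
    return {
        # Named in the metadata form but missing from the upload.
        "declared_but_not_uploaded": sorted(declared_set - uploaded_set),
        # Present in the upload but never named in the metadata form.
        "uploaded_but_not_declared": sorted(uploaded_set - declared_set),
    }


report = find_file_name_mismatches(
    declared=["sample1_R1.fastq", "sample1_R2.fastq"],
    uploaded=["sample1_R1.fastq", "sample1_r2.fastq"],
)
if any(report.values()):
    print("File names do not match:", report)
```

Reporting both directions of the mismatch tells the user whether to correct the metadata entry or the uploaded file.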

Added Human Tissue NCBI Pipeline

CEDAR added a second NCBI submission pipeline for submitting metadata on Human Tissue studies. The template for this pipeline is modeled on the AIRR Community pipeline, but is customized to the NCBI BioSample Human Package 1.0.

Submission to Repository window with NCBI Human Tissue selected as default

The template and template elements used by this pipeline are publicly available in the following CEDAR folder: All/Shared/Shared by CEDAR/CEDAR-to-NCBI Pipeline. Documentation of the pipeline may be found in the CEDAR pipelines documentation page.

Inside News

Categories for Artifacts

To handle categorization of CDE fields, we have designed a category system that can be used by different communities to categorize CEDAR artifacts according to their own hierarchical labels. The API for this system has been implemented, and its user interface is planned for the next release.

Performance

We improved CEDAR’s performance for more search types in BioPortal class hierarchies.

Inside Inside News

We refactored code to handle resources (artifacts, users, groups, and folders) in a uniform way. And this release also incorporates some bug fixes.

All the tasks completed for the 2.4 release series can be found with this GitHub search, or by visiting the GitHub release page for release-2.4.