Happenings Archives - Metadata Center

CEDAR Offers Support for CDEs from caDSR

December 11, 2019/in News /by jgraybeal

We are pleased to report that CEDAR template creators can now import from over 60,000 of NCI’s caDSR Common Data Elements (CDEs) to build new Fields in CEDAR. Using CEDAR’s search, browsing, and viewing services, template builders can easily build a form based partly or entirely on CDEs from caDSR.

Over the last 18 months, CEDAR developers have collaborated with the NCI to adapt CEDAR capabilities to the unique characteristics of CDEs. By representing these CDEs as CEDAR Fields, we have made them fully accessible to CEDAR users. CEDAR already handled many of the specialized features that are found in the caDSR templates, and the CEDAR team added some features to support particular CDE workflows.

Most attributes from the data elements in the CDE browser can be represented directly in CEDAR, especially the attributes used in creating templates.

User Applications

CEDAR’s easy-to-use system for working with metadata templates and reusable components works with the imported CDEs in several ways. A CEDAR Template creator may import CDE representations into a Template using CEDAR’s Template Designer import process, in this way building a metadata form based partly or entirely on CDE content from caDSR.

All the resource discovery and lookup features in CEDAR also work for CDEs. So you can use CEDAR’s search, browsing, and viewing services to find and review CDE content in the system. And CEDAR’s REST APIs also work for the CDEs in CEDAR, which means users can remotely discover and download CDE content.

CEDAR users can even build new Fields by making a copy of any of the CDE-based Fields that are already in CEDAR. The user can modify this copy, which at this point is a generic CEDAR Field, however they want; the copy inherits the other Field values, such as the label, description, and help tip. (Eventually CEDAR might support re-submission of CDEs to a CDE repository, but this is not offered at this time.)

About Common Data Elements

CDEs offer precise specifications of questions, including the set of allowable answers to each question. Generally following ISO 11179 data standards, CDEs are described in great detail, including information about their development history. CDEs are increasingly being adopted to help improve standardization and interoperability. However, while CDEs can provide a strong conceptual foundation for interoperation, there are no widely recognized serialization or interchange formats to describe and exchange their definitions.

CDE registries can help standardize the way CDEs are collected, stored, transferred, and reported. One of the largest CDE registries has been developed by the U.S. National Cancer Institute (NCI) with the goal of facilitating multidisciplinary, multi-institutional cancer research. This registry is called the Cancer Data Standards Repository (caDSR) and it contains over 60,000 CDEs that cover many aspects of cancer research. The U.S. National Institutes of Health (NIH) are also developing a multi-discipline registry that aims to unify the range of biomedical CDEs that have been produced by a variety of NIH and other organizations (https://cde.nlm.nih.gov).

How CEDAR Adopts CDEs from caDSR

To make existing CDEs more readily accessible to form builders, we extended our CEDAR Web-based metadata management platform to provide a core representation of CDEs suitable for specifying questions in a metadata acquisition system. We do not manage the entire CDE specification—that contains a comprehensive implementation of the ISO/IEC 11179 standard—but focus instead on core functionality that specifies the questions and the values used to answer those questions.

By importing the XML-defined CDEs from the caDSR system into JSON Schema-defined fields in CEDAR, we made these specifications available to any CEDAR user. We run the conversion process automatically to keep the CEDAR CDEs up-to-date with respect to the source content.
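To make the idea of this conversion concrete, here is a minimal sketch in Python. It is not CEDAR's converter: the XML element names (DataElement, LongName, Definition, PermissibleValues, Value) are hypothetical simplifications of a CDE export, and the output is a generic JSON Schema fragment rather than the actual CEDAR field model. The sketch also inlines the allowed values as an enum for brevity, whereas CEDAR keeps value sets in BioPortal and references them from the field.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified CDE record; real caDSR XML is much richer.
SAMPLE_CDE_XML = """
<DataElement>
  <LongName>Patient Gender</LongName>
  <Definition>The gender reported for the patient.</Definition>
  <PermissibleValues>
    <Value>Female</Value>
    <Value>Male</Value>
    <Value>Unknown</Value>
  </PermissibleValues>
</DataElement>
"""


def cde_xml_to_field(xml_text: str) -> dict:
    """Convert one simplified CDE record into a JSON Schema-style field."""
    root = ET.fromstring(xml_text.strip())

    # Collect the allowed answers, in document order.
    values = [
        v.text.strip()
        for v in root.findall("./PermissibleValues/Value")
        if v.text and v.text.strip()
    ]

    # The answer is a string; restrict it to the allowed values if any exist.
    value_schema = {"type": "string"}
    if values:
        value_schema["enum"] = values

    return {
        "$schema": "http://json-schema.org/draft-04/schema#",
        "type": "object",
        "title": root.findtext("LongName", default="").strip(),
        "description": root.findtext("Definition", default="").strip(),
        # "@value" follows the JSON-LD convention for holding a literal value.
        "properties": {"@value": value_schema},
        "required": ["@value"],
    }


field = cde_xml_to_field(SAMPLE_CDE_XML)
```

Only the question (title and description) and its allowed answers are carried over, which mirrors the focus on core functionality described above; everything else in the source record is ignored.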

CEDAR captures the CDE’s field information (top), and puts the value set information into BioPortal, a repository of vocabularies and ontologies. The field specification of the uploaded CEDAR caDSR CDE (top right) references the versioned value set in BioPortal (bottom right).

Additional Information

You can find more information about CEDAR and its use of CDEs in the following resources.

Working with CDEs, in the CEDAR User Guide.
Unleashing the value of Common Data Elements through the CEDAR Workbench. Published in Proceedings of AMIA 2019 Annual Symposium, 681-690.
Slides corresponding to AMIA 2019 paper.
List of additional CEDAR references.

CEDAR in the GO FAIR Funder Study

November 5, 2019/in Happenings, News /by jgraybeal

FAIR Funder Implementation Study: life cycle with founding members

After providing contributions to the GO FAIR project over the last 18 months, CEDAR will be a significant participant in GO FAIR’s FAIR Funder Implementation Study.

This collaborative project will demonstrate a new level of integrated and FAIR metadata, making data projects funded by research agencies demonstrably more Findable, Accessible, Interoperable, and Reusable. As one of the founding collaborators, CEDAR has played a significant role in defining, describing, and implementing services that will improve metadata collection for funded research.

CEDAR’s Role

The CEDAR project provides a way for funders to specify what metadata they want to collect as part of the research life cycle. This can include not just logistical metadata the applicants may provide to describe their proposed project (title, investigators, summary, costs, duration), but metadata describing how they will manage and document their research products—their Data Management Plan or Data Stewardship Plan—and pointers to those products when they have been released. Grantees will be able to specify this metadata in simple forms with clear instructions throughout their execution of the grant, so that funders and other potential users can find and reuse the described data products.

Furthermore, in the FAIR Funder Implementation Study, the supplied metadata can be evaluated to see whether it meets criteria for FAIRness, for example having persistent unique identifiers that can be resolved. The grantees and funders can rely on automated evaluation systems to obtain the metadata, perform assessments of it, and issue reports to the grantees and funders of the described projects. This enables the grantees to easily provide provably FAIR metadata and data, while community members can see, understand, and reuse the best practices the metadata represents.

Coming Soon

OpenView of FAIR Funder template in outline form

In earlier workshops to work on funder metadata, the CEDAR team helped funders describe a basic set of metadata fields describing products throughout the funded life cycle. In coming Metadata for Machines (M4M) workshops, this simple example will be enhanced and customized to align it with the needs of the funders who are early adopters of the GO FAIR methods. The FAIR Funder Implementation Study will demonstrate the CEDAR template’s application in creating metadata throughout the life cycle, including evaluating the resulting metadata for FAIRness with external evaluation software.

Going beyond the CEDAR demonstrations, other founding systems like the Data Stewardship Wizard and Castor will demonstrate their own ability to perform metadata capture and reuse within the Implementation Study, and will demonstrate interoperation with CEDAR using common specifications to exchange templates and metadata. Meanwhile, templates and components that are useful for others will be registered in FAIRsharing so that they can be easily found and evaluated for reuse.

CEDAR Release 2.4

September 9, 2019/in Happenings, Releases /by jgraybeal

We released version 2.4 of the CEDAR Workbench on September 6, providing more user features and enhancements.

OpenView offers public option for CEDAR artifacts

Did you ever want to show your template or metadata values to a colleague, without logging in? Do you want to view all your metadata on the web? Or maybe you’d like an IRI that anyone can use to see your work?

Now you can make your CEDAR artifact—metadata instance, template, element, or field—visible on the web. CEDAR’s OpenView service presents the CEDAR artifact as a publicly visible web page, with pop-up metadata descriptions and access to JSON and RDF views of the content. To make public your template, element, or field, simply enable OpenView from the workspace menu for the artifact. For now, if you want to make your metadata public, the template it’s based on must also be public—we can help you with this.

Instructions for CEDAR’s OpenView feature may be found at its CEDAR manual page.

Find field names in templates, elements, and fields

Adding to CEDAR’s ability to search for field names in CEDAR instances, you can now search for field names in CEDAR templates, elements, and fields.

Just like searching for field names in metadata instances, a colon after the string indicates a search within field names. The syntax ‘namestring:’ in CEDAR’s search bar will find the templates containing ‘namestring’ in the title. In the CEDAR search syntax, an asterisk matches any string and a question mark matches any character:

title:*, or simply title:
Publish*:
"Contact Email":
to?ic:
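The wildcard semantics in these examples can be illustrated with a short Python sketch. This is not CEDAR's search implementation: the function names are hypothetical, and the sketch assumes that matching is case-insensitive, that a pattern must match the whole field name, and that an empty value part (as in title:) behaves like an asterisk.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate '*' (any string) and '?' (any one character) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def parse_query(query: str) -> tuple:
    """Split 'name:value' at the last colon and return (name, value) patterns."""
    name_part, colon, value_part = query.rpartition(":")
    if not colon:
        raise ValueError("a field-name search needs a colon after the name")
    # Quotes group a multi-word field name; an empty value means "any value".
    return name_part.strip().strip('"'), value_part.strip() or "*"


def matches(query: str, field_name: str, value: str) -> bool:
    """Return True if the query matches this field name and field value."""
    name_pattern, value_pattern = parse_query(query)
    return bool(
        wildcard_to_regex(name_pattern).fullmatch(field_name)
        and wildcard_to_regex(value_pattern).fullmatch(value)
    )
```

Under these assumptions, matches("to?ic:", "topic", "anything") is true, while matches("title:", "subtitle", "anything") is false because the name pattern has no wildcard.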

This functionality is documented in more detail here.

AIRR Community NCBI Pipeline

We improved the CAIRR pipeline for submitting MiAIRR data to NCBI. The AIRR community has documented CEDAR-driven MiAIRR submissions to NCBI in the MiAIRR-to-NCBI Submission Manual, the primary user documentation for submitting AIRR metadata. In the SRA section, the pipeline now checks user-entered file names against the names of files actually submitted, and alerts the user if they do not match, and file type options have been updated to reflect NCBI expectations. Finally, members of the AIRR community have validated that submissions appear appropriately in NCBI repositories.
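The file name check described for the SRA section can be pictured with a minimal Python sketch. The function name and the exact, case-sensitive comparison are assumptions made for illustration; the actual pipeline code is not shown here.

```python
def find_filename_mismatches(declared_names, uploaded_names):
    """Compare file names entered in the metadata with files actually uploaded.

    Returns the names that appear on only one side, so the user can be
    alerted before the submission is sent.
    """
    declared = set(declared_names)
    uploaded = set(uploaded_names)
    return {
        # Named in the metadata, but no such file was uploaded.
        "missing_uploads": sorted(declared - uploaded),
        # Uploaded, but never mentioned in the metadata.
        "undeclared_uploads": sorted(uploaded - declared),
    }


report = find_filename_mismatches(
    ["sample1_R1.fastq", "sample1_R2.fastq"],
    ["sample1_R1.fastq", "sample1_r2.fastq"],
)
# A submission would be flagged if either list in the report is non-empty.
has_mismatch = any(report.values())
```

In this example the second file differs only by letter case, which an exact comparison reports as one missing upload and one undeclared upload.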

Added Human Tissue NCBI Pipeline

CEDAR added a second NCBI submission pipeline for submitting metadata on Human Tissue studies. The template for this pipeline is modeled on the AIRR Community pipeline, but is customized to the NCBI BioSample Human Package 1.0.

The template and template elements used by this pipeline are publicly available in the following CEDAR folder: All/Shared/Shared by CEDAR/CEDAR-to-NCBI Pipeline. Documentation of the pipeline may be found in the CEDAR pipelines documentation page.

Inside News

Categories for Artifacts

To handle categorization of CDE fields, we have designed a category system that can be used by different communities to categorize CEDAR artifacts according to their own hierarchical labels. The API for this system has been implemented, and its user interface is planned for the next release.

Performance

We improved CEDAR’s performance for more search types in BioPortal class hierarchies.

Inside Inside News

We refactored code to handle resources (artifacts, users, groups, and folders) in a uniform way. And this release also incorporates some bug fixes.

All the tasks completed for the 2.4 release series can be found with this GitHub search, or by visiting the GitHub release page for release-2.4.

CEDAR Release 2.3

August 1, 2019/in Happenings, Releases /by jgraybeal

We released version 2.3 of the CEDAR Workbench on May 1, offering significant new user features and internal capabilities.

New Value Recommender Service

Shows example recommendations from Human 1.0 metadata

We significantly updated CEDAR’s value recommendation capabilities in this release. The new approach adopts a powerful mechanism to suggest new values for fields in a template as users fill in those fields. If the template creator enables the value recommender for at least 2 text fields (using each field’s SUGGESTIONS tab), CEDAR will find patterns across all the previously entered values for recommendation-enabled fields. If those patterns are strong enough over many metadata instances, CEDAR will recommend values to a user based on the (enabled) values that the user has already filled in.
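As a rough illustration of pattern-based recommendation, the Python sketch below suggests a value for one field from the values already filled in for other fields. It is a simple co-occurrence count, not CEDAR's actual recommender; the function name, the thresholds (min_support, min_confidence), and the sample data are all hypothetical.

```python
from collections import Counter


def recommend(instances, filled, target_field, min_support=2, min_confidence=0.5):
    """Suggest values for target_field given the values already filled in.

    instances: previously entered metadata, one dict (field -> value) each.
    filled: the field values the current user has entered so far.
    Returns (value, confidence) pairs, strongest first.
    """
    # Keep earlier instances that agree with everything the user has entered.
    matching = [
        inst
        for inst in instances
        if target_field in inst
        and all(inst.get(name) == value for name, value in filled.items())
    ]
    # Too few examples means the pattern is not strong enough to recommend.
    if len(matching) < min_support:
        return []

    counts = Counter(inst[target_field] for inst in matching)
    total = len(matching)
    return [
        (value, count / total)
        for value, count in counts.most_common()
        if count / total >= min_confidence
    ]


# Hypothetical previously entered metadata instances.
history = [
    {"tissue": "liver", "disease": "hepatitis"},
    {"tissue": "liver", "disease": "hepatitis"},
    {"tissue": "liver", "disease": "cirrhosis"},
    {"tissue": "lung", "disease": "asthma"},
]
suggestions = recommend(history, {"tissue": "liver"}, "disease")
```

With this data, only "hepatitis" is suggested for a liver sample, because it appears in two of the three matching instances; a lung sample gets no suggestion because a single earlier instance falls below the support threshold.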

This functionality is documented in more detail here.

New Sharing Modes

Shared asset controls in workspace

CEDAR now provides Shared with Me and Shared with Everybody modes on the workspace to let users see globally shared resources, or just resources explicitly shared with them. The default Workspace view shows only the files you own and administer. Shared with Me shows content that someone shared with you directly, and Shared with Everybody shows content that all CEDAR users can see (there are a lot of those!).

Your searches will always find everything you can see, no matter how it was shared.

Performance

We made many improvements to increase CEDAR’s performance. We improved the speed of searches and permission handling. We also worked on increasing the speed of Neo4j queries and queries to our REST APIs.

Perhaps most significantly, we improved auto-completion when filling in field values in the Metadata Editor. Thanks to BioPortal indexing every branch of terms in all its ontologies, CEDAR can quickly look up auto-completion values for even very large branches. This gives CEDAR consistently fast auto-completion and term discovery (and avoids unpleasant time-out warnings!).

New CEDAR Web Site

We created a new site at metadatacenter.org to present the most essential information about CEDAR. We will be using the new site for the latest updates (releases and blog posts) about CEDAR, but will continue to maintain other sites for technical documentation, videos, and general background (the original CEDAR web site).

You can find the latest links to all kinds of references about CEDAR documented at our references page.

Inside News

Open Metadata Microservice

Our goal is to make CEDAR metadata instances shareable on the web, and CEDAR’s metadata editing capabilities embeddable in other sites. To do this, we’ve created an ‘open metadata microservice’ that will allow users to make their CEDAR resources openly accessible via REST calls to users without CEDAR accounts. We will provide user interfaces to create and access these open resources in the near future.

Inside Inside News

This release also incorporates a significant number of usability enhancements and bug fixes.

And, we updated all CEDAR Java-based components to work with Java 11.

All the tasks completed for the 2.3 release series can be found with this GitHub search, or by visiting the GitHub release page for release-2.3.
