Presenting Author Information |
|
Name |
Marcos Martinez-Romero |
Institution |
Stanford University |
BD2K Grant Number |
U54 AI117925 |
PI |
Mark A. Musen |
|
|
Phone Number |
6504228878 |
Additional Author Information |
|
Names and affiliations of additional authors (one per line) |
Martin J. O’Connor, Stanford University |
|
Is there an additional contact person? |
Yes |
Name of additional contact |
Martin O’Connor |
Email address of additional contact |
|
Additional information |
|
Please choose the topic that best fits your abstract (posters will be grouped according to your selection). Detailed session descriptions can be found in the Abstract Guidelines. |
Software, Analysis, & Methods Development |
Please consider my abstract for a (See Presentation Guidelines) |
Demo only (includes poster, power, table) |
Abstract Information |
|
Poster presentations may be submitted electronically in order to reach a wider audience and be available after the All hands meeting. Do you plan to submit your poster as a digital submission in addition to bringing a physical copy? |
Yes |
Abstract Title |
Faster and Better Metadata Authoring using CEDAR's Value Recommendations |
|
Abstract Description |
In biomedicine, good metadata is crucial for finding experimental datasets, understanding how experiments were performed, and reusing data to conduct new analyses. Despite the growing number of efforts to define guidelines and standards for describing biomedical experiments, the impediments to creating accurate, complete, and consistent metadata are still considerable. Authoring good metadata is a tedious and time-consuming task that biomedical scientists tend to avoid. The Center for Expanded Data Annotation and Retrieval (CEDAR) is developing novel methods and tools to simplify the process by which investigators annotate their experimental data with metadata. The CEDAR Workbench (cedar.metadatacenter.net) is a set of Web-based tools for the acquisition, storage, search, and reuse of metadata templates. As a step towards decreasing authoring time while increasing metadata quality, we have enhanced the CEDAR Workbench with value recommendation capabilities. Our system identifies common patterns in the CEDAR metadata repository and generates real-time suggestions for filling out metadata acquisition forms. These suggestions are context-sensitive: the values predicted for a particular field are generated and ranked based on previously entered values. Our value recommendation approach supports both free-text values and terms from ontologies and controlled terminologies. We discuss some of the challenges that have arisen while implementing our approach and our strategies for making this capability useful to the end users of CEDAR. We demonstrate CEDAR's intelligent authoring capabilities using metadata from the Gene Expression Omnibus (GEO) and show how the technology that we are developing leverages existing metadata to make the authoring of high-quality metadata a manageable task. |
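The idea of context-sensitive value recommendation described in the abstract can be illustrated with a minimal sketch. This is not CEDAR's implementation: the function name `recommend_values`, the field names, and the sample records are all hypothetical, and simple co-occurrence counting is assumed here as a stand-in for whatever pattern-mining method the system actually uses.

```python
from collections import Counter


def recommend_values(records, target_field, context, top_n=3):
    """Rank candidate values for `target_field` (hypothetical helper).

    `records` is a list of dicts standing in for existing metadata entries.
    `context` maps fields the user has already filled in to their values.
    Returns up to `top_n` (value, score) pairs, where score is the fraction
    of context-matching records that used that value.
    """
    counts = Counter()
    for record in records:
        # Keep only records that agree with every value entered so far.
        if all(record.get(field) == value for field, value in context.items()):
            candidate = record.get(target_field)
            if candidate is not None:
                counts[candidate] += 1

    total = sum(counts.values())
    # most_common() is empty when nothing matched, so no division by zero.
    return [(value, count / total) for value, count in counts.most_common(top_n)]


# Illustrative records only; not taken from GEO or the CEDAR repository.
records = [
    {"organism": "Homo sapiens", "tissue": "liver"},
    {"organism": "Homo sapiens", "tissue": "liver"},
    {"organism": "Homo sapiens", "tissue": "blood"},
    {"organism": "Mus musculus", "tissue": "brain"},
]

suggestions = recommend_values(records, "tissue", {"organism": "Homo sapiens"})
for value, score in suggestions:
    print(value, round(score, 2))
```

With these sample records, entering "Homo sapiens" as the organism ranks "liver" above "blood" for the tissue field, and "brain" is not suggested because it never co-occurs with that organism. A production system would also need to handle ontology terms, sparse contexts, and ranking ties, which this sketch ignores.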