Automated Database Mediation Using Ontological Metadata Mappings
Author
Abstract
Background: One challenge of database federation is that the granularity of representation of equivalent data varies across systems. Dealing effectively with this problem is analogous to dealing with precoordinated vs. postcoordinated concepts in biomedical ontologies.

Model Description: The authors describe an approach based on ontological metadata mapping rules defined with elements of a global vocabulary, which allows a query specified at one granularity level to fetch data, where possible, from databases within the federation that use different granularities. This is implemented in OntoMediator, a newly developed production component of our previously described Query Integrator System. OntoMediator's operation is illustrated with a query that accesses three geographically separate, interoperating databases. An example based on SNOMED also illustrates the applicability of high-level rules to support the enforcement of constraints that can prevent inappropriate curator or power-user actions.

Summary: A rule-based framework simplifies the design and maintenance of systems where categories of data must be mapped to each other, either for cross-database query or for curation of the contents of compositional controlled vocabularies.

J Am Med Inform Assoc. 2009;16:723–737. DOI 10.1197/jamia.M3031.

Introduction

One challenge in federated database integration is that databases from various research groups may store information on the same category of data differently. Physical heterogeneity issues, such as different internal names for semantically equivalent tables/columns, data stored in a single table vs. multiple tables, data stored in column-modeled vs. row-modeled form, and so on, have been addressed successfully through standard approaches such as database views and mappings of individual database schema elements to a global schema.
However, semantic representation differences, notably those due to equivalent information being stored at different granularities, cannot be addressed using these mechanisms. For example, in one database, various attributes of a concept (e.g., morphology, location, tissue type) may be represented as distinct fields, while in another database these attributes may be combined implicitly through the descriptive name of that concept. This paper describes a general approach to specifying equivalence between different concepts when such granularity differences exist. We describe an implementation for integration of neuroscience databases, and provide another example in the biomedical controlled vocabulary domain. This work may lay the foundation for database-contextual information integration in biosciences and other areas of research.

Affiliations of the authors: Center for Medical Informatics (LM, RW), Department of Anesthesiology (LM), Yale University School of Medicine, New Haven, CT; Geisinger Health Systems (PN), Danville, PA.

This research is supported by NIH Grants R01 DA021253 and P01 DC04732. The authors thank the curators of the CCDB and CoCoDat databases for making their data available for use in this paper, and Dr. Gordon Shepherd for curating the neuron ontology used in this work.

Correspondence: Luis Marenco, MD, Center for Medical Informatics, Yale University School of Medicine, PO Box 208009, New Haven, CT 06520-8009; e-mail: [email protected].

Received for review: 10/13/08; accepted for publication: 06/07/09.
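The granularity mismatch described in the Introduction can be made concrete with a small sketch. It is not OntoMediator's rule format, which this excerpt does not show. It only assumes that a mapping rule equates one precoordinated local term with a set of attribute/value pairs drawn from a global vocabulary. The names `MappingRule`, `RULES`, and `matching_terms`, and the sample terms, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingRule:
    """Equates one precoordinated local term with global attribute/value pairs."""
    local_term: str
    attributes: frozenset  # frozenset of (attribute, value) tuples


# Hypothetical rules: each descriptive name in a coarse-grained database is
# decomposed into the distinct fields a finer-grained database would store.
RULES = [
    MappingRule(
        "squamous cell carcinoma of lung",
        frozenset({("morphology", "squamous cell carcinoma"), ("location", "lung")}),
    ),
    MappingRule(
        "adenocarcinoma of colon",
        frozenset({("morphology", "adenocarcinoma"), ("location", "colon")}),
    ),
    MappingRule(
        "squamous cell carcinoma of skin",
        frozenset({("morphology", "squamous cell carcinoma"), ("location", "skin")}),
    ),
]


def matching_terms(query, rules=RULES):
    """Return local terms whose attributes satisfy every constraint in `query`.

    `query` is a dict of attribute -> value expressed in the global vocabulary,
    i.e. at the fine-grained (postcoordinated) level.
    """
    wanted = frozenset(query.items())
    # A rule matches when the query constraints are a subset of its attributes.
    return [rule.local_term for rule in rules if wanted <= rule.attributes]


if __name__ == "__main__":
    # A query on a single attribute finds every composite name that implies it.
    print(matching_terms({"morphology": "squamous cell carcinoma"}))
```

With rules of this kind, a query phrased in terms of individual attributes can be translated into the set of composite names to search for in a database that stores only descriptive names. The subset test is a simplification: it ignores hierarchy among values, which a real ontology-based mediator would need to consider.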
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
Publication date: 2014