Metadata Capture and Geospatial Records

Author

  • Elizabeth Perkes
Abstract

When the electronic records that you are trying to preserve are unique, complex, and storage-hungry, they quickly put an institution’s feet to the fire to come up with solutions. This has been the case for Utah, North Carolina, and Kentucky as we have grappled with the requirements of geospatial records in the grant-sponsored GeoMAPP project (http://www.geomapp.net). Much of what we have learned while studying geospatial records can be broadly applied to other types of electronic records. For instance, digitized images of the earth have preservation requirements similar to those of scanned documents, but with the added metadata needed to make sense of geospatial imagery. Geospatial data in the form of shapefiles or geodatabases also come with their own descriptive metadata, which must be captured along with the technical metadata and reused for access and preservation. This session will focus on the nature of this metadata and its commonalities with other types of electronic records, and we will share the specific strategies and tools that we are developing. One such tool is an application created by the Utah State Archives, called the APPX-based Archives Enterprise Manager (AXAEM). This platform- and database-independent open-source software is used to manage the entire workflow of the archives, and recent development has added the ability to ingest metadata of various types into the system and link it to the bibliographic data of series. A demonstration of this tool will be given.
Digital Geospatial Datasets and Their Metadata

When preserving geospatial datasets, archivists encounter the usual challenges associated with preserving born-digital objects, such as dependence on special software applications, the transfer and preservation of “authentic” or “trustworthy” digital artifacts, and the creation of an appropriate archival metadata record that keeps digital assets accessible and manageable into the future. Geospatial datasets are produced by geographical information systems (GIS), which combine graphical representations of geographical features with tabular data that store information related to those features. At one level, a GIS can be considered a sort of electronic map supplemented with an underlying database [1]. A GIS dataset for hospitals can hold the geographical point location of each hospital in a state, plus additional information associated with each hospital, such as its name, address, telephone number, emergency services, and number of beds (see Figures 1 and 2).

Figure 1: Esri ArcMap view of 3 datasets: North Carolina (N.C.) Hospitals (white dots), N.C. Airports (black dots), and 2001 N.C. Congressional Districts

Figure 2: Esri ArcCatalog view of data in the N.C. Hospitals dataset

Geospatial datasets are similar to other digital assets in that they are generally created by specialized application software, and specialized application software is also required to read or update existing geospatial datasets. In many cases, the format of the geospatial dataset is vendor specific and can only be read or written by tools provided by that software vendor. Some formats, such as Esri’s Shapefile format [2], have been published and have non-vendor-specific rendering tools available. However, geospatial data formats are more complex than most other common digital formats.
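The hospitals dataset described in this section pairs each point location with a row of descriptive attributes. The following Python sketch is one way to picture that pairing. It is only an illustration: the class name, field names, and the sample record are hypothetical and are not taken from the N.C. Hospitals dataset or from any GIS product.

```python
from dataclasses import dataclass, field


@dataclass
class PointFeature:
    """One map feature: a point location plus its row of tabular attributes."""
    longitude: float  # decimal degrees; illustrative value only
    latitude: float   # decimal degrees; illustrative value only
    attributes: dict = field(default_factory=dict)


# Hypothetical record: every value below is invented for illustration.
hospitals = [
    PointFeature(
        longitude=-78.64,
        latitude=35.78,
        attributes={
            "name": "Example General Hospital",
            "address": "100 Example St",
            "telephone": "555-0100",
            "emergency_services": True,
            "beds": 250,
        },
    ),
]


def point_locations(features):
    """Return the graphical side of the dataset as (longitude, latitude) pairs."""
    return [(f.longitude, f.latitude) for f in features]


def attribute_table(features):
    """Return the tabular side of the dataset, one dict per feature."""
    return [f.attributes for f in features]
```

Keeping the geometry and the attribute row in one record mirrors the "electronic map plus underlying database" description: the same feature can be drawn on a map or listed in a table.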
Unlike digitized document files, image files, and audio files, where the digital asset and its associated metadata are contained in a single file, geospatial datasets are often composed of numerous files and often have a separate, rich metadata file. The Federal Geographic Data Committee (FGDC) is a national committee that “promotes the coordinated development, use, sharing and dissemination of geospatial data on a national basis.”[3] The FGDC is tasked by Presidential Executive Orders to “develop procedures and assist in the implementation of a distributed discovery mechanism for national digital geospatial data.”[4] The FGDC has developed the Content Standard for Digital Geospatial Metadata (CSDGM), a rich metadata standard for describing geospatial data [5]. The CSDGM contains several subsections that include descriptive, technical, provenance, and administrative metadata elements, and it also specifies which metadata elements are required. In addition, the CSDGM defines fields to record the lineage and processing history of the dataset, which are also useful for informing provenance-related archival records.

[Archiving 2011 Final Program and Proceedings, p. 125]

Archivists have long advocated that metadata be created alongside the digital record. GIS software packages promote this best practice by offering interfaces in which GIS developers can create the metadata that describe their datasets. The GIS creator can fill in traditional metadata fields such as creator, date created, and abstract (see Figure 3a). The GIS software might even assist the GIS developer by automatically populating technical metadata fields such as the GIS software application name and version and the host operating system, which are important metadata elements for archivists and for the digital object’s future sustainability.
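Because a single geospatial dataset often spans several files plus a separate metadata file, an archive receiving a transfer may want to confirm that every part arrived. The Python sketch below groups a file listing by dataset and reports missing parts. It assumes the Shapefile format, whose mandatory components are the .shp, .shx, and .dbf files; the .prj projection file and the .shp.xml metadata sidecar are treated here as optional. The function names and the sample listing are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePath

# Suffixes are checked longest first so "name.shp.xml" is not mistaken for "name.xml".
SHAPEFILE_SUFFIXES = (".shp.xml", ".shp", ".shx", ".dbf", ".prj")
REQUIRED = {".shp", ".shx", ".dbf"}  # mandatory shapefile components


def group_shapefile_parts(filenames):
    """Map each (lowercased) dataset base name to the component suffixes found."""
    datasets = defaultdict(set)
    for name in filenames:
        lower = PurePath(name).name.lower()
        for suffix in SHAPEFILE_SUFFIXES:
            if lower.endswith(suffix):
                datasets[lower[: -len(suffix)]].add(suffix)
                break
    return dict(datasets)


def missing_parts(parts):
    """Return the mandatory suffixes that are absent, in sorted order."""
    return sorted(REQUIRED - parts)


# Hypothetical transfer listing, invented for illustration.
listing = [
    "hospitals.shp", "hospitals.shx", "hospitals.dbf",
    "hospitals.prj", "hospitals.shp.xml",
    "airports.shp", "airports.dbf",
]
grouped = group_shapefile_parts(listing)
for base, parts in sorted(grouped.items()):
    print(base, "missing:", missing_parts(parts))
```

A check of this kind treats the whole file group, including the metadata sidecar, as the unit of transfer rather than any single file.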
The software may also extract geospatial characteristics directly from the GIS dataset and populate the corresponding metadata fields, further increasing the reliability of the metadata and reducing human labor and the opportunity for human error. To promote the accessibility of the metadata, tools are available to export the metadata in a standard XML format (see Figure 3b), which can serve as a useful input for automating archival metadata production.

Figure 3a: Excerpt: GIS metadata for N.C. 2001 Congressional Districts dataset

Figure 3b: XML Excerpt: GIS metadata for N.C. 2001 Congressional Districts
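To illustrate how exported XML metadata could feed automated archival description, the Python sketch below pulls a few descriptive fields out of a CSDGM-style record using only the standard library. The element paths follow the CSDGM short names (idinfo, citeinfo, origin, pubdate, title, descript, abstract). The sample XML, its values, and the output field names are invented for illustration and are not the record shown in Figure 3b or part of any archival system described here.

```python
import xml.etree.ElementTree as ET

# Invented sample record using CSDGM short element names.
SAMPLE_XML = """<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Example State Agency</origin>
        <pubdate>2001</pubdate>
        <title>Example Congressional Districts</title>
      </citeinfo>
    </citation>
    <descript>
      <abstract>Illustrative abstract text.</abstract>
    </descript>
  </idinfo>
</metadata>"""

# Hypothetical mapping from archival field names to CSDGM element paths.
FIELD_PATHS = {
    "creator": "idinfo/citation/citeinfo/origin",
    "date": "idinfo/citation/citeinfo/pubdate",
    "title": "idinfo/citation/citeinfo/title",
    "abstract": "idinfo/descript/abstract",
}


def extract_archival_fields(xml_text):
    """Return a dict of selected fields; a field is None when its element is absent."""
    root = ET.fromstring(xml_text)
    record = {}
    for field_name, path in FIELD_PATHS.items():
        value = root.findtext(path)
        record[field_name] = value.strip() if value else None
    return record


record = extract_archival_fields(SAMPLE_XML)
print(record)
```

Returning None for absent elements lets a later step flag incomplete records for human review instead of failing on the first gap.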

Similar resources

Improving the geospatial consistency of digital libraries metadata

Consistency is an essential aspect of the quality of metadata. Inconsistent metadata records are harmful: given a themed query, the set of retrieved metadata records would contain descriptions of unrelated or irrelevant resources, and would not even contain some resources considered obvious. This is even worse when the description of the location is inconsistent. Inconsistent spatial desc...

A Study of the Reaction of Web Search Engines to Metadata Records Based on the Combined Method of Microdata and Linked Data

The purpose of this research was to find out the reaction of Web search engines to metadata records created with a combined method of Rich Snippets and Linked Data. 200 metadata records in two groups (100 records as the control group with the normal structure, and 100 records created based on microdata and implemented in RDF/XML as the experimental group) extracted from the information gatewa...

Describing Geospatial Assets in the Web of Data: A Metadata Management Scenario

Metadata management is an essential enabling factor for geospatial assets because discovery, retrieval, and actual usage of the latter are tightly bound to the quality of these descriptions. Unfortunately, the multi-faceted landscape of metadata formats, requirements, and conventions makes it difficult to identify editing tools that can be easily tailored to the specificities of a given project...

Semantic Integration of Geospatial Data from Earth Observations

We propose an approach to semantically enrich metadata records of satellite imagery with external data. As a result we are able to identify relevant images using a larger set of matching criteria. Conventional methods for annotating data sets are usually based on metadata records (with attributes such as title, provider, access mode, and spatiotemporal characteristics), which offer a narrow vi...

Design and Implementation of Wuhan Geospatial Information Sharing Platform

Geospatial metadata, data, and services have been widely collected, developed and deployed in recent years. This flourishing of geospatial resources also added to the problem of geospatial heterogeneity. Interoperability research and implementation are needed for advancement in potential solutions to integrate and interoperate these widely dispersed geospatial resources. We design and implement...

Publication date: 2011