Georgia Coastal Ecosystems Metadata System

The Georgia Coastal Ecosystems LTER Metadata Management System Summary

The Georgia Coastal Ecosystems (GCE) LTER site manages data and metadata using a comprehensive information system that currently includes a centralized relational database management server, custom network-enabled analytical software, and dynamic web applications. All project information is stored in relational databases, which are queried dynamically to produce rich metadata for data sets as well as a searchable web-based data catalog with links to related project information (publications, study sites, species lists, and personnel contact information). Metadata-based MATLAB software developed on site (the GCE Data Toolbox) is used to automate processing, validation, and quality control of primary data, and to create derived data products. Metadata and processing lineage information generated by this software during data processing are synchronized with the database whenever data sets are archived or revised, providing a complete record. Tabular data and pre-formatted text metadata are distributed to users in standard and user-customized ASCII and MATLAB formats via the web-based data catalog and the GCE Toolbox search client application. EML is generated on demand from the centralized database server via dynamic web applications. Support for GIS data will also be provided in the future.

LTER sites with similar management systems

  • None. GCE has an advanced and unique management system; however, the software and database designs developed at GCE have been openly shared with other LTER sites and the scientific community, and other sites can leverage them.

The Georgia Coastal Ecosystems LTER Metadata Management System Detailed

Most acquisition and processing of GCE data is currently done in MATLAB using the GCE Data Toolbox software. Metadata content is generated iteratively during data processing, with top-level information (title, abstract, personnel, study description, etc.) entered into a centralized SQL Server 2000 database using MS Access-based entry forms, or copied en masse from related data sets via a stored procedure and then revised. Data table and attribute metadata are generated automatically during data processing, or loaded from pre-defined metadata templates for the data source, and then manually edited using the GCE Data Toolbox metadata editor application. Additional metadata content, such as processing history (lineage), unit conversions, and calculations for derived fields, is generated automatically by the GCE Toolbox software during processing. Quality control is performed automatically based on detailed rules defined for each attribute (pre-defined in metadata templates or defined during analysis), with character flags assigned to individual values that meet rule criteria. Quality control can also be performed manually in a spreadsheet-like editor or on plots with the mouse, allowing automatically assigned flags to be edited and new flags to be assigned visually, with all operations logged to the processing history. After data processing is complete, data sets are registered and versioned in the database programmatically, at which time new and updated metadata content is synchronized back to the database server and distribution files are created and copied to the production web server. Editing and re-registering a data set automatically generates a new version of the data structure and distribution files, allowing prior versions to be retained (although only the most recent version is discoverable from the data catalog).
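The rule-based flagging and lineage logging described above can be illustrated with a minimal Python sketch. This is not the GCE Data Toolbox code (which is written in MATLAB) and does not reproduce its rule syntax; the class names, the flag characters "I", "Q" and "M", the thresholds, and the sample values are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class QCRule:
    """One quality control rule for an attribute (hypothetical structure)."""
    flag: str                      # single character assigned when the rule matches
    test: Callable[[float], bool]  # returns True when a value meets the rule criteria
    description: str               # human-readable text recorded in the lineage


@dataclass
class Column:
    """A data column with attribute-level rules and one flag string per value."""
    name: str
    values: List[Optional[float]]
    rules: List[QCRule]
    flags: List[str] = field(default_factory=list)


def apply_rules(col: Column, lineage: List[str]) -> None:
    """Assign character flags to values that meet rule criteria and log the step."""
    col.flags = []
    for v in col.values:
        if v is None:
            col.flags.append("M")  # assumed flag for a missing value
        else:
            col.flags.append("".join(r.flag for r in col.rules if r.test(v)))
    flagged = sum(1 for f in col.flags if f)
    rules = "; ".join(f"{r.flag}={r.description}" for r in col.rules)
    lineage.append(
        f"automatic QC on '{col.name}' ({rules}): "
        f"{flagged} of {len(col.values)} values flagged"
    )


def set_flag(col: Column, index: int, flag: str, lineage: List[str]) -> None:
    """Manually override one flag and log the edit to the processing history."""
    old = col.flags[index]
    col.flags[index] = flag
    lineage.append(f"manual QC on '{col.name}': row {index} flag '{old}' -> '{flag}'")


# Made-up sample values and thresholds, for illustration only.
lineage: List[str] = []
salinity = Column(
    name="Salinity",
    values=[28.4, 31.0, -1.2, None, 47.5],
    rules=[
        QCRule("I", lambda x: x < 0, "invalid, value below 0"),
        QCRule("Q", lambda x: x > 40, "questionable, value above 40"),
    ],
)
apply_rules(salinity, lineage)       # automatic pass driven by the attribute rules
set_flag(salinity, 1, "Q", lineage)  # manual edit, also written to the lineage
```

Keeping the flags in a list parallel to the values, and appending every automatic or manual operation to a single lineage list, mirrors the idea that the processing history travels with the data set until it is synchronized to the database.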

The GCE Data Toolbox software is publicly available online (in compiled form for non-GCE members), allowing MATLAB users to extensively customize, analyze and visualize GCE-distributed data sets and create their own derived data products with complete metadata. The entire catalog of primary and ancillary GCE data can be searched using an included search engine client, then automatically retrieved from GCE servers for local analysis. ClimDB/HydroDB and USGS NWIS data can also be queried and retrieved, then analyzed and integrated with GCE data in real time.

EML Status:

  • Completion: Metadata for all primary GCE data are available in EML (ESA-FLED text metadata provided for ancillary data)
  • Richness: All harvested EML is rich, attribute-level metadata (level 5)
  • QA/QC: Files are revised and quality controlled systematically with automated and manual checks, documented in the metadata
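
As a rough illustration of generating attribute-level EML on demand from a relational database, the following Python sketch queries an in-memory SQLite database and serializes a simplified fragment. It is a sketch under stated assumptions, not the GCE web application: the table and column names and the sample rows are hypothetical, SQLite stands in for the SQL Server database, and the output omits the namespaces and many elements that a schema-valid EML document requires.

```python
import sqlite3
import xml.etree.ElementTree as ET


def eml_fragment(conn: sqlite3.Connection, dataset_id: int) -> str:
    """Query one data set and its attributes, then build a simplified EML-like fragment."""
    (title,) = conn.execute(
        "SELECT title FROM dataset WHERE id = ?", (dataset_id,)
    ).fetchone()

    dataset = ET.Element("dataset")
    ET.SubElement(dataset, "title").text = title
    table = ET.SubElement(dataset, "dataTable")
    ET.SubElement(table, "entityName").text = title
    attr_list = ET.SubElement(table, "attributeList")

    # One <attribute> element per column of the data table, in column order.
    rows = conn.execute(
        "SELECT name, definition FROM attribute "
        "WHERE dataset_id = ? ORDER BY position",
        (dataset_id,),
    )
    for name, definition in rows:
        attr = ET.SubElement(attr_list, "attribute")
        ET.SubElement(attr, "attributeName").text = name
        ET.SubElement(attr, "attributeDefinition").text = definition

    return ET.tostring(dataset, encoding="unicode")


# Hypothetical schema and made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataset (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE attribute (dataset_id INTEGER, position INTEGER,
                            name TEXT, definition TEXT);
    INSERT INTO dataset VALUES (1, 'Example sonde deployment');
    INSERT INTO attribute VALUES (1, 1, 'Date', 'Date of observation');
    INSERT INTO attribute VALUES (1, 2, 'Salinity', 'Practical salinity');
""")
xml_text = eml_fragment(conn, 1)
print(xml_text)
```

Because the XML is built from a query at request time, any metadata revision stored in the database appears in the next generated document without a separate export step.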