The Palmer LTER data and metadata management approaches

The Palmer LTER (PAL) site has designed a comprehensive information management environment centering on an information system called DataZoo. This system's architecture supports data aggregation, description, and interoperability. DataZoo is coded primarily in PHP, using an object-oriented design that focuses on code reusability. The backend is a MySQL relational database, while the frontend web interfaces for ingesting data and creating metadata, as well as for querying, viewing, plotting, editing, and downloading data, are enhanced with JavaScript and Ajax. XSLT style sheets are used to create easily readable versions of XML documents, particularly EML standard metadata.
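To illustrate what "an easily readable version" of attribute-level metadata can mean, here is a minimal sketch. It is a stand-in written in Python with the standard library, not an XSLT style sheet and not DataZoo code. The XML fragment is a simplified, namespace-free excerpt that borrows EML element names; its sample values and the `render_attributes` function are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified fragment using EML element names. It is NOT a complete or
# schema-valid EML document; the sample values are invented.
EML_FRAGMENT = """
<dataTable>
  <entityName>Example table</entityName>
  <attributeList>
    <attribute>
      <attributeName>temperature</attributeName>
      <attributeDefinition>Water temperature</attributeDefinition>
    </attribute>
    <attribute>
      <attributeName>depth</attributeName>
      <attributeDefinition>Sampling depth</attributeDefinition>
    </attribute>
  </attributeList>
</dataTable>
"""


def render_attributes(xml_text):
    """Return a plain-text summary: table name, then one line per attribute."""
    root = ET.fromstring(xml_text)
    lines = [root.findtext("entityName", default="(unnamed)")]
    for attribute in root.iter("attribute"):
        name = attribute.findtext("attributeName", default="?")
        definition = attribute.findtext("attributeDefinition", default="")
        lines.append(f"  {name}: {definition}")
    return "\n".join(lines)
```

Calling `render_attributes(EML_FRAGMENT)` yields the table name followed by an indented line for each column, which is the kind of human-oriented view a style sheet would produce from the same metadata.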

LTER sites with similar management approaches

  • clone system: CCE
  • some similarities: SEV, MCM

The approach details

PAL joined CCE in the SIO Ocean Informatics endeavor with the goals of 1) creating a process for the design of a community information system, 2) partnering with users and social scientists as codesigners, and 3) developing local awareness and understanding of information infrastructure.

DataZoo provides a number of services and tools to users. Users may browse and view datasets, produce basic plots, and download data (as CSV text files) and metadata (as EML standard XML files). Participants from the local community may also upload and review data, perform quality assurance, and manage metadata. PHP scripts facilitate quality control of the data, and code developed using the Yahoo JavaScript API serves dynamic web forms for the entry process. In addition, web interfaces provide three types of query capabilities: conditional, multi-dataset, and saved.

Data integration is enabled by a trio of elements: a project-study-dataset relations architecture, a set of shared dictionaries (unit, attribute, and qualifier), and metadata description to the column level. JpGraph provides dynamic graphs as results of user-driven queries of the data and metadata. PAL provides EML to the attribute level, with enhancements to support synthesis efforts. A two-tier user privilege system distinguishes community users from public users.
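The trio of integration elements described above can be pictured with a small relational sketch. This is an illustration only: the table names, column names, and sample rows are hypothetical and are not DataZoo's actual schema. It uses Python's built-in sqlite3 module instead of PHP and MySQL so that it runs without a database server.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not DataZoo's real tables.
SCHEMA = """
CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE study (
    id INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL REFERENCES project(id),
    name TEXT NOT NULL
);
CREATE TABLE dataset (
    id INTEGER PRIMARY KEY,
    study_id INTEGER NOT NULL REFERENCES study(id),
    title TEXT NOT NULL
);
-- Shared dictionary: every dataset column points at one agreed unit entry.
CREATE TABLE unit_dictionary (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
-- Column-level metadata: one row per dataset column.
CREATE TABLE attribute (
    id INTEGER PRIMARY KEY,
    dataset_id INTEGER NOT NULL REFERENCES dataset(id),
    column_name TEXT NOT NULL,
    unit_id INTEGER NOT NULL REFERENCES unit_dictionary(id)
);
"""


def build_demo_db():
    """Create an in-memory database populated with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO project VALUES (1, 'Demo project')")
    conn.executemany("INSERT INTO study VALUES (?, ?, ?)",
                     [(1, 1, "Study A"), (2, 1, "Study B")])
    conn.executemany("INSERT INTO dataset VALUES (?, ?, ?)",
                     [(1, 1, "Dataset A1"), (2, 2, "Dataset B1")])
    conn.executemany("INSERT INTO unit_dictionary VALUES (?, ?)",
                     [(1, "celsius"), (2, "meter")])
    conn.executemany("INSERT INTO attribute VALUES (?, ?, ?, ?)",
                     [(1, 1, "temperature", 1),
                      (2, 1, "depth", 2),
                      (3, 2, "temperature", 1)])
    return conn


def datasets_sharing_unit(conn, unit_name):
    """Find every (dataset title, column) that uses the given dictionary unit."""
    return conn.execute(
        """
        SELECT d.title, a.column_name
        FROM attribute a
        JOIN dataset d ON d.id = a.dataset_id
        JOIN unit_dictionary u ON u.id = a.unit_id
        WHERE u.name = ?
        ORDER BY d.title
        """,
        (unit_name,),
    ).fetchall()
```

Because both sample datasets reference the same dictionary entry for their temperature column, a single query can locate comparable columns across studies. That is the general idea behind combining shared dictionaries with column-level metadata to support multi-dataset queries.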

EML Status:

  • Completion: A fraction of the metadata is available in EML. The first release of DataZoo occurred in summer 2007; population is ongoing.
  • Richness: All the harvested EML is rich, at the attribute level (level 5).
  • QA/QC: Files are initially quality controlled through data ingestion templates. Data may be further quality controlled manually and revisions re-uploaded.
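
Template-based quality control at ingestion might look roughly like the sketch below. It assumes a hypothetical template that lists each expected column with a type and, for numeric columns, an allowed range. The template layout, column names, limits, and the `check_csv` function are invented for illustration and are not taken from DataZoo.

```python
import csv
import io

# Hypothetical ingestion template: expected columns, types, numeric ranges.
TEMPLATE = {
    "station": {"type": str},
    "temperature": {"type": float, "min": -5.0, "max": 40.0},
}


def check_csv(text, template=TEMPLATE):
    """Return a list of (row_number, message) problems found in CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [column for column in template if column not in header]
    if missing:
        return [(1, "missing columns: " + ", ".join(missing))]

    problems = []
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        for column, rule in template.items():
            raw = row[column]
            if rule["type"] is float:
                try:
                    value = float(raw)
                except (TypeError, ValueError):
                    problems.append((row_number, f"{column}: not a number: {raw!r}"))
                    continue
                if not (rule["min"] <= value <= rule["max"]):
                    problems.append((row_number, f"{column}: {value} out of range"))
            elif not raw:
                problems.append((row_number, f"{column}: empty value"))
    return problems
```

A file that passes returns an empty list. Anything flagged could then be corrected manually and the revision uploaded again, matching the two-stage process described in the QA/QC item.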