
Mission

The primary goal of data management is to provide long-term storage and maintenance of the Shortgrass Steppe LTER data, as well as access to the data by LTER scientists, students, and the public. The design of our archival procedures, relational database management system, and web-based data access system is oriented toward achieving this goal. In providing access to SGS LTER data for scientists at Colorado State University, we are also considering the needs of scientists worldwide who access the data. The second goal of data management is to assist LTER scientists in analyzing the data and using the data in modeling activities.

Overview

Information management ideally starts before data collection begins at the site. Communication between researchers and the information management (IM) team begins with project initiation. The IM team remains involved during data collection, verification, entry, QA/QC, archival, and publication (Brunt 2000). Currently, additional data are being downloaded directly to our database from data loggers in the field, which shortens the time between data collection and entry. After digital data pass quality assurance, they are transferred to the SGS-LTER Relational Database Management System (RDBMS) residing on the SGS-LTER server, where they become accessible to the public through our website (http://sgs.cnr.colostate.edu/Data/DataLibrary.htm).

In collaboration with the Agricultural Research Service (ARS), we have developed a web-based tool for researchers to contribute metadata to the database. An online form allows researchers to enter metadata directly into the SGS-LTER Access RDBMS. End users may query the database for project information dating back to 1940. We will continue to develop and refine these tools as we participate in developing metadata content standards across the Network (please see http://sgs.cnr.colostate.edu/ars).

Meteorological data from the site are transferred to Colorado State University by modem each night. The data are then processed by a data "filter" that verifies that the data values fall within reasonable ranges. Where errors occur in the data stream, the filter reports the errors and replaces the affected values with missing value codes.
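As an illustration only, a range-checking filter of the kind described above might look like the following Python sketch. The variable names, the plausible ranges, and the missing value code (-9999.0) are assumptions made for this example; they are not the values used by the SGS LTER filter.

```python
from typing import Dict, List, Tuple

# Hypothetical missing value code; the code actually used at the site is not stated.
MISSING = -9999.0

# Hypothetical plausible ranges for a few illustrative variables.
RANGES: Dict[str, Tuple[float, float]] = {
    "air_temp_c": (-50.0, 50.0),
    "rel_humidity_pct": (0.0, 100.0),
    "wind_speed_ms": (0.0, 75.0),
}


def filter_record(record: Dict[str, float]) -> Tuple[Dict[str, float], List[str]]:
    """Check one record against RANGES.

    Returns the cleaned record and a list of error messages. Values outside
    their range (including NaN) are replaced with MISSING. Variables that
    have no configured range are passed through unchanged.
    """
    cleaned: Dict[str, float] = {}
    errors: List[str] = []
    for name, value in record.items():
        bounds = RANGES.get(name)
        if bounds is None:
            cleaned[name] = value
            continue
        low, high = bounds
        if low <= value <= high:
            cleaned[name] = value
        else:
            cleaned[name] = MISSING
            errors.append(f"{name}: value {value} outside [{low}, {high}]")
    return cleaned, errors
```

Calling filter_record on each nightly record yields both the cleaned values for loading and an error report for the information manager to review.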

Information management for the SGS LTER project has traditionally relied upon flat ASCII files to store data, ASCII files to describe the data (metadata files), and locally written programs to access, view, plot, and download the data. In 1996, however, all of the SGS LTER data and metadata were loaded into an Access relational database. This database allows information management personnel and scientists to query any part of the database via our web site. ASCII text versions of these datasets are also still available. Our information management strategy is based upon two basic tenets for data management at the SGS LTER: (1) data are to be maintained in ways that ensure that the data will be accessible several decades from now, and (2) the analysis of data is to be conducted by the LTER scientists. The role of the information management staff is to see to it that data are properly recorded, transcribed, documented, and stored to meet these two goals.

The scope of information management activities at the SGS LTER has been expanding over the last few years. GIS data are being used extensively for some studies; synthetic analyses and regional-scale extrapolations continue to be of interest to the LTER scientists; and collaborative efforts for exchanging data between LTER sites have been initiated (Burke and Lauenroth 1993, Lauenroth et al. 1993). As a result of the 1993 site review, we have critically re-examined our current data management plan.

The Current Data Base System

The LTER database is an Access relational database that runs on a Microsoft server housed within the Natural Resource Ecology Laboratory at Colorado State University. These data are the primary data for the project and consist of the original verified field observations. The LTER database is backed up to 8-mm cassettes as part of the standard backup procedure for the local network. In addition, database backup files are created weekly. The information manager retains one copy of the data, and the PI keeps another copy.

The Access database contains field data from experiments and monitoring studies conducted at Shortgrass Steppe field sites. These data include observations collected prior to the start of the LTER project, primarily from the International Biological Program's (IBP) Pawnee site. Each data file should have an associated file that describes the format of the data, the name of the investigator responsible for the data, the methods used for collecting the data, problems encountered during collection, and other pertinent information. Such documentation of a data set is often called metadata, and it is essential if the data are to be used in the future. Many of the data sets collected under the IBP do not yet have adequate data description files. We are adding these descriptions to the database as time permits.
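The items listed above can be pictured as a simple record with a completeness check, which is one way to find data sets whose descriptions are still inadequate. The following Python sketch is hypothetical: the class names, field names, and the choice of which items count as required are assumptions for illustration and do not reflect the actual SGS-LTER database schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldDescription:
    """Describes one variable (column) in a data file."""
    name: str
    units: str
    definition: str


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record for one data file."""
    title: str
    investigator: str
    methods: str
    file_format: str
    fields: List[FieldDescription] = field(default_factory=list)
    problems: str = ""   # problems encountered during collection, if any
    notes: str = ""      # other pertinent information

    def missing_items(self) -> List[str]:
        """Return the names of required items that are still empty."""
        required = ("title", "investigator", "methods", "file_format")
        missing = [name for name in required if not getattr(self, name).strip()]
        if not self.fields:
            missing.append("fields")
        return missing
```

A record whose missing_items() list is empty has at least the minimum description; anything else can be queued for follow-up with the responsible investigator.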

The investigators associated with the LTER project have offices in many different buildings across the campus. In addition, the LTER database is used by scientists across the nation. We have developed tools for accessing the data in the LTER database from anywhere that has access to the internet. The system is based on a series of WWW forms that allow the user to query and download specific datasets using active server pages. Our user-friendly website provides an interactive interface to the database.

LTER information management also includes the maintenance of a bibliographic database for publications related to the LTER project and the SGS site. The list of publications is updated annually, printed and distributed. The bibliography is also maintained on the network. Recently we have begun to investigate the feasibility of providing bi-directional links between entries in the bibliographic database and the metadata files for the data used in the reference. The SGS LTER bibliographic database is searchable by author, keyword, year, and publication type.

Inter-site Information Management Activities

The LTER data managers agreed at the July 1993 Data Managers' meeting to pursue methods to facilitate the exchange of data between the sites. The strategy to be employed is to use a common metadata format to describe the data being transferred. Current efforts at the LTER Network level include the development of a Network Information System that will facilitate synthetic uses of common LTER datasets.

SGS-LTER participates in the following Network Information System (NIS) modules, which are maintained by the Network Office (http://lternet.edu/data): DTOC (Data Table of Contents), the Personnel database, CLIMDB (All Site Climate Database), ANPP (All Site Annual Net Primary Production), and the All Site Bibliography.

As we progress through the new millennium, the “Decade of Synthesis and Standardization” of metadata continues to be a pressing issue at both the site and Network level (Stafford, personal communication; Baker et al. 2000). The information management team is also actively involved in developing a “content standard” for metadata within the LTER community. This new tool, called Ecological Metadata Language (EML), will simplify data access (http://caplter.asu.edu/data/metadata/workshop012002.htm).

Policies for Data Management

Definitions

Restricted Access

Restricted access limits access to a data set to the investigator responsible for its collection. The investigator can provide the data to others, but he or she will have the authority to make the decision to approve or deny access to the data.

Open Access

Open access permits anyone who so desires to access and use a set of data without requiring permission from the investigator. However, all users of SGS LTER data must notify the data manager and acknowledge the SGS LTER in any publications that result from the data.

Long Term Data Set

A long-term data set is one that spans more than three years and in which the data are clearly part of an ongoing study.

Data Access Review Committee

The data access review committee is envisioned as consisting of one LTER PI, one non-LTER scientist, and one graduate student. If a question arises regarding an extension request, a scientist from the field of study related to the data set in question shall be consulted to help evaluate the request.

Metadata

Metadata is the information that describes a data set, including specific features of the file format, units for variables (fields), and definitions for fields, plus general information describing the experimental design, study site, etc.

Access and Proprietary Rights

1.     Metadata will be accessible to all (open access).

2.     There is a 3-year interval from the end of field work during which data will be maintained as restricted access for short-term studies. After this 3-year period, the data shall change to open access. If the investigator wishes to keep data access restricted for longer than 3 years, the investigator must provide a written request annually to justify the restricted access extension. The data access review committee will review the requests and recommend whether restricted access should be continued. In the absence of a written request the data will be assigned to the open access category. Publication of the data should weigh against continuing restricted access, but factors such as a long period from acceptance to publication should be weighed in favor of maintaining restricted access. The data access review committee shall meet on an annual basis to review written justifications which request the extension of restricted access on particular data sets.

3.     Six years after the end of field work, data will be placed in the open access category unless there are valid extenuating circumstances to justify further retention in the restricted access class. More leniency should be given for maintaining restricted access on long-term data sets.

4.     Long-term data sets should be subject to a "moving window" in the access policy, such that after year 4, or following publication of the investigator's results, the first year's data are liable for reclassification to open access unless a request for maintaining restricted access is made.
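The timing rules for short-term studies can be summarized in a small decision function. The following Python sketch is an interpretation, not an official implementation: the function and parameter names are hypothetical, elapsed time is counted in whole years from the end of field work, and the committee's judgment is reduced to two boolean flags. The "moving window" for long-term data sets is not modeled.

```python
from datetime import date


def whole_years_between(start: date, end: date) -> int:
    """Number of complete years elapsed from start to end."""
    years = end.year - start.year
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


def access_status(
    field_work_end: date,
    today: date,
    extension_approved: bool = False,
    extenuating_circumstances: bool = False,
) -> str:
    """Return "restricted" or "open" for a short-term data set.

    extension_approved: a written request was filed this year and the
        committee recommended continuing restricted access (assumption).
    extenuating_circumstances: valid circumstances justify restriction
        beyond six years (assumption).
    """
    years = whole_years_between(field_work_end, today)
    if years < 3:
        return "restricted"
    if years < 6:
        return "restricted" if extension_approved else "open"
    return "restricted" if extenuating_circumstances else "open"
```

For example, a data set whose field work ended on 30 September 1998 would be restricted on 1 June 2000, open on 30 September 2001 unless an extension had been approved, and open after 30 September 2004 unless extenuating circumstances applied.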

Quality control

1.     Metadata provided by a researcher will require review and a recommendation for acceptance by another scientist. The researcher will be asked to provide the names of potential reviewers.

2.     A mechanism for eliciting comments on the content of metadata will be established in order to provide additional feedback for improving quality.

3.     A standardized list of keywords for describing a data set in the metadata will be prepared and used when appropriate. These keywords may be a subset of the keywords used in the bibliography.

4.     Rules for developing species codes will be defined. Existing species codes could be listed as well. We will consider using species codes developed for use with the SCS, BLM, or the Plant Information Network (PIN).

5.     Maps will be associated with metadata for field experiments, showing locations of experimental areas.

6.     A standard for stakes and marking plates used in the field will be developed.

Responsibilities of the Information Management Staff and Scientists

1.     Data should be turned in within 3 months after the end of an experiment or field season, with exceptions being granted due to factors such as sample analysis delays.

2.     In order to keep track of the expected date for submission of data, a simple form will be created and distributed along with the ARS and PNG form by which investigators request permission to use the site. The form should also be submitted prior to conducting greenhouse or laboratory experiments for the LTER program.

3.     Non-LTER scientists who collect data relevant to the LTER project should also be encouraged to submit data for inclusion in the database. If the data cannot be obtained, such scientists should be requested to submit metadata describing their data sets so that a record is kept of their research.

4.     Priorities for submission of data and entry into the system will be set by the information management staff. The staff is also responsible for informing the scientists if other priorities will delay getting the data into the system in a timely fashion.

5.     Material will be prepared to hand to scientists who are contemplating research under LTER and also to be used prior to starting laboratory or field experiments. The material will describe the purpose of data management, the responsibilities of the scientists and data management staff, standards for conventions such as keywords, species lists, and location data, and forms for preparing metadata.

6.     The responsibilities of the information management staff, in order of priority, should be (1) to ensure that data are placed in the data management system on a timely basis and that the data undergo quality control checks, (2) to make the data accessible to the appropriate people, depending on access status, and (3) to provide scientists with analytical support at the level of providing data in a form that can be used by others, plus help with preparing graphics or other non-technical analyses. Tasks such as statistical analyses are not to be directly supported.

References

Baker, K.S., B.J. Benson, D.L. Henshaw, D. Blodgett, J.H. Porter, and S.G. Stafford. 2000. Evolution of a multisite network information system: The LTER information management paradigm.  BioScience 50: 963-978.

Brunt, J.W. 2000. Data management principles, implementation, and administration. Pp. 25-47 in W.K. Michener and J.W. Brunt (eds.). Ecological Data: Design, Management and Processing. Blackwell Science Ltd., Oxford, UK.

Burke, I.C. and W.K. Lauenroth. 1993. What do LTER results mean? Extrapolating from site to region and decade to century. Ecol. Mod. 67:49-80.

Lauenroth, W.K., D.L. Urban, D.P. Coffin, W.J. Parton, H.H. Shugart, T.B. Kirchner, and T.M. Smith. 1993. Modeling vegetation structure-ecosystem process interactions across sites and ecosystems. Ecol. Mod. 67:49-80.

Last Updated February 28, 2002


To contact us, please email: Sallie Sprague  (Sallie.Sprague@colostate.edu)