Section 5. Data Management

The primary goals of data management at SGS LTER are to provide secure, long-term data storage and to make high-quality data easily accessible to LTER scientists and the public in a timely fashion. We achieve these goals through three techniques: an organized and efficient data tracking system, secure data archiving procedures, and full use of our SGS World Wide Web (WWW) site.

Database Management Policies and Procedures

Data Tracking System

With our data tracking system (Fig. 5.1), data management starts long before data are collected. Investigators must submit a request form to the data manager as part of the procedure to conduct research. The form contains specific information about the study, which the data manager logs into the tracking system. After entering this preliminary project information, the data manager tracks the progress of the dataset from data collection through to availability on the SGS WWW site. The SGS data tracking system increases the efficiency of data entry, improves project-wide awareness of the scope of research activities at any given time, and improves overall data quality by providing consistency among datasets through standardized data collection forms.

Data Delivery and Verification

The data manager maintains contact with the investigator throughout the project, and data are submitted to the data manager no later than three months after the end of the study. For experiments that use the LTER field crew, data are submitted directly to the data manager. The data manager then checks each dataset against its data description to ensure that the actual data and the metadata are consistent. Immediately following verification, the data manager updates the database, storing each dataset as a separate table and relating it to its metadata via a dataset ID code.
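The verification step can be illustrated with a minimal sketch. It assumes, purely for illustration, that a dataset arrives as comma-separated text with a header row and that the data description supplies a list of column names; the function name `find_inconsistencies` and this layout are hypothetical and do not describe the actual SGS software.

```python
import csv
import io


def find_inconsistencies(data_text, described_columns):
    """Return a list of mismatches between a dataset and its data description.

    data_text: comma-separated text whose first line is a header row (assumed layout).
    described_columns: column names listed in the metadata (assumed layout).
    """
    reader = csv.reader(io.StringIO(data_text))
    header = next(reader)
    problems = []

    # Columns promised by the metadata but absent from the data.
    missing = [c for c in described_columns if c not in header]
    if missing:
        problems.append("described but not in data: " + ", ".join(missing))

    # Columns present in the data but never described.
    extra = [c for c in header if c not in described_columns]
    if extra:
        problems.append("in data but not described: " + ", ".join(extra))

    # Every data row should have exactly one value per header column.
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, found {len(row)}"
            )
    return problems
```

An empty list means the dataset and its description agree on these simple checks; a real verification would also compare units, value ranges, and codes.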

Data Archiving

The data manager archives the data and metadata by storing them together in one ASCII text file per dataset. Using this format for data storage ensures readability over the long term. Datasets are stored redundantly via the following four methods to provide security against accidental data loss or destruction:

  1. hard disk on a Sun workstation (daily)
  2. system level back-ups on high-density 8-mm cartridges (daily)
  3. SGS archives on high-density 8-mm cartridges (quarterly)
  4. copies of the original data forms, to be microfilmed
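As a rough illustration of storing data and metadata together in one ASCII text file, the sketch below builds such a file as a string. The section markers (`DATASET_ID`, `BEGIN_DATA`, `END_DATA`) and the function name are assumptions made for this example; the actual SGS archive layout is not specified here.

```python
def build_archive_text(dataset_id, metadata, header, rows):
    """Combine metadata and data rows into one ASCII text block.

    The layout (KEY: value lines, then a delimited data section) is a
    hypothetical format chosen for illustration only.
    """
    lines = [f"DATASET_ID: {dataset_id}"]
    for key, value in metadata.items():
        lines.append(f"{key.upper()}: {value}")
    lines.append("BEGIN_DATA")
    lines.append(",".join(header))
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    lines.append("END_DATA")
    text = "\n".join(lines) + "\n"
    # Fail loudly if anything is not plain ASCII, since long-term
    # readability depends on the file staying pure ASCII.
    text.encode("ascii")
    return text
```

Keeping the dataset ID on the first line means the link between a file and its database record survives even if the file is separated from the database.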

Data Access

The metadata for all datasets are made publicly accessible as soon as they are received from the investigator. Datasets become available to the public three years after the end date of the study or upon the investigator's publication of study results, whichever comes first. Investigators are notified when data are scheduled to go public and may at that point submit a written request to extend the access restriction on their data. A committee reviews the request and determines whether the data may remain limited access and for how long.
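The release rule above amounts to taking the earlier of two dates. The following sketch shows one way to compute it; the function names are hypothetical, and committee-approved extensions are deliberately not modeled.

```python
from datetime import date


def add_years(d, years):
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year.
        return d.replace(year=d.year + years, day=28)


def public_release_date(study_end, publication_date=None, embargo_years=3):
    """Earliest of: study end plus three years, or the publication date if known."""
    embargo_end = add_years(study_end, embargo_years)
    if publication_date is None:
        return embargo_end
    return min(embargo_end, publication_date)
```

For a study ending on 30 September 1995 with no publication yet, `public_release_date(date(1995, 9, 30))` gives 30 September 1998.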

SGS WWW Site

Once a dataset is verified, the metadata will be automatically posted to our SGS WWW site. Our goal is to use the SGS web site as the primary mode of information dissemination for SGS scientists, the scientific community, and the public. (The data access portion of our WWW site is currently being developed. We expect it to be operational by March 1, 1996. The rest of the site is fully functional now.)

The design of our web site will enable scientists everywhere, as well as the public, to easily access our on-line data library (see Tables 1.2, 1.3, and 1.4 for lists of the datasets currently managed by SGS-LTER). We have developed a standardized list of keywords that forms the foundation for searching the SGS data library on our WWW site. A user may either travel through the hierarchy of research categories to locate datasets of interest or use a keyword search. When viewing the data, the user has the following options:

  1. data files containing the actual rows and columns of data
  2. the metadata text, describing the study and its datasets
  3. graphs of the data, generated "on the fly" as the user queries the dataset
  4. graphic images associated with datasets, such as experiment designs, maps, and photos of the study site.
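The two ways of locating datasets, browsing the category hierarchy and searching by keyword, can be sketched as follows. The catalog entries, dataset IDs, category names, and function names are invented for illustration and are not taken from the SGS data library.

```python
# Hypothetical catalog: each dataset has a category path and standardized keywords.
CATALOG = [
    {"id": "ds001", "category": ("plants", "production"), "keywords": {"biomass", "grazing"}},
    {"id": "ds002", "category": ("plants", "phenology"), "keywords": {"flowering"}},
    {"id": "ds003", "category": ("mammals", "populations"), "keywords": {"grazing", "density"}},
]


def search_by_keyword(catalog, keyword):
    """Return IDs of datasets tagged with the keyword (case-insensitive)."""
    wanted = keyword.strip().lower()
    return [entry["id"] for entry in catalog if wanted in entry["keywords"]]


def browse(catalog, *path):
    """Return IDs of datasets whose category path starts with the given levels."""
    return [entry["id"] for entry in catalog if entry["category"][: len(path)] == path]
```

Here `search_by_keyword(CATALOG, "Grazing")` finds datasets in two different categories, while `browse(CATALOG, "plants")` returns everything under one branch of the hierarchy. A standardized keyword list is what makes the exact-match lookup reliable.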

We have recently assembled a comprehensive species list that is available on our home site. The list includes every species found at the site and is broken down into seven categories: plants, birds, mammals, arthropods, microarthropods, nematodes, and herpetiles. For a more complete description of our data management policies and procedures, please see our data management policy posted on our Web site.

Data Management Software

In the past, our data management system relied upon in-house software and programming that provided leading-edge technology in the field. Given the rapid improvement of commercial database software, we have re-evaluated the efficiency of in-house software development and are migrating to a relational database system (ORACLE). As we write this, we are moving our data into ORACLE and working to connect the ORACLE database directly to the WWW site using ORACLE HTML tools. This custom-tailored system and its associated applications will allow us to meet the specific needs of SGS scientists.

We are very excited about the future of our data management system and our ability to maximize the utility of our relational database management system (RDBMS) to our scientists and the public. We envision a state-of-the-art RDBMS capable of providing high-quality, long-term data storage and enhanced access to these data through a dynamic link to the SGS WWW site. Our web site will no longer store static data files, but will allow visitors to execute dynamic, on-the-fly data requests and analyses. By achieving a state-of-the-art RDBMS, we will contribute greatly to the long-term success of SGS as an outstanding ecological research site well into the next century.

Data Management Personnel

The Data Manager is a full-time position that includes the following responsibilities:

  1. working with scientists during experimental design and planning
  2. managing the data tracking system
  3. managing the database system
  4. updating and maintaining the SGS WWW site
  5. archiving datasets for long-term maintenance and storage
  6. developing and maintaining user-friendly applications
  7. assisting scientists with any data management/data access issues.

The GIS Data Manager is a half-time position that includes the following responsibilities:

  1. GIS data acquisition and management
  2. spatial analysis consultation
  3. project analysis
  4. map generation.

Undergraduate students provide additional staffing.

GIS Data Management and Research Support

Management of GIS spatial data and metadata supports four functions:

  1. daily data management
  2. research support
  3. network access for local and remote users
  4. long-term data archive.

Table 1.3 lists the GIS data managed by the SGS LTER project.

GIS Data Management

Daily GIS management covers the collection of new data, the extension of existing spatial data, and the maintenance of metadata. Expansion of the SGS to include the Pawnee National Grasslands (PNG) allows us to acquire more ecologically complete landscape-level data. Data new to the SGS study area include: (1) prairie dog town locations, (2) swift fox locations, (3) plant communities and associated range site descriptions, and (4) land use and Conservation Reserve Program (CRP) treatments.

We use an extended ARC/INFO data library structure for analysis and daily management of spatial data and metadata (Fig. 5.3). These data are then made available across the WWW in several formats to accommodate the needs of investigators. Since many users simply wish to view the data, map views stored in a Map Atlas can be viewed in raster format and downloaded in black-and-white or color PostScript format for local printing of high-quality graphics.

A new method for access and retrieval of historical field study sites is now being adopted at SGS. The new format stores each study location as a polygon in the Study Site library layer. It will allow scientists and data managers to more easily identify past and ongoing research based on plant or animal species, soil keywords (where appropriate), researcher names, dates of study, and geographic proximity. This structure will form a link between the GIS data library and the field data in the data management system.
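To show how storing each study location as a polygon supports combined attribute and location queries, here is a minimal sketch using a standard ray-casting point-in-polygon test. The site records, field names, and function names are hypothetical, and containment of a point stands in for the broader notion of geographic proximity; in practice such queries would run inside the GIS.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as a list of (x, y) vertices?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_sites(sites, x, y, species=None):
    """Return names of study sites containing (x, y), optionally filtered by species."""
    matches = []
    for site in sites:
        if species is not None and species not in site["species"]:
            continue
        if point_in_polygon(x, y, site["polygon"]):
            matches.append(site["name"])
    return matches
```

The same pattern extends to the other attributes mentioned above, such as researcher names or dates of study, by adding further filters before the spatial test.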

GIS metadata at our site conform to the Content Standards for Digital Geospatial Metadata (Federal Geographic Data Committee, 1994). Approximately 75 percent of the metadata elements in this standard are appropriate to and used at our site, and approximately 20 percent of these are required elements. This information is currently stored in relational database tables and is accessible for internal use and maintenance. Text output files are made available for outside and network users. For new and recent data layers, the required metadata elements are complete. Metadata for spatial data that predate the standard, although well documented, may never have all of the required elements we currently collect.

Research Support

GIS research analysis is conducted primarily using ARC/INFO and IMAGINE software. These analyses range from plant-level scanning and analysis of root characteristics, to plot-level identification of plant growth and mortality, to landscape-level assessments of nutrient run-off.

Network Access

Prior to the advent of WWW Internet viewers, we supported machine- and software-independent views of our SGS Map Atlas through on-line map images. These map images could be viewed within the Colorado State University network using Unix-based non-GIS viewing tools, or transferred to remote locations via file transfer protocol for viewing. This served primarily as a mechanism to facilitate communication and visualization for research. These views are now supported and accessible through our WWW site.

Long-term Data Archive

Purchased, SGS-automated, and project data are saved in duplicate on 8-mm tapes in the original format, with the second copy stored in a separate location from the first. Data automated or developed in-house are stored in ARC/INFO export format and are reviewed yearly for compatibility maintenance. The final products of project data are stored and reviewed in a similar manner. Final products are also stored together with all associated work files on 8-mm tape in triplicate: two copies for our site and one copy for the researcher. These are identified with the name of the project, the date of completion, and the researchers' names.

References

Federal Geographic Data Committee, 1994. Content standards for digital geospatial
    metadata (June 8). Federal Geographic Data Committee. Washington, D.C.

CPER Data Management Committee, 1994. Data Management for the CPER LTER Project
    (November 15). Short Grass Steppe Long Term Ecological Research Site.
    Fort Collins, Colorado.

02/08/01

To contact us, please email: Sallie Sprague  (Sallie.Sprague@colostate.edu)