Integrating Different Data Sources 

Formats
- many different format standards exist for geographical data
- some of these have been established by public agencies
- e.g. the USGS, in cooperation with other federal agencies, has developed SDTS (Spatial Data Transfer Standard) for geographical data
- e.g. the Defense Mapping Agency (DMA) has developed the DIGEST data transfer standard
- some have been defined by vendors
- e.g. SIF (Standard Interchange Format) is an Intergraph standard for data transfer
- e.g. DXF for CAD/CAM data
- a good GIS can accept and generate datasets in a wide range of standard formats
Clearinghouse
- Federal Geographic Data Committee (FGDC)
- metadata standards
- search and retrieval of spatial data
- Where is it? Who has it? How can I get it? In what form can I get it?
Framework
- standards for digital data development
- seven or eight major data layers
- national on-line database
- regional coordinators and local producers
Projections
- there are many ways of representing the curved surface of the earth on a flat map
- some of these map projections are very common, e.g. Mercator, Universal Transverse Mercator (UTM), Lambert Conformal Conic
- each state has a standard SPC (State Plane Coordinate system) based on one or more projections
- see Unit 27 for more on map projections
- a good GIS can convert data from one projection to another, or to latitude/longitude
- input derived from maps by scanning or digitizing retains the map's projection
- with data from different sources, a GIS database often contains information in more than one projection, and must use conversion routines if data are to be integrated or compared
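A conversion routine of the kind mentioned above can be sketched for the simplest case, the Mercator projection on a spherical earth. This is only an illustration: the function names and the mean radius of 6,371,000 m are assumptions made for the example, and a production GIS would use an ellipsoidal model and a tested projection library.

```python
import math

# Assumed mean radius of a spherical earth, in metres (illustrative value).
EARTH_RADIUS_M = 6_371_000.0


def latlon_to_mercator(lat_deg, lon_deg, radius=EARTH_RADIUS_M):
    """Forward spherical Mercator: latitude/longitude in degrees -> x, y in metres."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * lon
    y = radius * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


def mercator_to_latlon(x, y, radius=EARTH_RADIUS_M):
    """Inverse spherical Mercator: x, y in metres -> latitude/longitude in degrees."""
    lon = x / radius
    lat = 2 * math.atan(math.exp(y / radius)) - math.pi / 2
    return math.degrees(lat), math.degrees(lon)


# Round trip: project a point, then convert it back to latitude/longitude.
x, y = latlon_to_mercator(34.4, -119.7)
lat, lon = mercator_to_latlon(x, y)
```

Two layers digitized from maps in different projections would each be passed through the appropriate inverse routine to reach a common latitude/longitude frame before being overlaid.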
Scale
- data may be input at a variety of scales
- although a GIS likely will not store the scale of the input document as an attribute of a dataset, scale is an important indicator of accuracy
- maps of the same area at different scales will often show the same features differently
- e.g. features are generalized at smaller scales, enhanced in detail at larger scales
- variation in scales can be a major problem in integrating data
- e.g. the scale of most input maps for a GIS project is 1:250,000 (topography, soils, land cover) but the only geological mapping available is 1:7,000,000
- if integrated with the other layers, the user may believe the geological layer is equally accurate
- in fact, it is so generalized as to be virtually useless
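The mismatch in the example above can be made concrete with a little arithmetic. The sketch below converts a positional uncertainty on the paper map into a distance on the ground; the 0.5 mm figure is an assumption chosen for illustration (roughly the width of a drawn line), not an accuracy standard, and the function name is hypothetical.

```python
def ground_error_m(scale_denominator, map_error_mm=0.5):
    """Ground distance in metres represented by map_error_mm on a 1:scale_denominator map.

    map_error_mm is an assumed positional uncertainty on the source document.
    """
    return map_error_mm * scale_denominator / 1000.0


layers = [
    ("topography, soils, land cover", 250_000),
    ("geology", 7_000_000),
]

for name, denominator in layers:
    print(f"{name}: 1:{denominator:,} -> about {ground_error_m(denominator):,.0f} m on the ground")
```

Under this assumption the 1:250,000 layers are uncertain by about 125 m, while the 1:7,000,000 geology layer is uncertain by about 3,500 m, a factor of 28 worse even though both sit in the same database.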
Resampling rasters
- raster data from different sources may use different pixel sizes, orientations, positions, projections
- resampling is the process of interpolating information from one set of pixels to another
- resampling to larger pixels is comparatively safe; resampling to smaller pixels is very dangerous, because it implies a level of detail that the source data do not contain
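The difference between the two directions of resampling can be sketched with small grids held as lists of lists. The example assumes the simplest possible case: both grids are aligned, share an orientation and projection, and differ only by a whole-number pixel size factor. Block averaging is used for the coarser grid, which suits continuous values such as elevation but not class codes; the function names are hypothetical.

```python
def aggregate(grid, factor):
    """Resample to larger pixels by averaging each factor x factor block of cells."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [grid[r + i][c + j] for i in range(factor) for j in range(factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


def subdivide(grid, factor):
    """Resample to smaller pixels by copying each value into factor x factor cells."""
    fine = []
    for row in grid:
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend(list(fine_row) for _ in range(factor))
    return fine


original = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 6, 6],
    [2, 2, 6, 6],
]

coarse = aggregate(original, 2)   # 2 x 2 grid of block means
rebuilt = subdivide(coarse, 2)    # 4 x 4 grid again, but with only 4 distinct cells of information
```

The rebuilt grid has as many pixels as the original but does not reproduce it: the variation inside each block was lost when aggregating, and making the pixels smaller again cannot bring it back.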
From Unit 6
6E. DATA SOURCES
Primary data collection
Secondary data sources

6F. STANDARDS
Sharing data
Agency standards

6G. ERRORS AND ACCURACY
Original Sin - errors in sources
Boundaries

Classification errors

Data capture errors

Accuracy standards


REFERENCES
Burrough, P.A., 1986. Principles of Geographical Information Systems for Land Resources Assessment, Clarendon, Oxford. Chapter 4 reviews alternative methods of data input and editing for GIS.

Chrisman, N.R., 1987. "Efficient digitizing through the combination of appropriate hardware and software for error detection and editing," International Journal of Geographical Information Systems 1:265-77. Discusses ways of reducing the data input bottleneck.

Drummond, J., and M. Bosman, 1989. "A review of low-cost scanners," International Journal of Geographical Information Systems 3:83-97. A good review of current scanning technology.

Ehlers, M., G. Edwards and Y. Bedard, 1989. "Integration of remote sensing with GIS: a necessary evolution," Photogrammetric Engineering and Remote Sensing 55(11):1619-27. A recent review of the relationship between the two technologies.

Goodchild, M.F. and B.R. Rizzo, 1987. "Performance evaluation and work-load estimation for geographic information systems," International Journal of Geographical Information Systems 1:67-76. Statistical analysis of costs of scanning.

Lai, Poh-Chin, 1988. "Resource use in manual digitizing. A case study of the Patuxent basin geographical information system database," International Journal of Geographical Information Systems 2(4):329-46. A detailed analysis of the costs of building a practical database.

Marble, D.F., J.P. Lauzon, and M. McGranaghan, 1984. "Development of a Conceptual Model of the Manual Digitizing Process," Proceedings of the International Symposium on Spatial Data Handling, Volume 1, August 20-24, 1984, Zurich Switzerland, Symposium Secretariat, Department of Geography, University of Zurich-Irchel, 8057 Zurich, Switzerland. Conceptual discussion of the digitizing process.

Peuquet, D. J., 1981. "An examination of techniques for reformatting digital cartographic data, part I: the raster-to-vector process," Cartographica 18:34-48.

Peuquet, D. J., 1981. "An examination of techniques for reformatting digital cartographic data, part II: the vector-to-raster process," Cartographica 18:21-33.

Peuquet, D. J., and A. R. Boyle, 1984. Raster Scanning, Processing and Plotting of Cartographic Documents, SPAD Systems, Ltd., P.O. Box 571, Williamsville, New York, 14221, U.S.A. A comprehensive discussion of scanning technology.

Tomlinson, R.F., H.W. Calkins and D.F. Marble, 1976. Computer Handling of Geographical Data, UNESCO Press, Paris. Comparison of input methods and costs of 5 GISs.

DISCUSSION AND EXAM QUESTIONS
1. In his book Computers and the Representation of Geographical Data (Wiley, New York, 1987), E.E. Shiryaev argues that maps must be redesigned to be equally readable by humans and computer scanners, and that this would ultimately make scanning much more cost-effective than digitizing. How might this be done, and what advantages would it have?
2. The cost of digitizing has remained remarkably constant over the past 20 years despite dramatic reductions in computer hardware and software cost. Why is this, and what impact has it had on GIS? Do you predict any change in this situation in the future?

3. "Digitizing is a suitable activity for convicted criminals." Discuss.

4. As manager of a GIS operation, you have the task of laying out rules which your staff must follow in digitizing complex geographical lines. What instructions would you give them to ensure a reasonable level of accuracy? Assume they will be using point mode digitizing, and that points will be connected by straight lines for analysis and output.

5. What type of documents are best suited for automatic scanning?

6. After reading the article by Marble, Lauzon and McGranaghan on the conceptual model of digitizing, describe and explain the importance of map pre-processing.
