Summarizing Neighbors and Modeling Frameworks

The difference between the two classes is in the treatment of the values: implied surface configuration or direct numerical summary. For example, figure 5.5-2 shows a small portion of a typical elevation data set, with each cell containing a value representing its overall elevation. In the highlighted 3x3 window there are eight individual slopes, as shown in the calculations on the right side of the figure. The steepest slope in the window is 52%, formed by the center cell and its NW neighbor. The minimum slope is 11% in the NE direction.

But what about the general slope throughout the entire 3x3 analysis window? One estimate is 29%, the arithmetic average of the eight individual slopes. Another general characterization could be 30%, the median of slope values. But let's stretch the thinking a bit more. Imagine that the nine elevation values become balls floating above their respective locations, as shown in Figure 5.5-3. Mentally insert a plane and shift it about until it is positioned to minimize the overall distances from the plane to the balls. The result is a "best-fitted plane" summarizing the overall slope in the 3x3 window.
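The eight individual slopes and their average and median can be sketched in a few lines. This is a minimal illustration, not the calculation behind figure 5.5-2: the elevation values and the 100-foot cell size are hypothetical, and the function name eight_slopes is introduced here only for the example. Diagonal neighbors are assumed to sit 1.414 cell widths from the center.

```python
import numpy as np

# Hypothetical 3x3 elevation window in feet (NOT the values in figure 5.5-2).
window = np.array([[1060.0, 1030.0, 1010.0],
                   [1025.0, 1000.0,  990.0],
                   [ 985.0,  980.0,  960.0]])
CELL_SIZE = 100.0  # assumed cell width in feet

def eight_slopes(win, cell_size):
    """Percent slope between the center cell and each of its eight neighbors."""
    names = [["NW", "N", "NE"], ["W", None, "E"], ["SW", "S", "SE"]]
    center = win[1, 1]
    slopes = {}
    for r in range(3):
        for c in range(3):
            if (r, c) == (1, 1):
                continue  # skip the center cell itself
            run = cell_size * np.hypot(r - 1, c - 1)  # diagonal runs are longer
            rise = abs(win[r, c] - center)
            slopes[names[r][c]] = 100.0 * rise / run
    return slopes

slopes = eight_slopes(window, CELL_SIZE)
values = np.array(list(slopes.values()))
steepest = max(slopes, key=slopes.get)
gentlest = min(slopes, key=slopes.get)

print("steepest:", steepest, round(slopes[steepest], 1))
print("gentlest:", gentlest, round(slopes[gentlest], 1))
print("average:", round(float(values.mean()), 1))
print("median:", round(float(np.median(values)), 1))
```

The average and the median are both direct numerical summaries of the eight slopes; neither one uses the spatial arrangement of the values.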

The algorithm is similar to fitting a regression line to a set of data points in two-dimensional space. In this case, however, it’s a plane in three-dimensional space. There is an intimidating set of equations involved, with a lot of Greek letters and subscripts to "minimize the sum of the squared deviations" from the plane to the points. Solid geometry calculations, based on the plane's "direction cosines," are used to determine the slope (and aspect) of the plane.
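A least-squares plane fit can be sketched with a general-purpose solver. The example below is an illustration under stated assumptions rather than the exact formulation described above: it fits z = a*x + b*y + c with NumPy's lstsq and reads slope and aspect from the coefficients a and b instead of working through direction cosines. The window values, the cell size, and the function name are hypothetical.

```python
import numpy as np

# Hypothetical 3x3 elevation window in feet and assumed 100-foot cells.
window = np.array([[1060.0, 1030.0, 1010.0],
                   [1025.0, 1000.0,  990.0],
                   [ 985.0,  980.0,  960.0]])
CELL_SIZE = 100.0

def fitted_plane_slope_aspect(win, cell_size):
    """Fit z = a*x + b*y + c by least squares; return (slope %, aspect degrees)."""
    rows, cols = np.indices(win.shape)
    x = (cols - 1) * cell_size   # east is +x
    y = (1 - rows) * cell_size   # north is +y (row 0 is the northern edge)
    design = np.column_stack([x.ravel(), y.ravel(), np.ones(win.size)])
    (a, b, c), *_ = np.linalg.lstsq(design, win.ravel(), rcond=None)
    slope_pct = 100.0 * np.hypot(a, b)
    # Aspect taken as the compass bearing of the downhill direction,
    # measured clockwise from north (an assumed convention).
    aspect_deg = np.degrees(np.arctan2(-a, -b)) % 360.0
    return slope_pct, aspect_deg

slope_pct, aspect_deg = fitted_plane_slope_aspect(window, CELL_SIZE)
print(round(float(slope_pct), 1), round(float(aspect_deg), 1))
```

Unlike the average or median of the eight slopes, the fitted plane uses the position of every elevation value, so it also yields a direction.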

Another procedure for fitting a plane to the elevation data uses vector algebra, as illustrated in the right portion of figure 5.5-3. In concept, the mathematics draws each of the eight slopes as a line in the proper direction and relative length of the slope value (individual vectors). Now comes the fun part. Starting with the NW line, successively connect the lines as shown in the figure (cumulative vectors). The civil engineer will recognize this procedure as similar to the latitude and departure sums in "closing a survey transect." The length of the “resultant vector” is the slope, and its direction is the aspect.
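One possible reading of the vector procedure is sketched below. Each neighbor contributes a vector pointing toward it, with length equal to the signed slope from the center, and the eight vectors are added tip to tail. The division by 4 is an assumption made for this sketch: on a perfectly planar window the eight vectors sum to four times the plane's gradient. The function name and test values are hypothetical.

```python
import numpy as np

def resultant_vector_slope(win, cell_size):
    """Sum the eight slope vectors; return (slope %, aspect degrees)."""
    center = win[1, 1]
    total = np.zeros(2)  # running (east, north) sum of the cumulative vectors
    for r in range(3):
        for c in range(3):
            if (r, c) == (1, 1):
                continue
            direction = np.array([c - 1, 1 - r], dtype=float)  # (east, north)
            dist = np.linalg.norm(direction)  # 1 for sides, ~1.414 for corners
            signed_slope = (win[r, c] - center) / (dist * cell_size)
            total += signed_slope * direction / dist
    gradient = total / 4.0  # assumed normalization (see lead-in)
    slope_pct = 100.0 * np.hypot(gradient[0], gradient[1])
    # Downhill compass bearing, clockwise from north (assumed convention).
    aspect_deg = np.degrees(np.arctan2(-gradient[0], -gradient[1])) % 360.0
    return slope_pct, aspect_deg

# A hypothetical window that rises 30 feet per 100-foot cell toward the east.
tilted = np.array([[970.0, 1000.0, 1030.0]] * 3)
slope_pct, aspect_deg = resultant_vector_slope(tilted, 100.0)
print(round(float(slope_pct), 1), round(float(aspect_deg), 1))
```

Because vector addition is commutative, starting with the NW line is a drawing convention; the resultant is the same in any order.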

There is a lot more to neighborhood analysis than just characterizing the lumps and bumps of the terrain. Figure 5.5-4 shows a direct numerical summary identifying the number of customers within a quarter of a mile of every location within a project area.

The procedure uses a roving window to collect neighboring map values and compute the total number of customers in the neighborhood. In this example, the window is positioned at a location that computes a total of 91 customers within a quarter-mile.

Note that the input data is a discrete placement of customers while the output is a continuous surface showing the gradient of customer density. While the example location does not even have a single customer, it has an extremely high customer density because there are a lot of customers surrounding it.

The map displays on the right of the figure show the results of the processing for the entire area. A traditional vector GIS forces the result into a set of 2D contour intervals stored as discrete polygon spatial objects (1-10 customer range, 10-20, 20-30, etc.). The 3D surface plot, on the other hand, shows all of the calculated spatial detail: mountains of high customer density and valleys of low density. An important difference is that the vector representation aggregates the results, whereas the grid representation contains all of the detailed information.

Figure 5.5-5 illustrates how the information was derived. The upper-right map displays the discrete customer locations in the neighborhood surrounding the “focal” cell. The large graphic on the right shows this same information with the actual map values superimposed. Actually, the values are from an Excel worksheet with the column and row totals indicated along the right and bottom margins. The sum of the row (or column) totals identifies the total number of customers within the window: 91 total customers within a quarter-mile radius.

This value is assigned to the focal cell location as depicted in the lower-left map. Now imagine moving the ‘Excel window’ to the next cell on the right, determining the total number of customers and assigning the result, then on to the next location, and the next, and the next, etc. The process is repeated for every location in the project area to derive the customer density surface.
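The roving-window total can be sketched as a pair of nested loops. This is a minimal illustration, not the software used to produce the figures: the customer grid is made up, the window radius is expressed in cells rather than in quarter-miles, and the function name focal_total is hypothetical. Window cells that fall off the edge of the map are simply skipped.

```python
import numpy as np

def focal_total(grid, radius_cells):
    """Sum the values inside a circular roving window centered on every cell."""
    rows, cols = grid.shape
    reach = int(np.floor(radius_cells))
    # Offsets of every cell whose center lies within the circular window.
    offsets = [(dr, dc)
               for dr in range(-reach, reach + 1)
               for dc in range(-reach, reach + 1)
               if dr * dr + dc * dc <= radius_cells ** 2]
    out = np.zeros((rows, cols), dtype=float)
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:  # skip cells off the map
                    total += grid[rr, cc]
            out[r, c] = total  # assign the total to the focal cell
    return out

# Hypothetical customer counts per cell; most cells hold no customers.
customers = np.zeros((7, 7))
customers[2, 3] = 4
customers[3, 2] = 3
customers[4, 4] = 5

density = focal_total(customers, radius_cells=1.5)
print(customers[3, 3], density[3, 3])
```

In this made-up grid the focal cell at row 3, column 3 holds no customers of its own, yet its neighborhood total is well above zero, which mirrors the point made about the example location.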

The processing summarizes the map values occurring within a location’s neighborhood (roving window). In this case the resultant value was the sum of all the values. But summaries other than Total can be used—Average, StDev, CoffVar, Maximum, Minimum, Median, Majority, Minority, Diversity, Deviation, Proportion, Custom Filters, and Spatial Interpolation. The remainder of this series will focus on how these techniques can be used to derive valuable insight into the conditions and characteristics surrounding locations—analyzing their spatially-defined neighborhoods.
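Swapping the summary is a small change once the roving window is in place. The sketch below passes the summary in as a function, using a square 3x3 window for brevity. The grid values and the names focal_summary and diversity are hypothetical, and "Diversity" is assumed here to mean the number of different values in the window.

```python
import numpy as np

def focal_summary(grid, summarize, half_width=1):
    """Apply `summarize` to the square window around every cell (clipped at edges)."""
    rows, cols = grid.shape
    out = np.empty((rows, cols), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(r - half_width, 0), min(r + half_width + 1, rows)
            c0, c1 = max(c - half_width, 0), min(c + half_width + 1, cols)
            out[r, c] = summarize(grid[r0:r1, c0:c1])
    return out

def diversity(values):
    """Number of different values in the window (assumed meaning of Diversity)."""
    return np.unique(values).size

# Hypothetical map values.
grid = np.array([[1.0, 1.0, 2.0],
                 [1.0, 3.0, 2.0],
                 [4.0, 4.0, 2.0]])

total = focal_summary(grid, np.sum)
average = focal_summary(grid, np.mean)
maximum = focal_summary(grid, np.max)
median = focal_summary(grid, np.median)
variety = focal_summary(grid, diversity)

print(total[1, 1], round(float(average[1, 1]), 2), maximum[1, 1], variety[1, 1])
```

Each call visits the same neighborhoods; only the function applied to the collected values changes.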

6.0 GIS Modeling Frameworks

It is commonly recognized that there are three essential elements to GIS: data, operations and applications. To use the technology, a set of digital maps, an analytic engine to process the maps, and interesting problems to solve are needed. However, not all users have the same view of the relative importance of the three elements. Some have a data-centric perspective, as they prepare individual data layers and/or assemble the comprehensive databases forming the cornerstone of GIS. Others are operations-centric and are locked in on refining and expanding the GIS toolbox of processing and display capabilities. A third group is applications-centric and sees the portentous details of data and operations as merely impediments to solving real-world problems. Such is the occasionally fractious fraternity of GIS.

In the early years, data and software development dominated the developing field. As GIS has matured, the focus has extended to innovative ways of addressing complex spatial problems beyond simply mapping and geo-query. As a result, attention is increasingly directed toward the assumptions and linkages embedded in GIS models that weave data layers into logical expressions of spatial interrelationships that solve pressing problems. From this perspective there are three dominant GIS modeling frameworks: Suitability, Decision Support and Statistical modeling.

6.1 Suitability Modeling

A simple habitat model can be developed using only reclassify and overlay operations. For example, a Hugag is a curious mythical beast with strong preferences for terrain configuration: prefers low elevations, prefers gentle slopes, and prefers southerly aspects.

6.1.1 Binary Model

A binary habitat model of Hugag preferences is the simplest to conceptualize and implement. It is analogous to the manual procedures for map analysis popularized in the landmark book Design with Nature, by Ian L. McHarg, first published in 1969. This seminal work was the forerunner of modern map analysis, describing an overlay procedure involving paper maps, transparent sheets and pens.

For example, if avoiding steep slopes was an important decision criterion, a draftsperson would tape a transparent sheet over a topographic map, delineate areas of steep slopes (contour lines close together) and fill-in the precipitous areas with an opaque color. The process is repeated for other criteria, such as the Hugag’s preference to avoid areas that are northerly-oriented and at high altitudes. The annotated transparencies then are aligned on a light-table and the transparent areas showing through identify acceptable habitat for the animal.

An analogous procedure can be implemented in a computer by using the value 0 to represent the unacceptable areas (opaque) and 1 to represent acceptable habitat (clear). As shown in figure 6.1-1, an Elevation map is used to derive a map of terrain steepness (Slope_map) and orientation (Aspect_map). A value of 0 is assigned to locations Hugags want to avoid:

- Greater than 1800 feet elevation = 0 …too high

- Greater than 30% slope = 0 …too steep

- North, northeast and northwest = 0 …too northerly

All other locations are assigned a value of 1 to indicate acceptable areas. The individual binary habitat maps are shown as displays on the right side of the figure. The dark red portions identify unacceptable areas that are analogous to McHarg’s opaque colored areas delineated on otherwise clear transparencies.

A Binary Suitability map of Hugag habitat is generated by multiplying the three individual binary preference maps. If a zero is encountered on any of the map layers, the solution is set to zero (bad habitat). For the example location on the right side of the figure, the preference string of values is 1 * 1 * 0 = 0 (Bad). Only locations with 1 * 1 * 1 = 1 (Good) identify areas without any limiting factors: good elevations, good slopes and good orientation. These areas are analogous to the clear areas showing through the stack of transparencies on a light-table.
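The reclassify-and-multiply steps can be sketched with small arrays. The 1800-foot and 30% thresholds come from the criteria listed above; the cell values are hypothetical, and so is the choice to treat the north, northeast and northwest octants as compass bearings from 292.5 up to (but not including) 67.5 degrees. Flat cells with no aspect are not handled in this sketch.

```python
import numpy as np

# Hypothetical 2x2 maps; real maps would cover the whole project area.
elevation = np.array([[1500.0, 1900.0],
                      [1700.0, 1750.0]])   # feet
slope = np.array([[10.0, 20.0],
                  [35.0, 15.0]])           # percent
aspect = np.array([[180.0, 200.0],
                   [ 90.0,   0.0]])        # degrees clockwise from north

# Reclassify each map to 1 (acceptable) or 0 (unacceptable).
elev_ok = (elevation <= 1800).astype(int)      # greater than 1800 feet = 0
slope_ok = (slope <= 30).astype(int)           # greater than 30% slope = 0
northerly = (aspect >= 292.5) | (aspect < 67.5)  # assumed N, NE, NW octants
aspect_ok = (~northerly).astype(int)           # northerly orientations = 0

# Overlay by multiplication: a 0 on any layer forces the product to 0.
suitability = elev_ok * slope_ok * aspect_ok
print(suitability)
```

The lower-right cell reproduces the 1 * 1 * 0 = 0 case: its elevation and slope are acceptable, but its northerly orientation sets the result to zero.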

While this procedure mimics manual map processing, it is limited in the information it generates. The solution is binary and only differentiates acceptable and unacceptable locations. But an area that is totally bad (0 * 0 * 0 = 0) is significantly different from one that is limited by just one factor (1 * 1 * 0 = 0). In the second case two of the three factors are acceptable, making the location nearly good.

6.1.2 Ranking Model