LECTURE 10: ANALYSIS (2): TRANSFORMATIONS

1. BUFFERING

2. POINT IN POLYGON

3. POLYGON OVERLAY

4. SPATIAL INTERPOLATION

5. DENSITY ESTIMATION



1. BUFFERING

Transformations create new objects and data sets from existing objects and data sets

buffering takes points, lines, or areas and creates areas
every location within the resulting area is either:
in/on the original object

within the defined buffer width of the original object

Two versions

discrete object:
for every object, result is a new polygon object
new objects may overlap
field (objects cannot overlap):
every location on the map has one of two values:

inside buffer distance
outside buffer distance

alternatively, every location on the map can take a single value: its distance to the nearest object
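In the field version, buffering reduces to thresholding a distance surface. A minimal sketch in Python, assuming a small raster, Euclidean distance, and point objects only; all names and grid sizes are illustrative:

```python
import math

def distance_field(points, nrows, ncols):
    """For every raster cell, distance from the cell centre
    to the nearest point object (x, y), in cell units."""
    field = []
    for r in range(nrows):
        row = []
        for c in range(ncols):
            d = min(math.hypot(c + 0.5 - px, r + 0.5 - py)
                    for px, py in points)
            row.append(d)
        field.append(row)
    return field

def buffer_field(dist, width):
    """Binary buffer raster: True where a cell centre lies
    within the buffer width of the nearest object."""
    return [[d <= width for d in row] for row in dist]

points = [(2, 2), (7, 5)]          # object locations (x, y)
dist = distance_field(points, 8, 10)
buf = buffer_field(dist, 2.0)
```

The same distance raster serves both field outputs described above: threshold it for the two-valued (inside/outside) buffer, or keep it as distance to the nearest object.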

Applications

CSISS cookbook and UCI's medical center

 


2. POINT IN POLYGON

Determine whether a given point lies inside or outside a given polygon

a type of spatial join

assign a set of points to a set of polygons

e.g. count numbers of accidents in counties
e.g. whose property does this phone pole lie in?
Algorithm
draw a line from the point to infinity

count intersections with the polygon boundary

inside if the count is odd
outside if the count is even

diagram
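The ray-casting algorithm above translates directly into code. A minimal sketch in Python (points exactly on the boundary are handled naively; names illustrative):

```python
def point_in_polygon(x, y, polygon):
    """Even-odd rule: cast a horizontal ray from (x, y) to the right,
    count crossings with polygon edges; an odd count means inside."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # does this edge cross the horizontal line at height y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
```

A spatial join then amounts to running this test for every point against the candidate polygons (e.g. assigning each accident to its county).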

Field case

point must lie in exactly one polygon

where are the California ozone monitoring stations?

and how are they distributed by California habitat?

Discrete object case
point can lie in any number of polygons, including zero


3. POLYGON OVERLAY

Create polygons by overlaying existing polygons

how many polygons are created when two polygons are overlaid?
Discrete object case
find overlaps between two polygons
e.g. a property and an easement
creates a collection of polygons
Field case
overlay two complete coverages

creates a new coverage

e.g. find all areas that are owned by the Forest Service and classified as wetland
in vector or raster

in raster the values in each cell are combined, e.g. added
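Cell-by-cell combination in the raster case can be sketched as follows, assuming two aligned binary rasters (1 = Forest Service land, 1 = wetland); here a logical AND picks out wetland areas owned by the Forest Service:

```python
def overlay_and(raster_a, raster_b):
    """Cell-by-cell overlay of two aligned rasters: keep cells
    that are 1 in both layers."""
    return [[a and b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(raster_a, raster_b)]

forest  = [[1, 1, 0],
           [0, 1, 1]]
wetland = [[0, 1, 0],
           [0, 1, 0]]
forest_wetland = overlay_and(forest, wetland)
```

Replacing `and` with `+` (or any other operator) gives the additive combination mentioned above.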

Areal interpolation

determining attributes for zones from other non-congruent zones

source zones

attributes are known

target zones

attributes are needed

overlay polygons, measure areas, use as weights

diagram

California example
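The area-weighting step in areal interpolation can be sketched as below, assuming the source/target overlap areas have already been measured from the polygon overlay, and that the attribute is a total (e.g. population) that can reasonably be split in proportion to area; all numbers are illustrative:

```python
def areal_interpolation(source_values, intersect_areas):
    """Estimate target-zone totals from source-zone totals.
    intersect_areas[s][t] = area of overlap between source zone s
    and target zone t; each source total is shared out among the
    targets in proportion to overlap area."""
    n_targets = len(intersect_areas[0])
    targets = [0.0] * n_targets
    for s, value in enumerate(source_values):
        source_area = sum(intersect_areas[s])
        for t in range(n_targets):
            targets[t] += value * intersect_areas[s][t] / source_area
    return targets

# two source zones (e.g. census tracts with known population),
# three target zones (e.g. school districts with unknown population)
population = [1000, 600]
areas = [[2.0, 2.0, 0.0],   # source 0 split evenly between targets 0 and 1
         [0.0, 1.0, 3.0]]   # source 1: 1/4 in target 1, 3/4 in target 2
estimates = areal_interpolation(population, areas)
```

Note that the estimates conserve the overall total; the weakness of the method is the implicit assumption that the attribute is spread uniformly within each source zone.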



4. SPATIAL INTERPOLATION

What is interpolation?

intelligent guesswork

an interval/ratio variable conceived as a field

temperature
soil pH
population density
sampled at observation points

needed:

values at other points
a complete surface
a contour map
a TIN
a raster of point values
Two methods commonly used in GIS
inverse-distance weighting (IDW)

Kriging (geostatistics)

Moving average/distance weighted average/inverse distance weighting
estimates are averages of the values at n known points
known values z_1, z_2, ..., z_n

unknown value z = Σ_i (w_i z_i) / Σ_i (w_i)

where w_i is some decreasing function of the distance d_i to point i, such as:

w_i = 1/d_i^k (inverse power)

w_i = e^(-k d_i) (negative exponential)

an almost infinite variety of algorithms may be used, variations include:
the nature of the distance function
varying the number of points used
the direction from which they are selected
IDW is the most widely used interpolation method in GIS

objections to this method arise from the fact that the range of interpolated values is limited by the range of the data

no interpolated value will be outside the observed range of z values

peaks and pits will be missed if they are not sampled

outside the area sampled the surface must flatten to the average value

summary: IDW is popular, easy, but full of problems
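The estimator z = Σ w_i z_i / Σ w_i with w = 1/d^k can be sketched as follows; this is a minimal version without the sector and neighbor-count options discussed above, and the sample data are illustrative:

```python
import math

def idw(x, y, samples, power=2):
    """Inverse-distance-weighted estimate at (x, y) from samples
    given as (x_i, y_i, z_i) tuples; weights w_i = 1 / d_i^power."""
    num = den = 0.0
    for sx, sy, sz in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return sz              # exactly at a sample point
        w = 1.0 / d ** power
        num += w * sz
        den += w
    return num / den

obs = [(0, 0, 10.0), (4, 0, 20.0)]
```

Because the weights are positive and sum to one, every estimate lies between the minimum and maximum observed values, which is exactly the range limitation objected to above.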
Example
ozone concentrations at CA measurement stations

objectives:

1. estimate a complete field, make a map
2. estimate ozone concentrations at other locations
e.g. cities
data sets:
measuring stations and concentrations (point shapefile)
CA outline (polygon shapefile)
DEM (raster)
CA cities (point shapefile)
IDW wizard in Geostatistical Analyst
opening screen defines data source

next screen defines interpolation method

which power of distance? (2)
how many sectors? (4)
how many neighbors in each sector? (10-15)
next screen gives results of cross-validation

results map

things to notice
spurious detail in areas where there are no data

generally smooth surface

highs in LA and the southern Central Valley

Kriging
developed by D.G. Krige as an optimal method of interpolation for use in the mining industry

the rate at which the variance between points changes over space

expressed in the variogram

shows how the average difference between values changes with distance

analysis of the data

then application to interpolation

Variograms
vertical axis is E[(z_i - z_j)^2]
the average squared difference between the values of any two points a distance d apart

d (horizontal axis) is distance between i and j

most variograms show behavior like the diagram
sill: the upper limit (asymptote)

range: distance at which this limit is reached

nugget: intersection with the y axis

Deriving the variogram
an irregularly spaced sample of points

divide the range of distance into a set of discrete intervals

e.g. 10 intervals between distance 0 and the maximum distance

for every pair of points, compute distance and the squared difference in z values

assign each pair to one of the distance ranges

accumulate total variance in each range

compute the average variance in each distance range

plot this value at the midpoint distance of each range

fit one of a standard set of curve shapes to the points

"model" the variogram
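The derivation steps above translate almost line for line into code. A minimal sketch in Python, assuming a small set of sample points; the bin count and data are illustrative, and no model curve is fitted:

```python
import math

def empirical_variogram(points, n_bins=10):
    """points: list of (x, y, z). Returns (midpoint distance,
    average squared difference in z) for each non-empty distance bin."""
    # for every pair, compute distance and squared z difference
    pairs = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            xi, yi, zi = points[i]
            xj, yj, zj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            pairs.append((d, (zi - zj) ** 2))
    d_max = max(d for d, _ in pairs)
    # assign each pair to a distance bin and accumulate
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for d, sq in pairs:
        b = min(int(d * n_bins / d_max), n_bins - 1)
        sums[b] += sq
        counts[b] += 1
    # average per bin, plotted at the bin's midpoint distance
    return [((b + 0.5) * d_max / n_bins, sums[b] / counts[b])
            for b in range(n_bins) if counts[b]]

vg = empirical_variogram([(0, 0, 0.0), (1, 0, 1.0),
                          (2, 0, 2.0), (3, 0, 3.0)], n_bins=3)
```

For data with a spatial trend like this, the average squared difference grows with distance, which is the rising limb of the variogram before the sill.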
Computing the estimates
variogram is used to estimate distance weights for interpolation

weights are selected so that the estimates are:

unbiased (if used repeatedly, Kriging would give the correct result on average)

minimum variance (variation between repeated estimates is minimum)

problems with this method:
when the number of data points is large this technique is computationally very intensive

the estimation of the variogram is not simple, no one technique is best

results from this technique can never be absolute

example

selection of method
simple Kriging
co-Kriging includes a correlated variable
indicator Kriging is for binary data
analysis of the variogram
fitting a model
directional effects
how many neighbors?

cross-validation

things to notice
similar pattern
less detail in remote areas
smoother
rebounds to the mean at the edge

better cross-validation



5. DENSITY ESTIMATION

Suppose you had a map of discrete objects and wanted to calculate their density

density of population

density of cases of a disease

density of roads in an area

density would form a field

density estimation is one way of creating a field from a set of discrete objects
Methods
count the number of points in every cell of a raster
measure the length of lines, e.g. roads
result depends on cell size

result is very noisy, erratic

Density estimation using kernels
think of each point being replaced by a pile of sand of constant shape

add the piles to create a surface

example kernel

width of the kernel determines the smoothness of the surface
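A minimal sketch of kernel density estimation on a raster, assuming a quadratic "pile of sand" kernel that falls to zero at the chosen radius (the actual kernel shape used by Spatial Analyst may differ; names are illustrative):

```python
import math

def kernel_density(points, nrows, ncols, radius):
    """Replace each point with a kernel of the given radius
    and sum the piles into a raster surface."""
    surface = [[0.0] * ncols for _ in range(nrows)]
    for px, py in points:
        for r in range(nrows):
            for c in range(ncols):
                d = math.hypot(c + 0.5 - px, r + 0.5 - py)
                if d < radius:
                    # quadratic kernel: highest at the point, 0 at the radius
                    surface[r][c] += 1.0 - (d / radius) ** 2
    return surface

narrow = kernel_density([(3, 3)], 6, 6, 2.0)
wide = kernel_density([(3, 3)], 6, 6, 4.0)
```

Comparing `narrow` and `wide` shows the effect of kernel width: the wider kernel spreads each point's pile over more cells, giving the smoother surface.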

Density estimation and spatial interpolation applied to the same data
density of ozone measuring stations
using Spatial Analyst
kernel is too small (radius of 16 km)

kernel radius 150 km

what's the difference?