LAB 5: Interpolation with IDW and Kriging
Time for completion: One Week
* Please use the double sided option when printing [File -> Print, Click on Properties, Select Print on Both Sides]

As part of last week's exercise,
you visually explored spatial data using the Geostatistical Analyst
capabilities of ArcGIS. This week you are going to use some other
functions in Geostatistical Analyst along with Spatial Analyst, first
to create data subsets for interpolation (Geostatistical Analyst) and then to experiment with interpolation tools (Spatial
Analyst).

2.0 Introduction and Background

What you will learn about in this lab is arguably one of the most valuable capabilities of GIS: spatial interpolation. The idea behind interpolation is simple: estimating unknown values from a set of known values. For analysis, a regular grid of estimated data values is generally preferable. However, real-world data samples are often expensive, sparse, and irregularly distributed. Thus, spatial interpolation attempts to make a reasonable estimate of unknown values based upon the principle of spatial autocorrelation. There are several interpolation methods, and they will almost always produce different results from the same samples.

In this lab, the first interpolation method you will use is Inverse Distance Weighting (IDW). The "inverse" part comes from the first law of geography: all things are related, but near things are more related than distant things. IDW estimates grid point values from all the sample data points in a suitably "nearby" neighborhood, weighting them inversely by distance so that closer samples count for more. This pdf, IDW Explained, provides a brief description of the algorithm.

The second interpolation method you will use is Kriging, which is often regarded as the 'method of choice' among GIS technicians and statisticians because it is theoretically sound as well as practically successful. However, Kriging is not necessarily "better" than IDW or other methods in all cases; with any interpolation, success is subjective. Kriging also forms a weighted average of nearby samples, but the weights come from a semivariogram model fitted to the spatial characteristics of the sample data itself rather than from a fixed function of distance. This pdf, Kriging Explained, provides a brief description of the algorithm.

Beyond mathematics, there is a more "artistic" side to interpolation that a GIS technician must assess: How well are edges captured? How well are gradients represented? What happens in sparsely sampled areas? IDW and Kriging will be covered
in lecture as well as in your textbook (Longley et al., pp. 333-337).

Create a folder called "Lab5" on your removable disk or in C:\Workspace. Remember, ArcGIS does not like spaces in file/folder names.

Right-click on this file, cen1990sb.e00, and download it (Save Link-Target As) to your "Lab5" folder. (A regular left-click may not work because .e00 is an interchange format file that may automatically launch ArcMap.)

Using ArcToolbox, extract the cen1990sb
coverage from the Interchange File. Open ArcMap and add the cen1990sb coverage. Open the Layer Properties for this coverage and go to the Symbology tab. Under Quantities choose
"Graduated colors". Set the Value field as "POP1990", set the
Normalization field as "AREA". Click on the Classify button and choose
Natural Breaks, set the number of classes to 10 or 15, and click Apply. By now you have some familiarity with this interface, so experiment with different visualizations of these data. There are many combinations and methods that will produce adequate visualizations of population variations. Natural Breaks may not be the
best classification method for this particular dataset. If you want,
make your map according to the method you think does a better job of
displaying it, and make a note in your lab about the method you used.
You are encouraged to explore on your own because that is the best way
to learn about your data.
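
To see what the Normalization and Classify settings do to the numbers, here is a minimal Python sketch. It assumes that normalizing POP1990 by AREA simply divides one by the other, and it compares two simple classification schemes (equal interval and quantile). It does not implement Natural Breaks, and the function names and sample values are hypothetical, not part of ArcGIS.

```python
import numpy as np

def density(pop, area):
    """Population per unit area (assumes every area is greater than zero)."""
    return np.asarray(pop, dtype=float) / np.asarray(area, dtype=float)

def equal_interval_breaks(values, k):
    """Upper bounds of k classes of equal width between min and max."""
    return np.linspace(values.min(), values.max(), k + 1)[1:]

def quantile_breaks(values, k):
    """Upper bounds of k classes holding roughly equal numbers of values."""
    return np.quantile(values, np.linspace(0.0, 1.0, k + 1)[1:])

def classify(values, breaks):
    """Index of the first class whose upper bound is >= the value."""
    idx = np.searchsorted(breaks, values, side="left")
    return np.minimum(idx, len(breaks) - 1)

# Hypothetical tracts: four similar densities and one very dense outlier.
d = density([100, 200, 300, 400, 5000], [1, 1, 1, 1, 1])
print(classify(d, equal_interval_breaks(d, 5)))  # outlier isolates itself
print(classify(d, quantile_breaks(d, 5)))        # one tract per class
```

With skewed data such as population density, equal intervals tend to lump most polygons into the lowest class, which is one reason to try several methods before settling on a map.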
3.2 Combined Census & Parcel Values

This lab examines the relationships between demographics and property values. You have looked at the Census data. Property parcel data is even more voluminous: about 46,700 polygons made from 132,000 arc segments. (The Census data, by contrast, consists of 179 polygons made from 508 arc segments.) For expediency, the two datasets have been preprocessed and combined for you. The ArcMap Union command (Analysis Tools -> Overlay -> Union) was used to combine the Census data, cen1990sb, with the property parcel values, sbvalues. After the polygons were intersected, the topology was reconstructed. As you can imagine, this was a very computationally intensive task. To further increase processing efficiency and reduce the size of the intermediate datasets you will be using, the unioned polygon coverage was converted to a point coverage, removing all of the arc information and retaining only the label points of the intersected polygons.

Download this file, sbvalcen90pnt.zip, to your "Lab5" folder and extract (unzip) its contents. Next, import the *.e00 coverage from the unzipped file. Notice that the
sbvalcen90pnt.zip file is 1.9MB, the extracted
sbvalcen90pnt.e00 file is 12.6MB, and the imported sbvalcen90pnt
coverage is about 7MB. The interchange (e00) file is larger
than the coverage because the file is encoded in ASCII characters for
maximum portability; it also contains topology explicitly.
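
The size difference between text and binary storage is easy to demonstrate. The sketch below writes three hypothetical coordinate values both ways; the 14-character scientific-notation field is an assumption chosen for illustration, not a statement of the exact e00 layout.

```python
import struct

# Hypothetical coordinate values, not taken from the lab data.
coords = [238456.78125, 3812345.5, 0.0012345]

# Binary: each value stored as a 4-byte single-precision float.
binary = b"".join(struct.pack("<f", c) for c in coords)

# Text: each value written as a fixed-width 14-character field.
text = "".join(f"{c:14.7E}" for c in coords)

print(len(binary), "bytes as binary")
print(len(text.encode("ascii")), "bytes as ASCII text")
```

Text is bulkier per number, but it can be read on any platform, which is the trade-off an interchange format makes.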
4.0 Create Data Subsets

Rather than using the entire sbvalcen90pnt coverage for our interpolations, we will create smaller, more manageable subsets of the data to use in this exercise. The procedure is the same one that you used in Lab 4 for subsetting. You can close the attribute table if it is still open.

Go to the Tools menu and select Extensions. In the Extensions dialog box,
make sure that the Geostatistical Analyst and Spatial Analyst
extensions are both checked. Click Close. Next, go to View ->
Toolbars and make sure that the same two toolbars are also checked
there. (If they already appear in the ArcMap window toolbar, you can
skip this last step.)
First make a 1% random sample from the sbvalcen90pnt coverage and save this in a free-standing geodatabase, according to the dialog box below. Simplify the geodatabase name to "sbvalcen90pnt_sets1" (see illustration below). It may take ArcGIS a while to do this processing; be patient. When prompted to add the results to your map, select No.
Similarly, make a 2% random sample in another geodatabase. Change the geodatabase name to "sbvalcen90pnt_sets2". When asked if you would like to add the results to your map, select No.
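
Conceptually, the subsetting step splits the points at random into a small "training" set and a larger remainder. A minimal sketch of that idea follows, assuming a simple random draw without replacement; the function name, the point count of 50,000, and the seed are hypothetical, and this is not a description of how Geostatistical Analyst does it internally.

```python
import numpy as np

def random_subset(n_points, fraction, seed=0):
    """Split point indices 0..n_points-1 into (training, remainder) at random."""
    rng = np.random.default_rng(seed)
    n_train = max(1, round(n_points * fraction))
    order = rng.permutation(n_points)          # every index exactly once
    return np.sort(order[:n_train]), np.sort(order[n_train:])

# Hypothetical point count; substitute the real number of records.
train_1pct, rest_1pct = random_subset(50_000, 0.01, seed=1)
train_2pct, rest_2pct = random_subset(50_000, 0.02, seed=2)
print(len(train_1pct), len(train_2pct))
```

Because the two samples are drawn independently, the 2% sample is not simply the 1% sample plus extra points, so the two interpolated surfaces can differ for reasons other than sample size alone.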
Close ArcMap without saving and open ArcCatalog. Go to your "Lab5" folder. If ArcCatalog is already open, you may need to refresh the view to see your new geodatabases. To do so, right-click on your Lab5 folder and choose Refresh from the context menu. Right-click on the "sbvalcen90pnt_point_training" subset (feature class) within the sbvalcen90pnt_sets1 geodatabase and choose Export -> To Shapefile (single). Make sure you are exporting from the correct geodatabase and feature class! Set the Output Location as "Lab5", and name the output file sbvalcen90_1. (Note: Even though the window refers to "feature class", because you are not exporting the file into another geodatabase, it will indeed be saved as a shapefile. Recall from Lab 2 that in a conceptual sense shapefiles are freestanding feature classes.)
Similarly, repeat the Export process for the "sbvalcen90pnt_point_training" subset within the sbvalcen90pnt_sets2 geodatabase, naming the output file sbvalcen90_2. Once you have created the
shapefiles from the 1% and 2% random samples, do some cleaning up in
the "Lab5" folder. Still within ArcCatalog, delete both of the
geodatabases that we produced while subsetting.
The purpose of this section is to explore the capabilities and limitations of IDW as an interpolation method. You will also make some determinations about sample size and reliability: a larger sample does not necessarily give a much better result.
Open ArcMap and add the two random sample point shapefiles, sbvalcen90_1 and sbvalcen90_2, to a new map document. Highlight only sbvalcen90_1 in the ArcMap Table of Contents window; then, under the Spatial Analyst pulldown menu, go to Interpolate to Raster -> Inverse Distance Weighted.
In the window that opens you are asked to specify the parameters for the Inverse Distance interpolation. The sbvalcen90_1 shapefile should be specified as the Input. The Z-value you will use as the input for IDW is the COMBINEVAL attribute. Name the Output Raster "idw_1" (with "1" designating its source file) and save it in your Lab5 folder. When you unzipped sbvalcen90pnt.zip, a shapefile called clipbnd was included. This shapefile will be used here to mask the interpolation so that it does not extend beyond the limits of the dataset. In fact, clipbnd only limits the extent of the results as displayed; the interpolation per se covers a full rectangular grid. In the window, check the box next to "Use barrier polylines" and then browse to clipbnd as the file to use. Set the rest of the parameters to the values you see in the graphic below, and click OK.
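
To make the roles of the power and number-of-points parameters concrete, here is a minimal Python sketch of inverse distance weighting. It assumes a simple nearest-k search and no barriers; the function name and the defaults (power 2, 12 points) are illustrative choices, so use the values from the dialog rather than these when you run the tool.

```python
import numpy as np

def idw(xy, z, targets, power=2.0, k=12):
    """Estimate z at each target from its k nearest samples, weighted by 1/d**power."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    targets = np.asarray(targets, dtype=float)
    k = min(k, len(z))
    est = np.empty(len(targets))
    for i, t in enumerate(targets):
        d = np.hypot(*(xy - t).T)          # distance from target to every sample
        nearest = np.argsort(d)[:k]
        dn, zn = d[nearest], z[nearest]
        if dn[0] == 0.0:                   # target sits exactly on a sample
            est[i] = zn[0]
            continue
        w = 1.0 / dn ** power              # closer samples get larger weights
        est[i] = np.sum(w * zn) / np.sum(w)
    return est

# Two hypothetical samples; estimates at the midpoint, on a sample, and off-centre.
print(idw([(0, 0), (2, 0)], [10.0, 20.0], [(1, 0), (0, 0), (0.5, 0)]))
```

Raising the power makes the surface hug the nearest sample more tightly; lowering it produces a smoother surface that drifts toward the local mean.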
When the interpolation has
finished, the result grid will be added to your layout with nine colors
(the default categorization) in the legend.
You need to increase the number of classes in the classification in order to see some of the variation in the IDW interpolation. To do this double-click on the IDW grid and open the Layer Properties window. Change the number of classes to 32 and click Apply.
Next, click on the Classify button in the Layer Properties window. Choose Natural Breaks in the Classification window and click OK. Click Apply in the Layer Properties window and OK to close it.
*** Now repeat the above steps for the sbvalcen90_2 shapefile. ***
Kriging is a statistically grounded method of prediction. Like IDW, kriging uses a measure of autocorrelation that is based on distance, but kriging estimates that measure from the sample data (the semivariogram) and also takes into account how the samples are arranged in the neighborhood of each point. Read Kriging Explained before proceeding.
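
The distance-based measure of autocorrelation that kriging relies on is usually summarized as a semivariogram: half the average squared difference between pairs of samples, grouped by how far apart they are. The sketch below computes an empirical semivariogram under simple assumptions (isotropic distances, fixed-width lag bins); the function name and test values are hypothetical.

```python
import numpy as np

def empirical_semivariogram(xy, z, lag, n_lags):
    """Return (lag-bin centres, semivariance per bin); empty bins are NaN."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(z), k=1)        # every pair of samples once
    dist = np.hypot(*(xy[i] - xy[j]).T)
    sqdiff = (z[i] - z[j]) ** 2
    bins = np.floor(dist / lag).astype(int)    # which lag bin each pair falls in
    gamma = np.full(n_lags, np.nan)
    for b in range(n_lags):
        in_bin = bins == b
        if in_bin.any():
            gamma[b] = 0.5 * sqdiff[in_bin].mean()
    centres = (np.arange(n_lags) + 0.5) * lag
    return centres, gamma

# Four hypothetical samples on a line with a steady trend in z.
h, g = empirical_semivariogram([(0, 0), (1, 0), (2, 0), (3, 0)],
                               [0.0, 1.0, 2.0, 3.0], lag=1.0, n_lags=4)
print(h, g)
```

If semivariance rises with distance and then levels off, nearby samples are more alike than distant ones, which is the situation kriging is designed to exploit.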
With sbvalcen90_1 highlighted, under the Spatial Analyst pulldown menu, go to Interpolate to Raster -> Kriging. In the window that opens, the sbvalcen90_1 shapefile should be set as the Input. The Z-value you will use for kriging is POP90_SQMI (note the change). Name the Output Raster "krig_1" and save it in your Lab5 folder.

Kriging is somewhat of an art because there are many ways to modify the interpolation based on your knowledge of the variation in the data, what you are expecting, and even some luck in hitting the right combination. To start you off, try the following parameters. The results should look something like the following screenshot. Kriging does not allow for a boundary mask, as IDW does, so your interpolation may look "weird" where it extends beyond valid data points.

Experiment with different interpolation parameters to try to improve your prediction. The dataset you are working with is not ideal for kriging, but there are combinations of parameters that will produce a realistic grid/surface of POP90_SQMI. You can judge how well you are doing by loading the sbvalcen90pnt point coverage and comparing it with your surface, OR you can have the kriging interpolator "Create variance of prediction".

*** Now repeat the above steps for the sbvalcen90_2 shapefile. ***
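
For readers who want to see where the kriging weights and the variance of prediction come from, here is a minimal sketch of ordinary kriging at a single location. It assumes a spherical semivariogram whose nugget, sill, and range you supply yourself, and it uses every sample rather than a search neighborhood; the function names and parameter values are hypothetical and are not the settings used by Spatial Analyst.

```python
import numpy as np

def spherical(h, nugget, sill, range_):
    """Spherical semivariogram: rises from the nugget and levels off at the sill."""
    h = np.asarray(h, dtype=float)
    s = np.minimum(h / range_, 1.0)
    g = nugget + (sill - nugget) * (1.5 * s - 0.5 * s ** 3)
    return np.where(h == 0.0, 0.0, g)          # semivariance is zero at zero distance

def ordinary_kriging(xy, z, target, model):
    """Return (estimate, prediction variance) at one target location."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    n = len(z)
    diff = xy[:, None, :] - xy[None, :, :]
    d = np.hypot(diff[..., 0], diff[..., 1])   # sample-to-sample distances
    A = np.ones((n + 1, n + 1))                # last row/column force weights to sum to 1
    A[:n, :n] = model(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = model(np.hypot(*(xy - np.asarray(target, dtype=float)).T))
    sol = np.linalg.solve(A, b)
    weights, mu = sol[:n], sol[n]              # mu is the Lagrange multiplier
    return weights @ z, weights @ b[:n] + mu

# Four hypothetical samples on a unit square, predicted at the centre.
model = lambda h: spherical(h, nugget=0.0, sill=1.0, range_=5.0)
samples = [(0, 0), (1, 0), (0, 1), (1, 1)]
values = [10.0, 20.0, 30.0, 40.0]
print(ordinary_kriging(samples, values, (0.5, 0.5), model))
```

The variance depends only on the semivariogram and the geometry of the samples, not on the sampled values, so it grows in sparsely sampled areas; that is what the "Create variance of prediction" output lets you inspect.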
7.0 Summarize what you have learned
8.0 To turn in
Created by Sean Benison, Sunhui Sim, and Jordan Hastings
This page was last modified on Jan. 13, 2009 by Susan Tran