
LAB 5: Interpolation with IDW and Kriging
Suggested time for completion: One Week

* Please use the double sided option when printing [File -> Print, Click on Properties, Select Print on Both Sides] 

1.0 Purpose

As part of last week's exercise, you visually explored spatial data using the Geostatistical Analyst capabilities of ArcGIS. This week you are going to use some other functions in Geostatistical Analyst along with Spatial Analyst, first to create data subsets (Geostatistical Analyst) for interpolation and then to experiment with interpolation tools (Spatial Analyst).


2.0 Introduction and Background

What you will learn about in this lab is arguably one of the most valuable capabilities of GIS: spatial interpolation, which is really a prediction technique. The idea behind interpolation is simple: estimate unknown values from a set of known values. Real-world data samples are often expensive, sparse, and irregularly distributed. For analysis, a regular grid of estimated data values is generally preferable.

In this lab, the first interpolation method you will use is Inverse Distance Weighting (IDW). The "inverse" part comes from the first law of Geography: all things are interrelated, but close things more so than distant things. IDW estimates grid point values from all the sample data points in a suitably "nearby" neighborhood, weighting them inversely by distance. This pdf, IDW Explained, provides a brief description of the algorithm.
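
As a rough illustration of the idea (not the ArcGIS implementation), the weighting described above can be sketched in a few lines of Python. The function name idw_estimate, its parameters, and the sample values are all hypothetical; the sketch assumes each sample is an (x, y, value) tuple and that the weight is 1 / distance^power.

```python
import math

def idw_estimate(samples, x, y, power=2.0, radius=None):
    """Estimate a value at (x, y) from (sx, sy, value) samples by IDW."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        d = math.hypot(sx - x, sy - y)
        if d == 0.0:
            return value  # exactly on a sample point: return its value
        if radius is not None and d > radius:
            continue  # outside the search neighborhood, ignore this sample
        w = 1.0 / d ** power  # closer samples get larger weights
        weighted_sum += w * value
        weight_total += w
    if weight_total == 0.0:
        return None  # no samples fell inside the search radius
    return weighted_sum / weight_total

# Two made-up samples on a line, 2 units apart.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint = idw_estimate(samples, 1.0, 0.0)    # equal distances: plain average
near_first = idw_estimate(samples, 0.5, 0.0)  # pulled toward the first sample
```

Raising power makes the nearest samples dominate the estimate; a smaller radius restricts which samples take part at all.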

The second interpolation method you will use is Kriging, which is widely regarded as the 'method of choice' among GIS technicians and statisticians because it is theoretically sound as well as practically successful. However, Kriging is not necessarily "better" than IDW or other methods in all cases; with any interpolation, success is subjective. Kriging weights the sample values using a semivariogram model fitted to the data, so the weights are guided by the general spatial characteristics of the sample data itself rather than by distance alone. This pdf, Kriging Explained, provides a brief description of the algorithm.
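
The "spatial characteristics of the sample data" that guide Kriging are commonly summarized as an empirical semivariogram: for pairs of points grouped by separation distance, half the average squared difference in value. The sketch below is one minimal way to compute it with NumPy. The function name, the distance bins, and the toy data are assumptions for illustration and are not taken from the lab data or from ArcGIS.

```python
import numpy as np

def empirical_semivariogram(xy, z, bin_edges):
    """Return (mean distance, semivariance, pair count) for each distance bin."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(z), k=1)          # every unordered pair once
    dist = np.linalg.norm(xy[i] - xy[j], axis=1)
    half_sq_diff = 0.5 * (z[i] - z[j]) ** 2
    result = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (dist >= lo) & (dist < hi)
        if in_bin.any():                         # skip bins with no pairs
            result.append((float(dist[in_bin].mean()),
                           float(half_sq_diff[in_bin].mean()),
                           int(in_bin.sum())))
    return result

# Toy data: four points on a line whose value rises steadily with x.
points = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [0.0, 1.0, 2.0, 3.0]
gamma = empirical_semivariogram(points, values, [0.5, 1.5, 2.5, 3.5])
```

In this toy case the semivariance grows with distance, which is the pattern expected when nearby points are more alike than distant ones.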

Beyond mathematics, there is a more "artistic" side to interpolation that a GIS technician must assess: How well are edges captured? How well are gradients represented? What happens in sparsely sampled areas?

IDW and Kriging will be covered in lecture as well as in your textbook (Longley et al., pp. 195-301).



3.0 Get & Examine Data

3.1 Census Data

Create a folder called "Lab5" on your removable disk or in C:\Workspace. Remember, ArcGIS does not like spaces in file/folder names.

Right-click on this file, cen1990sb.e00, and download it (Save Link-Target As) to your "Lab5" folder. (A regular left-click may not work because .e00 is an interchange format file that may automatically launch ArcMap.) 

Using ArcToolbox, extract the cen1990sb coverage from the Interchange File.


To effectively answer the question "Where do most of the people live?" requires a map. Making maps of gradients is easy, but choosing the appropriate method to classify the data is sometimes difficult. Depending on your map's intended purpose you may need to choose one method over another.

Open ArcMap and add the cen1990sb coverage. Open the Layer Properties for this coverage and go to the Symbology tab. 

Under Quantities choose "Graduated colors". Set the Value field as "POP1990", set the Normalization field as "AREA". Click on the Classify button and choose Natural Breaks, set the number of classes to 10 or 15, and click Apply.

By now you have some familiarity with this interface, so experiment with different visualizations of these data. There are many combinations and methods that will produce adequate visualizations of population variations.

Natural Breaks may not be the best classification method for this particular dataset. If you want, make your map according to the method you think does a better job of displaying it, and make a note in your lab about the method you used. You are encouraged to explore on your own because that is the best way to learn about your data. 
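
To see why the choice of classification method matters for skewed data such as population density, the sketch below compares two simple alternatives to Natural Breaks: equal intervals and quantiles. The function names and the density values are made up for illustration; ArcMap computes its class breaks internally and may differ in detail.

```python
import numpy as np

def equal_interval_breaks(values, n_classes):
    """Upper class limits that split the data range into equal-width classes."""
    lo, hi = float(np.min(values)), float(np.max(values))
    return np.linspace(lo, hi, n_classes + 1)[1:]

def quantile_breaks(values, n_classes):
    """Upper class limits that put roughly equal counts in each class."""
    probs = np.arange(1, n_classes + 1) / n_classes
    return np.quantile(values, probs)

# Made-up, strongly skewed densities: most tracts are sparse, a few are dense.
densities = np.array([1, 2, 2, 3, 3, 4, 5, 8, 40, 100], dtype=float)
ei = equal_interval_breaks(densities, 4)
qt = quantile_breaks(densities, 4)
in_first_equal_class = int((densities <= ei[0]).sum())
```

With these values, equal intervals place eight of the ten tracts in the lowest class, while quantiles spread the tracts evenly across classes at the cost of very uneven class widths.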
 
Question 1
What are the units of the population density map you made? (Why are the legend's units so small?)

Do you think this map accurately portrays population density? (Why or why not?)

Use the Select By Attributes query tool in ArcMap to find the items below and indicate them on Map 1.
- the area with the highest population
- the six most populated census tracts in the Santa Barbara South Coast Region
- the three most densely populated areas (POP90_SQMI > 40000)

Map 1 (on one page)

Make a map showing population density for the whole dataset.
- Do not include a legend.

- Include the classification method used to produce the map (Quantiles, Natural Breaks etc) and the appropriate parameters used for that method.

- Highlight on the map the Census tract you live in, and add a text box with some (at least 4) of the Census variables you think are important for describing your area. Include a statistic for population per square mile (divide the population by the area of the polygon in square miles).

[* If you live outside of the study area, draw a box around the area you would like to live in]

- Be sure to include the queries requested in Question 1 above

This is a lot of information to show on one map. Remember that the goal is to maximize the ratio of information to ink, so be judicious about what you include and about the visual organization of the graphic.
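
Population per square mile is the population divided by the area in square miles. The sketch below shows the arithmetic, assuming purely for illustration that the polygon area is stored in square meters; check the actual map units of the coverage before relying on any conversion factor. The function name and the numbers are hypothetical.

```python
SQ_METERS_PER_SQ_MILE = 1609.344 ** 2  # one international mile is 1609.344 m

def people_per_sq_mile(population, area_sq_m):
    """Population density, assuming the area is given in square meters."""
    if area_sq_m <= 0:
        raise ValueError("area must be positive")
    area_sq_mi = area_sq_m / SQ_METERS_PER_SQ_MILE
    return population / area_sq_mi

# A made-up tract of 5,000 people covering half a square mile.
density = people_per_sq_mile(5000, 0.5 * SQ_METERS_PER_SQ_MILE)
```

Dividing by an area expressed in small units (square meters or square feet) instead of square miles gives very small density numbers, which is worth keeping in mind when reading a legend.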

3.2 Combined Census & Parcel Values

This lab examines the relationships between demographics and property values. You have looked at the Census data. Property parcel data is even more voluminous: about 46,700 polygons made from 132,000 arc segments. (The Census data, by contrast, consists of 179 polygons made from 508 arc segments.) For expediency, the two datasets have been preprocessed and combined for you.

The ArcMap Union command (Analysis Tools -> Overlay -> Union) was used to combine the Census data, cen1990sb, with the property parcel values, sbvalues. After the polygons were intersected, the topology was reconstructed. As you can imagine, this was a very computationally intensive task.

To further increase the processing efficiency and reduce the size of the intermediate datasets you will be using, the unioned polygon coverage was converted to a point coverage, removing all of the arc information and only retaining the label points of the intersected polygons. 

Download this file, sbvalcen90pnt.zip, to your "Lab5" folder and extract (unzip) its contents. Next import the *.e00 coverage from the unzipped file. 

Notice that the sbvalcen90pnt.zip file is 1.9MB, the extracted sbvalcen90pnt.e00 file is 12.6MB, and the imported sbvalcen90pnt coverage is about 7MB. The interchange (e00) file is larger than the coverage because it is encoded in ASCII characters for maximum portability; it also contains topology explicitly.

Open ArcMap and add the sbvalcen90pnt coverage to a new map document. Right-click on the layer you just added (sbvalcen90pnt) and choose Open Attribute Table.

 
Question 2
How many features (points) are in sbvalcen90pnt?

How many polygons were in the original sbvalues dataset? How many polygons were in the original cen1990sb dataset?


Question 3
Why are there more features in sbvalcen90pnt than in the two original coverages it was made from?




4.0 Create Data Subsets

Rather than using the entire sbvalcen90pnt coverage for our interpolations, we will create smaller, more manageable subsets of the data to use in this exercise. The procedure will be the same one that you used in Lab 4 for subsetting.  You can close the attribute table if it is still open. 

Go to the Tools menu and select Extensions.

In the Extensions dialog box, make sure that the Geostatistical Analyst and Spatial Analyst extensions are both checked. Click Close. Next, go to View -> Toolbars and make sure that the same two toolbars are also checked there. (If they already appear in the ArcMap window toolbar, you can skip this last step.)

Now we will make subsets of the sbvalcen90pnt coverage. Go to the Geostatistical Analyst menu and choose Create Subsets.

 

First make a 1% random sample from the sbvalcen90pnt coverage and save this in a free-standing geodatabase, according to the dialog box below. Simplify the geodatabase name to "sbvalcen90pnt_sets1" (see illustration below). It may take ArcGIS a while to do this processing; be patient. When prompted to add the results to your map, select No.

Similarly, make a 2% random sample in another geodatabase. Change the geodatabase name to "sbvalcen90pnt_sets2".  When asked if you would like to add the results to your map, select No.
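
Conceptually, creating a subset means drawing a random sample of the records and setting the rest aside. The sketch below illustrates that idea with Python's standard library. It is not how Geostatistical Analyst implements Create Subsets; the function name, the fixed seed, and the split into "training" and "remainder" lists are assumptions for illustration.

```python
import random

def split_sample(records, fraction, seed=0):
    """Randomly split records into a training sample and the remainder."""
    rng = random.Random(seed)                 # fixed seed: repeatable sample
    k = max(1, round(len(records) * fraction))
    chosen = set(rng.sample(range(len(records)), k))
    training = [r for i, r in enumerate(records) if i in chosen]
    remainder = [r for i, r in enumerate(records) if i not in chosen]
    return training, remainder

# Stand-in for a point table: 1,000 record IDs, sampled at 1%.
records = list(range(1000))
training, remainder = split_sample(records, 0.01)
```

Because the points left out of the sample have known values, they can later serve as a check on how well an interpolation predicts.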


 

Close ArcMap without saving and open ArcCatalog. Go to your "Lab5" folder. If ArcCatalog is already open, you may need to refresh the view to see your new geodatabases. To do so, right-click on your Lab5 folder and choose Refresh from the context menu. Right-click on the "sbvalcen90pnt_point_training" subset (feature class) within the sbvalcen90pnt_sets1 geodatabase and choose Export -> To Shapefile (single). Make sure you are exporting from the correct geodatabase and feature class! Set the Output Location as "Lab5", and name the output file sbvalcen90_1.

(Note: Even though the window refers to "feature class", because you are not exporting the file into another geodatabase, it will indeed be saved as a shapefile. Recall from Lab 2 that in a conceptual sense shapefiles are freestanding feature classes.)

Similarly, repeat the Export process for the "sbvalcen90pnt_point_training" subset within the sbvalcen90pnt_sets2 geodatabase, naming the output shapefile sbvalcen90_2.

Once you have created the shapefiles from the 1% and 2% random samples, do some cleaning up in the "Lab5" folder. Still within ArcCatalog, delete both of the geodatabases that we produced while subsetting.
 



5.0 IDW Interpolation

 

The purpose of this section is to explore the capabilities and limitations of IDW as an interpolation method. You will also make some determinations about sample size and reliability: more sample points do not necessarily give a much better result.


IDW Explained  [* read this before proceeding]

 
Question 4
To what does "Power" refer?

What are the options with Search radius? (Variable or Fixed) What is the difference?


5.1 Procedure

Open ArcMap and add the two random sample point shapefiles, sbvalcen90_1 and sbvalcen90_2, to a new map document.

Highlight only sbvalcen90_1 in the ArcMap Table of Contents window. Then, under the Spatial Analyst pulldown menu, go to Interpolate to Raster -> Inverse Distance Weighted.

In the window that opens you are asked to specify the parameters for the Inverse Distance interpolation. The sbvalcen90_1 shapefile should be specified as the Input. The Z-value you will use as the input for IDW is the COMBINEVAL attribute. Name the Output Raster "idw_1" (with "1" designating its source file) and save it in your Lab5 folder.

When you unzipped sbvalcen90pnt.zip, a shapefile called clipbnd was included. This shapefile will be used here to mask the interpolation so that it does not extend beyond the limits of the dataset. In fact, clipbnd only limits the extent of the results as displayed; the interpolation per se covers a full rectangular grid. In the window, check the box next to "Use barrier polylines" and then browse to clipbnd as the file to use.

Set the rest of the parameters to the values you see in the graphic below, and click OK.

When the interpolation has finished, the result grid will be added to your layout with nine colors (the default categorization) in the legend. 

You need to increase the number of classes in the classification in order to see some of the variation in the IDW interpolation. To do this double-click on the IDW grid and open the Layer Properties window. Change the number of classes to 32 and click Apply.

Next click on the Classify button in the Layer Properties window. Choose Natural Breaks in the Classification window and click OK. Click Apply in the Layer Properties window and OK to close it.


 

*** Now repeat the above steps for the sbvalcen90_2 shapefile. ***
 
 
If you try different settings, realize that what you are doing (interpolation) is extremely computationally intensive if you choose parameters that result in larger sets of points in the interpolation kernel. Read about the different options in IDW Explained.

Question 5
What do you think is your major source of error in the IDW interpolation?

- discuss the interpolation method itself as well as what (real estate value) you are trying to predict in terms of potential limitations to the reliability of the interpolation.

Does using more points improve the IDW interpolation? (why or why not?)

IDW Interpolation - Maps 2 & 3 (two sheets)

Make a map of the IDW interpolation for sbvalcen90_1
- make a layout showing the whole extent

- make a zoomed-in view of the downtown area showing some of the interpolated areas in detail (your extent should approximately cover the Mesa, the Harbor, and the Downtown area)

- put on the layout the number of points used and some summary statistics describing the sample

Make a map of the IDW interpolation for sbvalcen90_2
- make a layout showing the whole extent

- make a zoomed-in view of the downtown area showing some of the interpolated areas in detail (your extent should approximately cover the Mesa, the Harbor, and the Downtown area)

- put on the layout the number of points used and some summary statistics describing the sample




6.0 Kriging

Kriging is a statistically rigorous method of prediction. Like IDW, kriging uses a measure of autocorrelation that is based on distance, but kriging also incorporates variability of autocorrelation in the neighborhood of each point.
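
For readers who want to see the mechanics, the following is a minimal sketch of ordinary kriging at a single location with NumPy. It is not the Spatial Analyst implementation. The spherical model, the nugget, sill, and range values, and all function names and sample data are assumptions chosen for illustration. The weights come from solving a small linear system built from the semivariogram, with one extra row that forces the weights to sum to one.

```python
import numpy as np

def spherical(h, nugget, sill, rng):
    """Spherical semivariogram model; the value at zero distance is 0."""
    h = np.asarray(h, dtype=float)
    rising = nugget + (sill - nugget) * (1.5 * h / rng - 0.5 * (h / rng) ** 3)
    gamma = np.where(h < rng, rising, sill)   # flat at the sill past the range
    return np.where(h == 0.0, 0.0, gamma)

def ordinary_kriging(xy, z, target, nugget=0.0, sill=1.0, rng=10.0):
    """Predict the value at target; returns (prediction, weights)."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    n = len(z)
    pair_dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    a = np.ones((n + 1, n + 1))               # last row/column: sum-to-one
    a[:n, :n] = spherical(pair_dist, nugget, sill, rng)
    a[n, n] = 0.0
    b = np.ones(n + 1)
    target_dist = np.linalg.norm(xy - np.asarray(target, dtype=float), axis=1)
    b[:n] = spherical(target_dist, nugget, sill, rng)
    solution = np.linalg.solve(a, b)
    weights = solution[:n]                    # last entry is the multiplier
    return float(weights @ z), weights

# Three made-up sample points and values.
pts = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
vals = [10.0, 20.0, 30.0]
pred, w = ordinary_kriging(pts, vals, (1.0, 1.0))
at_sample, _ = ordinary_kriging(pts, vals, (0.0, 0.0))
```

Changing the model parameters changes the weights, which is one reason the results are sensitive to the choices made in the Kriging dialog.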

Kriging Explained [* read this before proceeding]
 
Question 6
What is the difference between Ordinary and Universal kriging?

What is a variogram?
What are the different Semivariogram models?

6.1 Procedure

With sbvalcen90_1 highlighted, under the Spatial Analyst pulldown menu, go to Interpolate to Raster -> Kriging.

In the window that opens, the sbvalcen90_1 shapefile should be set as the Input. The Z-value you will use for kriging is POP90_SQMI (note the change). Name the Output Raster "krig_1" and save it in your Lab5 folder.

Kriging is somewhat of an art because there are many ways to modify the interpolation based on your knowledge of the variation in the data and on what you are expecting; some luck in hitting the right combination also helps.

To start you off, try the following parameters.

The results should look something like the following screenshot. Kriging does not allow for a boundary mask, as IDW does, so your interpolation may look "weird" where it extends beyond valid data points.

Experiment with different interpolation parameters to try to improve your prediction. The dataset you are working with is not ideal for kriging, but there are combinations of parameters that will produce a realistic grid/surface of POP90_SQMI.

You can judge how well you are doing by loading the sbvalcen90pnt point coverage OR you can have the kriging interpolator "Create variance of prediction".
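
One simple way to put a number on "how well you are doing" is the root-mean-square error (RMSE) between interpolated values and the known values at points that were not part of the sample. The sketch below assumes you have already collected the two lists of numbers by some other means; the function name and the values are hypothetical.

```python
import math

def rmse(predicted, observed):
    """Root-mean-square error between two equal-length lists of numbers."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up densities: interpolated values versus known values at check points.
predicted = [1200.0, 3400.0, 800.0]
observed = [1000.0, 3000.0, 1000.0]
error = rmse(predicted, observed)
```

A lower RMSE for one parameter combination than for another, on the same check points, suggests the first combination predicts better there.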

*** Now repeat the above steps for the sbvalcen90_2 shapefile. ***

   
If you try different settings, realize that what you are doing (interpolation) is extremely computationally intensive, especially kriging. Read about the different options in Kriging Explained.

Question 7
What do you think is your major source of error in the Kriging?

- discuss the interpolation method itself as well as what (population density) you are trying to predict in terms of potential limitations to the reliability of the interpolation

Does using more points improve the Kriging interpolation? (why or why not?)

Question 8
How do these two interpolation methods (IDW and Kriging) perform near the edges of the study area?
(Consider the Help files, the pdf documents, and your observations in formulating your answer.)

Kriging - Maps 4 & 5 (two sheets)

Make a map of the Kriging for sbvalcen90_1
- make a layout showing the whole extent

- make a zoomed-in view of the downtown area showing some of the interpolated areas in detail (your extent should approximately cover the Mesa, the Harbor, and the Downtown area)

- put on the layout the number of points used and some summary statistics describing the sample

Make a map of the Kriging for sbvalcen90_2
- make a layout showing the whole extent

- make a zoomed-in view of the downtown area showing some of the interpolated areas in detail (your extent should approximately cover the Mesa, the Harbor, and the Downtown area)

- put on the layout the number of points used and some summary statistics describing the sample




7.0 Summarize what you have learned
 
Question 9
Compare and contrast IDW and Kriging in an essay (not more than 1 page). Refer to your maps, and describe some of the physical sources of error, drawing on the material covered in your book and the attached pdf documents.



8.0 To turn in
  • The question sheet, with typed answers and essay (Word document)
  • 5 Maps: Census (1), IDW (2), and Kriging (2)

Created by Sean Benison, Sunhui Sim, and Jordan Hastings

Based on previous lab by Sarah Battersby and Jeff Hemphill

UC Santa Barbara, Department of Geography

© 2000-2005 Regents of the University of California

This page was last modified on Feb. 25, 2008 by Indy Hurt