LAB 4: Spatial Analysis
Time for completion: Two Weeks
* Please use the double-sided option when printing [File -> Print, click on Properties, select Print on Both Sides]
Although simple functions such as the cataloging and display of geographic data are very useful, the true power of GIS lies in its capabilities for spatial analysis. In this lab you will gain experience with two of ArcGIS's tools for spatial analysis: Geostatistical Analyst and ModelBuilder.
Exercise I
2.0 Spatial Analysis of Santa Barbara Property Values
As GIS developers have increasingly attempted to make GIS an all-around tool for spatial analysis, they have added functionality beyond the standard geoprocessing abilities usually associated with GIS. An example of this in ArcGIS is the Geostatistical Analyst extension. Geostatistical Analyst enables users to draw upon methods from spatial statistics to analyze their geographic datasets. These include histograms, covariance plots, Voronoi maps, etc. In this exercise, you will use one of the functions of Geostatistical Analyst -- trend analysis -- to gain insight into the factors affecting property values in Santa Barbara.
Create a folder on your removable disk (or in C:\Workspace) called "Lab4" and download the datasets below. You will be using two datasets for Exercise I. The first one is a polygon coverage of owned parcels in the Santa Barbara South Coast Region with an attribute for their value, and the second is a digital elevation model of the same area.
Download sbvalues.zip to your "Lab4" folder and extract the contents.
Download usgs30mdem.zip to your "Lab4" folder and extract the contents.
After you have extracted the data files, you should have two e00 files in "Lab4" (sbvalues.e00 and usgs30mdem.e00). In the next step you will import these files.
From ArcCatalog, open ArcToolbox and go to Coverage Tools -> Conversion -> To Coverage, then double-click on Import from Interchange file. A dialog box will open asking you to specify the Feature Type, Input file and the Output dataset. The feature type will be AUTO each time you run the tool. The inputs are sbvalues.e00 and usgs30mdem.e00, and each output should have the same name (and path) as its input, minus the .e00 extension.
Note: This particular tool is not available with the student license of ArcGIS Desktop.
In ArcCatalog you should now see two spatial
datasets in "Lab4": A grid called "usgs30mdem" and
a coverage called "sbvalues".
For this lab we will be working with some relatively large datasets. First you will adjust a setting.
Open ArcMap and add the usgs30mdem dataset. Click Yes if prompted to create pyramids. This option will speed up the rendering of the raster dataset when you change extents. Be patient; the data may take a moment to appear.
Next, add the sbvalues coverage. You may get a warning about an inconsistent extent. Carefully read the warning and click OK to dismiss it for now. You may also see an additional geographic coordinate system warning. Carefully read this warning as well and click Close. Notice that you are unable to see the sbvalues polygons in the map display. The warnings indicate an incompatible extent.
Right-click the usgs30mdem and choose Data -> View Metadata from the context menu. Navigate to the Spatial tab of the metadata window and expand the Details for the dataset. Next, open the metadata for the sbvalues coverage and compare the horizontal coordinate system information listed for both datasets. Here you will discover that the geographic coordinate system of the sbvalues coverage is undefined, listed as GCS_User_Defined. These two datasets were likely meant to occupy roughly the same extent and use the same spatial reference information. In prior releases of the software, ArcMap would allow datasets to render without adequate spatial reference information. This is no longer the case.
In this next step, you will assign the same spatial reference information recorded for the usgs30mdem to the coverage. Close ArcMap without saving. If necessary, open ArcCatalog and right-click the sbvalues coverage. Select Properties from the context menu and choose the Projection tab. With the Display Spatial Reference option selected, click the Define button. The following warning will appear. Click OK.
A Define Projection Wizard dialog box will appear. Select the bottom option to Define a coordinate system for my data to match existing data - Matches the coordinate system of an existing coverage or grid. See graphic below:
On the next screen, which asks you to choose a dataset with the coordinate system you want to use, browse to the usgs30mdem grid, select it, click Open, and then click Next. The final screen should look like the graphic below:
Right-click on the "sbvalues" coverage and choose Open Attribute
Table.
You should have the attribute tables of
sbvalues and usgs30mdem open now in separate windows.
Close the statistic window. Under the Selection pulldown menu, choose Select By Attributes.
With the Select By Attributes window open,
separately query for "COMBINEVAL" > 750000, then for "COMBINEVAL" > 1000000,
and finally for "COMBINEVAL" > 60000000. Make
sure these are individual queries rather than cumulative ones.
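The difference between individual and cumulative queries can be sketched outside of ArcMap. The following is a minimal Python illustration, not the way Select By Attributes is implemented; the parcel records and the helper name select_by_value are made up for the example.

```python
# Hypothetical parcel records; the values are invented for illustration.
parcels = [
    {"id": 1, "COMBINEVAL": 420_000},
    {"id": 2, "COMBINEVAL": 810_000},
    {"id": 3, "COMBINEVAL": 1_250_000},
    {"id": 4, "COMBINEVAL": 75_000_000},
]


def select_by_value(records, threshold):
    """Return a fresh selection of records whose COMBINEVAL exceeds threshold."""
    return [r for r in records if r["COMBINEVAL"] > threshold]


# Every query starts again from the full table, so each selection is
# individual rather than a refinement of the previous one.
thresholds = (750_000, 1_000_000, 60_000_000)
selections = {t: select_by_value(parcels, t) for t in thresholds}

for t, selected in selections.items():
    print(f"COMBINEVAL > {t}: {len(selected)} parcel(s) selected")
```

In ArcMap the equivalent is to choose "Create a new selection" as the method for each query, so that an earlier result never limits a later one.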
In this portion of the lab you will use
but one of many statistical functions built in to ArcGIS. This capability
is very useful for science as well as business because with these tools
you can uncover hidden relationships and produce new knowledge.
In ArcMap under the Tools pulldown menu (located along the top row of options in ArcMap) go to Extensions. In the Extensions dialog box make sure that Geostatistical Analyst is checked. Next, go to View -> Toolbars and select Geostatistical Analyst from the side menu to load the toolbar.
With the sbvalues coverage highlighted, go to the Geostatistical Analyst pulldown button and choose Create Subset.
In the first window that opens, 'sbvalues polygon' should be entered as the Input Layer you want to subset. Click Next. In the next window you need to move the slider over to the left until you have a 2% Training sample (fewer than 1000 polygons).
This will produce two new datasets contained in a geodatabase. The output path for the two subsets and the geodatabase should point to your "Lab4" folder, and the subsets should be named appropriately. Check to make sure this is so. Click Finish and ArcGIS will randomly subset the polygon coverage. This could take a while, so be patient. After it is done, a window will open asking if you want to add the new datasets to the layout. Click Yes.
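What Create Subset does conceptually is a random split of the features into a small training set and a larger test set. The sketch below shows that idea in plain Python. It is an assumption-laden illustration, not the Geostatistical Analyst algorithm; the function name create_subset, the parcel count of 40,000, and the seed are all hypothetical.

```python
import random


def create_subset(features, training_fraction, seed=None):
    """Randomly split features into (training, test) lists."""
    rng = random.Random(seed)
    n_train = round(len(features) * training_fraction)
    # Pick which positions go into the training set, without replacement.
    train_idx = set(rng.sample(range(len(features)), n_train))
    training = [f for i, f in enumerate(features) if i in train_idx]
    test = [f for i, f in enumerate(features) if i not in train_idx]
    return training, test


parcel_ids = list(range(40_000))  # hypothetical number of parcels
training, test = create_subset(parcel_ids, 0.02, seed=176)
print(len(training), "training features,", len(test), "test features")
```

Every feature ends up in exactly one of the two outputs, which is why the lab later hides the test subset and works only with the training subset.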
Uncheck the usgs30mdem, sbvalues and the newly created "sbvalues_polygon_test" coverage. Leave only the "sbvalues_polygon_training" coverage checked so you can see the distribution of polygons in our sample.
With "sbvalues_polygon_training" highlighted, click on the Geostatistical Analyst button, go down to Explore Data and then choose Trend Analysis.
This will open a new window. Along the bottom of the window you need to change the Attribute to COMBINEVAL (the real estate values). You should first modify the Trend Analysis display to make it more readily interpretable. In the Graph Options listing, uncheck Sticks and Input Data Points. You can also uncheck the Legend (upper right) as it is unnecessary. The 3D graph can also be rotated using the wheels along the lower right corner of the graph window. Click the Add to Layout button and the graph will be embedded in your ArcMap layout.
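The curves drawn on the side walls of the Trend Analysis graph summarize how the attribute changes along the east-west and north-south directions. As a rough sketch of that idea, the Python below fits a straight line to value against each coordinate. Restricting the fit to first order is a simplifying assumption, and the sample coordinates and values are invented.

```python
def linear_trend(coords, values):
    """Least-squares line: values ~ intercept + slope * coords."""
    n = len(coords)
    mean_c = sum(coords) / n
    mean_v = sum(values) / n
    sxx = sum((c - mean_c) ** 2 for c in coords)
    sxy = sum((c - mean_c) * (v - mean_v) for c, v in zip(coords, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_c
    return intercept, slope


# Hypothetical parcel centroids (easting, northing in km) and COMBINEVAL values.
samples = [
    (0.0, 0.0, 300_000),
    (1.0, 2.0, 500_000),
    (2.0, 1.0, 450_000),
    (3.0, 3.0, 750_000),
]
eastings = [s[0] for s in samples]
northings = [s[1] for s in samples]
values = [s[2] for s in samples]

# One fit per direction, mirroring the two projected trend lines in the graph.
_, slope_east = linear_trend(eastings, values)
_, slope_north = linear_trend(northings, values)
print(f"east-west trend:   {slope_east:,.0f} per km")
print(f"north-south trend: {slope_north:,.0f} per km")
```

A positive slope in one direction means values tend to rise as you move that way; comparing the two directions against the elevation model is one way to start thinking about what drives property values.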
Exercise II
3.0 Developing a Predictive Model with ModelBuilder
3.1 Introduction and Background
ModelBuilder provides a graphic modeling framework for geoprocessing tasks. Traditionally, a GIS user faced with the task of having to run a series of geoprocessing operations would simply load each related tool independently and run them one at a time. If the user wanted to repeat the same procedure, he or she would have to start all over again and repeat the steps. Likewise, no one else could easily repeat the procedure without being told by the original user what the steps were. ModelBuilder provides a means of creating a flow diagram that represents all of the operations that make up a procedure (including their inputs and outputs) and allows them to be repeated simply by running the ModelBuilder construct again.
ModelBuilder offers three main benefits: 1) It records all of the steps involved in a procedure; 2) It allows the procedures to be easily repeated and shared with others; and 3) It provides a visual representation to help with understanding what is going on in the procedure.
In this exercise, you will use ModelBuilder for exploratory analysis designed to aid in the creation of a predictive model for Mayan archaeological sites. In the context of archaeology, a predictive model is "a tool that indicates the probability of encountering an archaeological site anywhere within a landscape" (from Minnesota Archaeological Predictive Model). Developing a predictive model essentially consists of trying to determine the logic and preferences in site selection of the people who built the archaeological sites in question. Using GIS, one can examine a set of environmental factors to see if any of their possible combinations seem to be repeatedly associated with a type of archaeological site. These insights can then be used to predict where other sites are likely to exist in the landscape.
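The idea of recording a procedure once and re-running it, which is what ModelBuilder offers, can be sketched in a few lines of Python. This is only an analogy: the run_model function, the stand-in "tools" clip and count, and the sets of feature IDs are all hypothetical and have nothing to do with how ModelBuilder is implemented.

```python
def run_model(steps, datasets):
    """Run each (tool, input_names, output_name) step in order.

    Outputs are stored back into datasets so later steps can use them.
    Returns a log of what was run, similar to reading a flow diagram.
    """
    log = []
    for tool, input_names, output_name in steps:
        inputs = [datasets[name] for name in input_names]
        datasets[output_name] = tool(*inputs)
        log.append(f"{tool.__name__}({', '.join(input_names)}) -> {output_name}")
    return log


# Stand-in "tools" operating on plain sets of feature IDs.
def clip(features, boundary):
    return features & boundary


def count(features):
    return len(features)


# The recorded procedure: each output feeds the next step by name.
model = [
    (clip, ["rivers", "outline"], "rivers_clip"),
    (count, ["rivers_clip"], "river_count"),
]

data = {"rivers": {1, 2, 3, 4}, "outline": {2, 3, 9}}
for line in run_model(model, data):
    print(line)
print("river_count =", data["river_count"])
```

Because the steps live in one list, anyone can re-run the same procedure on different inputs without being told what the steps were.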
To illustrate, imagine you want to find new archaeological sites with artifacts from our local Chumash tribes. Say that through exploratory analysis of previously-known sites you discover that they seemed to prefer occupying locations that are near the ocean and close to glades where white sage tends to grow. You could then analyze a region to see which areas in it meet those criteria, and direct your field searches to them in order to increase the likelihood that your archaeological dig will in fact find something.
A warning regarding "model":
You may become confused by the repeated use of the term "model" in this lab. Unfortunately, ESRI chose the name "model" for the diagrams created by ModelBuilder, even though the procedures they are used for will often have absolutely nothing to do with scientific modeling! To help keep things clearer, the diagrams created by ModelBuilder will be referred to here as "Models" (capital M), with all other senses of the word being shown in lowercase letters.
Download maya_gd.zip to your "Lab4" folder and extract the "Maya" geodatabase from the file. All datasets except the soils layer are taken from the UCSB Maya Forest GIS, a project of the MesoAmerican Research Center. The soils dataset was created by Jorge Sifuentes, UCSB Geography.
Before moving on to the ModelBuilder procedures, we must first prepare some of the datasets to be used. Open ArcMap with a new empty map if you have not already done so, and add all of the feature classes from the "Maya" geodatabase to your map.
Step 1: Sample archaeological sites
To randomly sample the archaeological sites, we will use the same procedure that we did to sample the Santa Barbara real estate parcels. (If you have closed the Geostatistical Analyst toolbar since then, please turn it on again.) First highlight only "arch_sites" in the Table of Contents, then go to the Geostatistical Analyst pulldown button and choose Create Subset. In the first window that opens, "arch_sites" should be entered as the Input Layer you want to subset. Click Next. Move the slider in the Create Subsets window over to the left until you have a training set size of 28 samples, or 14%. Make sure that the Output Personal Geodatabase is set to your "Maya.mdb" geodatabase (otherwise it will put the subsets into a new, separate one).
Click Finish when everything is properly set. When prompted to add the new data layers to the layout, click Yes. To help prevent confusion later on, remove "arch_sites" and "arch_sites_test" from the Table of Contents. Leave "arch_sites_training" in the listing.
Step 2: Create clip outline
We will now create a new feature class consisting of the boundary of our northern Guatemala study area. Make sure that only "country_bnd" is checked and visible. Using the Selection Features tool from the Tools toolbar, select the feature in "country_bnd" that outlines the study area.
Now right-click on "country_bnd". Choose Data -> Export Data from the pop-up menu. Make sure that the layer is saved as a feature class and will be in your "Maya.mdb" geodatabase. Name the feature class "guat_outline". Click OK, then click Yes when asked about adding the layer to the display. Remove "country_bnd" from the Table of Contents to avoid clutter, as we will no longer need it.
This next exercise must be completed in lab because a number of the tools are unavailable with the student license. The following procedure will guide you through the process of building a Model that will only be saved if you save the map document. If you close ArcMap without saving, your Maya Toolbox and Model cannot be recovered. Models can be saved in one of three places: a toolbox created inside of ArcToolbox and saved in a particular ArcMap document, a toolbox created in ArcToolbox and accessed through ArcCatalog, or a toolbox created inside of a geodatabase.
ModelBuilder is accessed through ArcToolbox -- it is one of the ways through which users can customize ArcToolbox for their own needs. A ModelBuilder Model can in fact be thought of as an ArcTool created by the user. The basic procedure for creating a new, empty Model is to first create a new Toolbox, and then create a new Model within that Toolbox.
Open ArcToolbox. Right-click on the ArcToolbox heading at the very top of the list and choose New Toolbox from the pop-up menu. Rename the new toolbox that appears in the listing to "Maya Toolbox" and it will now be listed alphabetically in your list of existing toolboxes. Next, right-click on Maya Toolbox and choose New -> Model.
A blank ModelBuilder window will open up. We will be dragging tools from ArcToolbox into this window to build the Model.
We are now ready to build our Model in ModelBuilder. Keep in mind that we are simply designing the Model at this stage -- we will not actually be running any procedures until the entire Model is completed.
Step 1: Clipping the Rivers feature class
The first step in the Model is to clip the rivers layer down to just the region of northern Guatemala on which we are focusing. Find the Clip tool under Analysis Tools -> Extract and click-and-drag it over into the Model window. A rectangle labeled "Clip" connected by a line to an oval labeled "Output" will appear in the window.
Double-click on the "Clip" rectangle to open up its settings window. Set "rivers" as the input layer and "guat_outline" as the clip layer. Keep the default name of "rivers_Clip" for the output feature class. Click OK when finished. Your ModelBuilder window should now update to look like the following illustration:
Inputs are depicted as blue ovals, tool functions as yellow rectangles, and outputs as green ovals. ModelBuilder will automatically change new components of the Model to this color scheme as soon as their settings are properly configured.
Step 2: Measure proximity of sites to rivers
Now that the rivers feature class has been reduced to our study area, we will add a procedure for determining how close the archaeological sites are located to rivers. For this we will be using the Near tool, which calculates the distance of points from line features within a given search radius. The Near tool is not available with the student license. This portion of the lab will need to be completed on campus.
Find the Near tool under Analysis Tools -> Proximity and click-and-drag it over into an open area in the Model window. A rectangle labeled "Near" connected by a line to an oval labeled "Output" will appear in the window, as happened with Clip.
Since there are now other objects in the ModelBuilder window, we can set some of the parameters of the Near tool graphically. "rivers_Clip" is the layer from which we will be measuring distance, and because it is present in the window, we can designate it as feeding into the procedure by connecting it graphically with the "Near" rectangle. Go to the top of the ModelBuilder window, click on the Add Connection button, and draw a connection between "rivers_Clip" and the "Near" rectangle.
Double-click on the "Near" rectangle to open up its settings window. You will see that "rivers_Clip" is already set as the layer for Near Features, as you should expect. Set "arch_sites_training" as the input layer. (The results will be added to the existing "arch_sites_training" feature class rather than to a new output file, so there is no place to specify one.) Enter a search radius of 5000 meters. Click OK when finished.
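The measurement that Near performs, the distance from a point to the closest line feature within a search radius, can be sketched in plain Python. This is an illustration of the geometry only, not ESRI's implementation. It assumes planar coordinates in meters, and it follows this handout's description of the output by recording 0 when no river lies within the radius; the function names and sample coordinates are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Position of the closest point along the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def near_dist(point, polylines, search_radius):
    """Distance to the nearest polyline, or 0 if none is within the radius."""
    best = min(
        (
            point_segment_distance(point, line[i], line[i + 1])
            for line in polylines
            for i in range(len(line) - 1)
        ),
        default=math.inf,
    )
    return best if best <= search_radius else 0


# One hypothetical river running 10 km due east from the origin.
rivers = [[(0, 0), (10_000, 0)]]
for site in [(2_000, 3_000), (5_000, 8_000), (12_000, 0)]:
    print(site, "->", near_dist(site, rivers, 5_000))
```

Note that under this convention a 0 is ambiguous between "no river within 5 km" and a site lying exactly on a river, which is worth keeping in mind when reading the results table.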
The Near tool will determine which points in "arch_sites_training" are within a 5 km distance of a river. For points that are 5 km or less from a river, the specific distance (in meters) between the point and the river will be entered in the attribute table of "arch_sites_training" under the new NEAR_DIST column. For points that are not, a value of 0 will be entered in the same attribute table column instead.
Step 3: Combine vegetation & soils feature classes
The next few steps in constructing our Model involve determining which environmental attributes surround each of our archaeological sites. Conceptually, we simply need to find out in which vegetation, soils, and aspect polygons the sites reside. In practice, however, things are a little more complicated than this. First, there is no single ArcGIS geoprocessing tool that can determine the answer we need in just one step, so our analysis requires a multi-step procedure. Second, all of these layers must be combined into a single feature class before we can analyze them with the archaeological sites data. Third, although you could use the Union tool (or even Intersect) to combine the environmental attribute layers with one another, you must use the Identity tool when you bring the archaeological sites data into the analysis. The Identity tool is yet another advanced analysis tool that is not available with the student license.
We will first combine the forest vegetation and soils layers into a single feature class. The Identity tool will be used in this and the following step (even though the sites are not yet involved) for simplicity and consistency.
Find the Identity tool under Analysis Tools -> Overlay and click-and-drag it over into an open area in the Model window. (This part of the Model will be a separate branch for now, so it need not be close to the other objects.) Double-click on the "Identity" rectangle to open up its settings window. Set "guatfor" as the Input Features and "soils" as the Identity Features layer. Under Output Feature Class, change the output filename to "gfor_soils".
Click OK when finished. "gfor_soils" will now be shown as the active (green) output of Identity in the ModelBuilder window.
Step 4: Add aspect data
We will now combine the aspect layer with the output from Step 3 into a single feature class.
Find the Identity tool under Analysis Tools -> Overlay and click-and-drag it over into an open area in the Model window (near the objects created in the previous step). Using Add Connection, draw a connection between "gfor_soils" and the "Identity(2)" rectangle. Double-click on the "Identity(2)" rectangle to open up its settings window. Set "aspect" as the Identity Features layer. Under Output Feature Class, change the output filename to "gfor_soils_asp".
Step 5: Determine site attributes
Now that the three environmental attribute layers have been combined into a single feature class, we can determine the surrounding attributes for the archaeological sites.
Find the Identity tool under Analysis Tools -> Overlay and click-and-drag it over into an open area in the Model window (near the objects created in the previous step). Using Add Connection, draw a connection between "arch_sites_training" and the "Identity(3)" rectangle, and between "gfor_soils_asp" and the "Identity(3)" rectangle. Double-click on the "Identity(3)" rectangle to open up its settings window. Under Output Feature Class, change the output filename to "arch_sites_ID".
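For point inputs, what Identity does in Step 5 amounts to finding the polygon that contains each site and copying that polygon's attributes onto the site. The Python sketch below shows this with a simple ray-casting test. It is a conceptual illustration, not the ArcGIS algorithm; the square polygons, the attribute values, and the function names are all invented.

```python
def point_in_polygon(point, ring):
    """Ray-casting test; ring is a list of (x, y) vertices (not closed)."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Only edges that straddle the point's y value can be crossed
        # by a horizontal ray cast to the right of the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def identity(points, polygons):
    """Copy the containing polygon's attributes onto each point.

    Points that fall outside every polygon are kept with no new attributes.
    """
    result = []
    for pt in points:
        row = dict(pt)
        for poly in polygons:
            if point_in_polygon(pt["xy"], poly["ring"]):
                row.update(poly["attrs"])
                break
        result.append(row)
    return result


# Two hypothetical polygons from a combined vegetation/soils/aspect layer.
env_polygons = [
    {"ring": [(0, 0), (10, 0), (10, 10), (0, 10)],
     "attrs": {"DESC_": "forest type A", "R_FERT": 1, "ASP_CODE": 2}},
    {"ring": [(10, 0), (20, 0), (20, 10), (10, 10)],
     "attrs": {"DESC_": "forest type B", "R_FERT": 3, "ASP_CODE": 4}},
]
sites = [
    {"site": "s1", "xy": (5, 5)},
    {"site": "s2", "xy": (15, 5)},
    {"site": "s3", "xy": (30, 30)},
]
for row in identity(sites, env_polygons):
    print(row)
```

Keeping the sites that match no polygon, rather than dropping them, is the behavior that distinguishes Identity from Intersect.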
Step 6: Simplify output table
At this point, we are effectively done with all the necessary processing to get our desired information about the archaeological sites. However, if we were to examine the attribute table of "arch_sites_ID" at this stage, we would find it a little difficult to pick out the specific information in which we are interested. Due mainly to using the Identity tool, the attribute table now has many more attribute columns in it than the four that we want to see (distance from a river, forest vegetation type, soil fertility, and aspect). To make things easier, we can use the Frequency tool to extract only those attributes that we want and save the results in a separate (non-spatial) table. Frequency also counts up the number of incidents of each case (i.e. their frequency), which is very helpful when dealing with datasets that are too large to count up manually. Like the previous tools used in the Model, the Frequency tool also requires a higher license level and will not be available with the student version of the software.
Find the Frequency tool under Analysis Tools -> Statistics and click-and-drag it over into an open area in the Model window. Using Add Connection, draw a connection between "arch_sites_ID" and the "Frequency" rectangle. Double-click on the "Frequency" rectangle to open up its settings window. "arch_sites_ID" is already set as Input Table. Accept the default Output Table filename. Under Frequency Field(s), scroll down and checkmark NEAR_DIST, DESC_, R_FERT, and ASP_CODE.
Click OK when finished. Because the Frequency table is the end product of our Model, we should set it to be added to the display after it is created so that we can examine it. Right-click on the green "arch_sites_ID_Frequency" output oval, UNcheck Intermediate, and check Add to Display.
Step 7: Clean up the Model display
At this point your ModelBuilder window is probably looking a little messy. Because one of the purposes of a ModelBuilder Model is to serve as a visual aid for understanding the procedures involved, you should take a moment to straighten up and organize your Model before we continue.
You can, of course, simply move around the various objects manually. Be sure to switch to the Select tool, depicted with a black arrow, so you do not inadvertently add unintended connections. There are, however, two different tools provided by ModelBuilder to help organize the display. One is the Overview Window, accessed via the Window menu. This will bring up a small window showing you a miniature overview of your entire Model. By adjusting the bounding box in the Overview Window you can shrink or expand the size of the objects in the display. The second organizing tool is Auto Layout, located next to the Add Data button. Auto Layout will automatically organize the objects in the display; however, it prefers to stretch the Model out horizontally, which makes it difficult to view large and intricate Models in their entirety.
Regardless of whether or not your cleaned-up Model matches this specific layout pattern, it should contain all of the elements shown in the illustration below:
When you are done organizing the ModelBuilder window, export it to a graphic by choosing Model -> Export -> To Graphic. Choose JPEG and save it in your Lab4 folder.
Step 8: Run the Model
We are now finally ready to run the Model. As a precaution, first make sure that ArcCatalog is not currently running on your PC -- if it is open while you run the Model, ModelBuilder may encounter problems. Click the Run button on the ModelBuilder toolbar. When the Model run is completed, click Close on the dialog box.
The "arch_sites_ID_Frequency" table should now appear in the Table of Contents listing. You will need to select the Source tab at the bottom of the Table of Contents in ArcMap to see the stand-alone table. (If the table is not listed on the Source tab, add it to the display from the "Maya" geodatabase.) Right-click on it and choose Open to examine the results table.
Look through the table records. FREQUENCY lists the number of occurrences of each configuration of the four other attributes (e.g. a value of 2 means that two of the sites from "arch_sites_training" had the attribute configuration listed in that row). NEAR_DIST contains the proximity of the given site to a river -- either the distance to the nearest river in meters, or a 0 if the site is not within 5 km of any river. The three environmental variables derived from our Identity operations are DESC_ (forest vegetation type), R_FERT (soil fertility), and ASP_CODE (aspect). You will notice that while the DESC_ field has meaningful descriptions of the vegetation type, the latter two have only cryptic numbers in their records. Here are keys (i.e. metadata) explaining the meaning of the values in R_FERT and ASP_CODE:
Try to develop some conclusions regarding what combinations of characteristics (if any) seem to be the most commonly occurring. How useful the Frequency column itself will be in this is unfortunately unpredictable -- it depends entirely on the variability of your particular sample. Moreover, because the NEAR_DIST field mostly contains specific, detailed distance measurements there will probably be very few recurring identical values in the column, and thus you will get many Frequency counts of 1. You will probably find it more useful to
sort the individual columns and count up the number of occurrences manually.
Do this by right-clicking on the column headings in the table and selecting
either Sort Ascending or Sort Descending. Remember, however, not to look
at each environmental attribute in complete isolation! Although you do
need to start by looking at the attribute columns individually, your ultimate
goal is to find combinations of attributes.
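The counting described above can also be sketched in Python, which makes it easy to see why including NEAR_DIST produces so many counts of 1. The rows below are invented stand-ins for the "arch_sites_ID" table, and the frequency helper is a hypothetical name; the sketch only mimics what the Frequency tool reports and is not its implementation.

```python
from collections import Counter

# Hypothetical rows standing in for the "arch_sites_ID" attribute table.
site_rows = [
    {"NEAR_DIST": 1234.5, "DESC_": "forest type A", "R_FERT": 1, "ASP_CODE": 2},
    {"NEAR_DIST": 0.0, "DESC_": "forest type A", "R_FERT": 1, "ASP_CODE": 2},
    {"NEAR_DIST": 310.2, "DESC_": "forest type B", "R_FERT": 3, "ASP_CODE": 2},
    {"NEAR_DIST": 0.0, "DESC_": "forest type A", "R_FERT": 1, "ASP_CODE": 4},
]


def frequency(rows, fields):
    """Count how often each combination of values in fields occurs."""
    return Counter(tuple(row[f] for f in fields) for row in rows)


# With NEAR_DIST included, nearly every combination is unique.
all_four = frequency(site_rows, ["NEAR_DIST", "DESC_", "R_FERT", "ASP_CODE"])

# Counting the environmental attributes together reveals recurring
# combinations rather than treating each attribute in isolation.
env_only = frequency(site_rows, ["DESC_", "R_FERT", "ASP_CODE"])

# A single column, equivalent to sorting it and counting by hand.
by_aspect = frequency(site_rows, ["ASP_CODE"])

print("largest count with NEAR_DIST:", max(all_four.values()))
print("most common environment:", env_only.most_common(1))
print("aspect counts:", dict(by_aspect))
```

Dropping or grouping the distance values before counting is one possible way to look for combinations; how you group them is a judgment call that depends on your particular sample.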
Created by Sean Benison. Based on previous lab by Sarah Battersby and Jeff Hemphill.
This page was last modified on Feb. 11, 2008 by Indy Hurt.