LAB 2: GIS Data Models
Time for completion: One Week
* Please use the double-sided option when printing [File -> Print, click on Properties, select Print on Both Sides]
1.0 Purpose
To gain a clear understanding of what
a data model is, and why data models are important.
To learn the data models ESRI supports
in ArcGIS, and the similarities and differences between them.
To learn the advantages and disadvantages
of using certain data models for different tasks.
To reinforce basic ArcGIS skills.
2.0 Introduction and background
Basic Concepts
You will notice some diversity in these
definitions, as they are in the context of different companies, software,
times, and degrees of specificity. For this lab, focus on the hierarchy
described in the main body of the lab.
There are three basic spatial data types
used with GIS (points, lines, and areas):
* Points represent anything that
can be described as a discrete x, y location
* Lines represent anything having a length
* Areas, or polygons, describe anything having boundaries
These data types comprise the vector model,
which is the model you will deal with most often in GIS.
Vector data model:
Discrete features, such as customer locations, are usually represented
using the vector model. Features can be discrete locations or events,
lines, or areas. Lines, such as streams or roads, are represented as a
series of coordinate pairs. Areas are defined by borders, and are represented
by closed polygons. When you analyze vector data, much of your analysis
involves working with (summarizing) the attributes in the layer's data
table.
Raster data model:
Continuous numeric values, such as elevation, and continuous categories,
such as vegetation types, are represented using the raster model. The
raster data model represents features as a matrix/lattice of cells in
continuous space. A point is one cell, a line is a continuous row of cells,
and an area is represented as continuous touching cells.
Tabular data:
Contain information describing a map feature in the form of a table or
spreadsheet. For example, a GIS database of customer locations may be
linked to address and personnel information. GIS links this tabular data
to associated spatial data.
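The raster description above (a point is one cell, a line is a run of cells, an area is a block of touching cells) can be made concrete with a minimal Python sketch. This is not ArcGIS code: the cell size, the coordinates, and the helper names (`cell_of`, `rasterize_line`, `rasterize_box`) are all assumptions invented for illustration.

```python
import math

CELL = 1.0  # assumed cell size in map units

def cell_of(x, y):
    """Return the (row, col) of the cell containing map location (x, y)."""
    return int(y // CELL), int(x // CELL)

def rasterize_line(x0, y0, x1, y1):
    """Cells touched when the segment is sampled at half-cell steps (a simple approximation)."""
    length = math.hypot(x1 - x0, y1 - y0)
    n = max(1, int(length / (CELL / 2)) + 1)
    cells = set()
    for i in range(n + 1):
        t = i / n
        cells.add(cell_of(x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return cells

def rasterize_box(xmin, ymin, xmax, ymax):
    """Cells whose centers fall inside an axis-aligned rectangular area."""
    cells = set()
    for r in range(int(math.ceil(ymax / CELL))):
        for c in range(int(math.ceil(xmax / CELL))):
            cx, cy = (c + 0.5) * CELL, (r + 0.5) * CELL
            if xmin <= cx <= xmax and ymin <= cy <= ymax:
                cells.add((r, c))
    return cells

point_cells = {cell_of(2.3, 4.7)}                 # a point becomes a single cell
line_cells = rasterize_line(0.5, 2.5, 4.5, 2.5)   # a line becomes a run of cells
area_cells = rasterize_box(1.0, 0.0, 3.0, 2.0)    # an area becomes touching cells
```

Notice that the exact coordinates are lost once the features are converted: only the cell they fall in is kept, which is one of the costs of the raster model for discrete features.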
For more information, see GI data models [from Universidade Nova de Lisboa]
Question 1:
Give an example of how a continuous phenomenon can be represented
using the vector data model.
Geographic Data Modeling: An Introduction
Data Model - An abstraction of the
real world which incorporates only those properties thought to be relevant
to the application at hand, defines specific groups of entities and their
attributes, and states relationships between these entities. A data model
is independent of a computer system. [Association
for Geographic Information]
Data models are a crucial concept for GIS
users to understand. Data models describe how geographic features will
be represented in the GIS. Any time you wish to deal with geographic data
in a computing environment, you must choose a geographic data model by
which to do it. The choice of data model will yield benefits in terms
of simplifying real-world features enough to deal with them easily, but
will also incur costs in terms of oversimplifying or misrepresenting different
aspects of them in the process.
A paper map is an example of an analog
data model -- it is a formalized framework that cartographers use to capture
and represent information about a landscape on a sheet of paper. The same
sort of thing is also needed to capture and represent geographic information
when the medium is digital rather than ink-and-paper. In a GIS, abstractions
of real-world features must therefore be formalized into a data model
that defines how the computer will represent and manage the geographic
information (geometry and attributes).
Bernhardsen (1999) diagrams the data model
formalization process along these lines:

Figure 1: The modeling process. (after Bernhardsen 1999, p.39.
Map graphics from www.gis.com)
Most of the confusion about data models
arises from their diversity. Some data models are more abstract/theoretical
while others are made with specific database frameworks in mind. For example,
the vector data model and the raster data model are very general, whereas
the georelational data model and geodatabase data model are made to fit
specific categories of database software. Furthermore, a given data model
may belong to more than one category: a coverage is both a vector data
model (general) and a georelational data model (database specific).
The many types of data models are easier
to think about if one pictures them as part of a general hierarchy.
Below is a figure showing the hierarchy of ArcGIS's data models:
Figure 2: Hierarchy of ESRI's ArcGIS
data models.
The data models go from most general at
the top level (vector, raster, TIN) to most specific at the bottom level
(shapefile, coverage, geodatabase). It is important to note that a geodatabase
can handle all three general models, not just the vector model.
Geographic data models have evolved under
the influences of technology (e.g., increasing storage space and processing
power, networking, or software evolution) and even history (e.g., ESRI
introduced the "coverage" data model in 1980).
Every GIS software package will be capable
of supporting a number of data models. The capabilities of the data models
may change with new versions of the software, and compatibility issues
may arise between different GIS software, and even between different versions
of the same software. Certain functions will be accessible using data
in the form of one data model but not another.
Data Structures vs. Data Models
Once we have decided on a data model to
use, there remains the question of how to actually store this model in
the computer. The specific format to be used for storing it is known as
the data structure. To illustrate, consider a basic vector
data model. The vector model represents features as consisting of lines
which individually link together a start node, vertices in between, and
an end node. To draw and analyze features represented this way, the computer
needs information on the locations of each node and vertex of the lines.
This could be provided in the form of a table listing the coordinates
of these points, and indicating which line(s) go through them. This table
would be the basic data structure. Coverages and shapefiles
use this type of structure.
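As a rough illustration of such a table, the Python sketch below stores a few made-up points and two made-up lines that run through them. The names `points`, `lines`, `line_length`, and `lines_through` are hypothetical; real coverage and shapefile files are binary and considerably more elaborate.

```python
import math

# Point table: id -> (x, y). All coordinates are invented for illustration.
points = {
    1: (0.0, 0.0),  # start node of line A
    2: (3.0, 4.0),  # vertex of line A
    3: (6.0, 4.0),  # node shared by lines A and B
    4: (6.0, 0.0),  # end node of line B
}

# Line table: id -> ordered point ids (start node, vertices in between, end node).
lines = {
    "A": [1, 2, 3],
    "B": [3, 4],
}

def line_length(line_id):
    """Sum the straight-line distances between consecutive points of a line."""
    ids = lines[line_id]
    return sum(math.dist(points[a], points[b]) for a, b in zip(ids, ids[1:]))

def lines_through(point_id):
    """List the lines that pass through a given point."""
    return sorted(lid for lid, ids in lines.items() if point_id in ids)
```

With only these two tables the computer can both draw a line (follow its point ids in order) and answer simple questions about it, such as its length or which lines meet at a node.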
In Figure 1 above,
the lower left box titled "DATABASE (relational tables)" represents the
data structure. In it you can see numbered rows and columns with
labels; this is the 'structure' of the data. Some columns have only numbers,
some have only text, and some have both.
Several different types of data structures
can potentially be used to represent the same data model. For example,
you could represent a vector data model using coverages, shapefiles, or
geodatabases. Although these all take the same basic approach in representing
the model, there are still significant differences between them. We will
discuss what these differences are later, but for now keep in mind that
1) data models do not necessarily imply any particular data structures;
and 2) data structures can represent the same data model while still being
very different from one another.
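To see how one data model can sit on top of two quite different data structures, consider the hedged Python sketch below. It stores the same invented polygon twice: once with geometry and attributes kept apart and linked by a shared ID, and once as a single row in a relational table. SQLite and the text geometry are stand-ins chosen for illustration only; they are assumptions, not how ArcGIS actually encodes its files, and every name here is hypothetical.

```python
import sqlite3

# Layout 1: geometry kept apart from attributes, linked by a shared ID.
geometry_store = {101: [(0, 0), (4, 0), (4, 3), (0, 0)]}          # stands in for a geometry file
attribute_table = {101: {"name": "Parcel A", "owner": "Smith"}}   # stands in for an attribute table

def feature_from_split_store(fid):
    """Reassemble one feature by looking up the same ID in both stores."""
    return {"id": fid, "shape": geometry_store[fid], **attribute_table[fid]}

# Layout 2: one relational row holds the geometry and the attributes together.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcels (id INTEGER PRIMARY KEY, name TEXT, owner TEXT, shape TEXT)"
)
conn.execute(
    "INSERT INTO parcels VALUES (?, ?, ?, ?)",
    (101, "Parcel A", "Smith", "POLYGON ((0 0, 4 0, 4 3, 0 0))"),
)

def feature_from_row(fid):
    """Fetch one feature, geometry included, from a single table row."""
    row = conn.execute(
        "SELECT id, name, owner, shape FROM parcels WHERE id = ?", (fid,)
    ).fetchone()
    return dict(zip(("id", "name", "owner", "shape"), row))
```

Both functions return the same parcel with the same attributes; only the way the information is laid out in storage differs.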
Question 2:
a) Which data models can be stored in a Geodatabase?
b) Do CAD data use a different kind of data model or a different type
of data structure?
c) What is the difference between a data structure and a data model?
Data Models, Datasets, and Feature Classes in ArcGIS
In ArcCatalog the geometry and data model
of every dataset is identified by a small picture or icon. This
works much like Windows Explorer, except that only file formats recognized
by ArcCatalog as geographic data will be displayed.
Your life will be made much easier if you
learn ArcCatalog's icons. There are a lot of them, and they can be
confusing at first, so here is the handy
table from Lab #1 that you can refer back to. Below is a display
from ArcCatalog showing how the icons are identified by type.

In the frame on the right, along the top, is a folder tab called 'Contents'
which shows the contents of the folder with the icons symbolizing each data model and
type.
The folders and files that make up shapefiles,
coverages, geodatabase feature classes, rasters, and TINs fall into an
organizational hierarchy (similar but not identical to the Windows folder/file
hierarchy, beware!). This is a VERY different type of hierarchy from the
one we discussed with regard to Figure 2 and
data models. There we addressed the theoretical/conceptual relationships among
the geographic data models, not the "nuts and bolts" of the actual
files. With regard to ArcCatalog we are referring directly to the particular
data model and the specific data structure of each file type or file format.
Figure 3, below, shows this hierarchy of folders, data models, datasets,
and feature classes as displayed in ArcCatalog. Feature classes are the
lowest level that the user accesses.
Figure 3: Icons and hierarchy
Some file format basics:
- Shapefiles: A single
geographic feature type (counties, roads, capitals, etc.) will be
contained in a shapefile, and each shapefile corresponds to a feature
class. The geometric information (stored in hidden binary files) will
be displayed in ArcCatalog's "Preview" and the attribute
information (stored in dBASE tables) will be displayed in the "Table
Preview". This linkage of geometric files to separate attribute
tables is common to shapefiles and coverages and is called a georelational
data model by ESRI.
- Coverages: Multiple
geographic feature types will be contained in a coverage, and each
of these types corresponds to a feature class. The folder that
contains all of these feature classes is the actual coverage.
Within it, the geometric and attribute information (again stored in
hidden binary files) can be displayed using ArcCatalog's "Preview"
and "Table Preview", respectively. Like shapefiles, coverages
employ a georelational data model.
- Geodatabases:
A single geographic feature type corresponds to a feature class, as
with shapefiles. Multiple feature classes can be grouped into a feature
dataset (symbolized as three overlapping grey tiles), which
specifies a common geographic framework for all of its constituent
feature classes. All feature classes
grouped inside a feature dataset must have the same spatial
reference, i.e., projection and coordinate system information. (In Figure
3, for example, the "USA container" contains information about
the USA, capitals, counties, etc.) Unlike shapefiles and coverages,
geodatabases employ a geodatabase data model that stores each
feature, complete with its geometry, as a row in a relational database
table. A number of feature datasets can be stored in a geodatabase.
- It is important to note that
there are two types of personal geodatabases supported in ArcGIS
Desktop. The first is a Microsoft Access database with a 2 GB
storage limit. In many cases, this is sufficient space to
store the data necessary for a small project. With the release
of ArcGIS 9.2, an alternative file-based geodatabase (ESRI calls it a
file geodatabase) became available. It uses the same data structure,
stored in a folder instead of an Access file. Individual datasets in the
new file-based geodatabase have a size limit of 1 terabyte, and
the geodatabase itself has no size limit other than the size of
your hard drive.
- Looking again at Figure
3, you will notice that the geodatabase, the coverages, and the
shapefiles are all contained within the folder named 'Some-Data.'
The little blue symbol on the folder indicates that it contains recognizable
geographic data in the first level beneath 'Some-Data.' In the
context of coverages, this folder is referred to as a workspace.
Remember, you will not see the blue symbol on the folder in
ArcCatalog unless you specifically set this option. If this
option is set, ArcCatalog takes longer to refresh while it searches
every single folder on your system for recognizable data. This
can be time consuming.
- Note: Do NOT use spaces in
file/folder names! Use an underscore ("_") instead. Certain
ArcGIS software components seem to need an "unbroken path"
to function correctly -- if you use spaces, you may run into problems.
This is a holdover from older Arc software even though most Windows-based
software can now handle spaces in names. An example of a
location that contains spaces in the path name is C:\Documents and
Settings\student\Desktop\176B_Lab2. Notice that there is a
space after the word "Documents" and another space after the word
"and".
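If you want to check a path before pointing ArcGIS at it, the rule in the note above is easy to test. The following is a small Python sketch using only the standard library; the function names `parts_with_spaces` and `suggest_name` are hypothetical helpers, not part of any ArcGIS tool.

```python
from pathlib import PureWindowsPath

def parts_with_spaces(path):
    """Return the folder or file names in a Windows path that contain a space."""
    return [part for part in PureWindowsPath(path).parts if " " in part]

def suggest_name(name):
    """Replace spaces with underscores, as the note above recommends."""
    return name.replace(" ", "_")

# The example path from the note, and the lab's working folder.
bad_path = r"C:\Documents and Settings\student\Desktop\176B_Lab2"
good_path = r"C:\Workspace\Lab_2"
```

Calling `parts_with_spaces(bad_path)` picks out the "Documents and Settings" folder, while `parts_with_spaces(good_path)` returns an empty list, so the second location is safe to use.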
Question 3:
a) The fact that you can't use spaces in file or folder names
has to do with what? (the software, data model, data structure, or
something else)
b) What is a "feature class"?
c) What is an ArcGIS "coverage" and how is it different from
a shapefile?
d) What is the main difference between the geodatabase data
model and the other data models?
3.0 Get the data
Open My Computer and go to either
your removable disk or C:\Workspace and create a folder (right-click New
-> Folder) and name it "Lab_2" (remember - do not use spaces
in names, use an "_" or "-" instead).
Right-click lab_2.zip and save it
to the C:\Workspace\Lab_2 folder (or equivalent) you just created.
Open My Computer, go to C:\Workspace\Lab_2, and double-click on the file
you just saved. WinZip will open. Click the Extract button and make sure
you extract the files to C:\Workspace\Lab_2. Alternatively, if WinZip is not
available, you can right-click the folder and choose "Extract Here".
After you successfully extract
the files you can close WinZip. You can view the data in ArcCatalog to
verify that the following files and folders have been extracted.
/mystery -- Contains 8 data layers
of several features in different data models. You will be figuring out
what these are in the lab.
/sb
roads -- Santa Barbara county roads coverage,
clipped to the Goleta-Santa Barbara region
sbdem -- digital elevation model of Santa
Barbara county
sbtin -- TIN derived from sbdem
sbcontour -- Contour coverage derived from
sbdem
cacounties -- counties of California,
from the GDT dataset
The street data we are using in this class
was provided by Tele Atlas.
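For those who prefer to script the extraction rather than use WinZip, here is a minimal Python sketch of the same steps: create the folder, extract the archive, and check that the two top-level folders listed above are present. It assumes that `mystery` and `sb` sit at the top level of the archive, as the listing suggests; the function name `extract_and_check` is made up for this example.

```python
import zipfile
from pathlib import Path

# Top-level folders the lab says the archive should contain (an assumption
# about the archive layout, based on the listing in this section).
EXPECTED = ("mystery", "sb")

def extract_and_check(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return any expected folders that are missing."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return [name for name in EXPECTED if not (dest / name).is_dir()]
```

For example, `extract_and_check(r"C:\Workspace\Lab_2\lab_2.zip", r"C:\Workspace\Lab_2")` should return an empty list if everything extracted correctly. You should still view the data in ArcCatalog afterwards, since this check only confirms that the folders exist.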
4.0 Procedure
4.1 Understanding data models
Question 4:
As you work through the lab, fill out Tables A and B below based on
information from the lab introduction, exercises, course text, and
lecture. If time is short, you may want to leave some of the
tables to fill out outside of section.
Table A.
| | Vector | Raster | TIN |
| Briefly describe the essential characteristics of each. | | | |
| Include the types of data generally represented (i.e. continuous or discontinuous) | | | |
| Give an example of a likely geographic feature that would be represented. | | | |
Table B.
| | Geodatabases | Coverages | Shapefiles |
| Historic Software Origin: | ArcInfo 8 | | |
| How the data is stored in the computer (i.e. does the data need to be in a special type of folder? What files are required for the data model?) | | | No special folder for storage. Three files containing spatial and attribute data are required; there may be other files with index information. |
| In what type of files are the attributes stored? | | INFO files | |
| Describe the topological features in each data model | Allows for topological feature classes, geometric networks. Polygon topology implemented through on-the-fly topological editing. | | |
| What type of data can be created in each data model? | | Points, arcs, lines, linear measurement system, polygon, regions, tics, nodes, annotation | |
ArcGIS Help
ArcGIS Help works like any Windows
program help section. This is THE MOST important resource you will
have for this class and in the future, so read it and learn how to
use it. Go to Menu Bar -> Help -> ArcGIS Help.
When you're looking for something
in ArcGIS Help, make sure to look in both the Index tab and the Search
tab. Trying the search with different terms (e.g., data models, or
coverage, or geodatabase) increases the odds of finding something
useful. support.esri.com is another excellent resource,
as is the GIS Dictionary.
Question 5:
Use ArcGIS Help to look up coverages and answer the following
questions.
a) List the feature classes that a coverage can contain.
b) What is the purpose of an INFO table?
c) What are tic points?
d) What is planar topology?
Use ArcGIS Help to look up shapefiles and answer the following questions.
e) How many feature classes can a shapefile use?
f) Do shapefiles have planar topology?
4.2 Mystery Models
Copy the mystery and sb directories
onto your removable disk or to your Lab_2 folder in C:\Workspace if you
have not done so already. Connect
to this folder in ArcCatalog and examine the layers in the folder mystery.
Question 6:
What are the data models for each of the layers? What geographic feature
does each layer seem to represent? (Be as specific as possible.)
mystery1 --
mystery2 --
mystery3 --
mystery4 --
mystery5 --
mystery6 --
mystery7 --
mystery8 --
Once you have identified the layers and
their data models, convert mystery5 into the same data model
as mystery2. You will have to figure out how to do this yourself,
but here are some hints:
Converting Between Data Models
You will have to use ArcToolbox
to accomplish this task. Recall that you can open ArcToolbox
by clicking on the ArcToolbox button in either ArcCatalog or ArcMap.
Find the toolbox menu that would
contain the appropriate tools, find the appropriate submenu for
converting data in mystery5's data model, and then find the tool that
will let you convert to mystery2's data model. You should be
able to figure out which layer to use as input. Recall that you can
drag-and-drop from ArcCatalog instead of typing or browsing.
Be sure to UNcheck the "Simplify"
option. Use the defaults for everything else.
Specify the output directory, give the
file a name you will remember, and run the conversion.
Take your resulting layer and display it
in ArcMap, along with mystery5.
Question 7:
a) How similar are mystery5 and your converted layer?
b) Briefly describe the major differences between the two. What is
the cause of them?
c) What source data was used to make mystery5?
Go to the directory sb.
Now, add sbcontour, sbdem, and sbtin
into ArcMap. Display just sbcontour and sbtin, and
overlay sbcontour on top of sbtin. To make the display
intelligible, you will have to change the properties for the two layers.
Changing Layer Properties in ArcMap
Make the sbtin layer display on top of the DEM by dragging
layers up or down in the Table of Contents, which contains a list of
the layers in your map. To
change the Properties of a layer in ArcMap, right-click on sbtin
in the legend and go to Properties. Double-clicking on it
will also work.
- A large window with many tabs opens.
- Go to the Display tab.
- Change the transparency of sbtin
so that the DEM raster can be seen underneath it, and click OK.
If you're curious about
making better use of Properties, the main methods are the creation
of Layers in ArcCatalog, and ArcMap's Style Manager, found in the
Menu Bar under Tools -> Styles -> Style Manager.
You will be repeating these steps to change
a layer's properties many times throughout the quarter. You will
probably find the Properties functions very useful. ArcMap's Style Manager
is an easier way to manipulate layer properties that we will learn about
later on in the quarter but feel free to experiment with it.
Question 8:
Which of the three layers (sbdem, sbtin, sbcontour)
do you think was the original data layer? Which is "second generation"
and which is "third generation"? Why do you think this?
4.3 Data Models and ArcToolbox
ArcGIS continues to draw on a variety of
data models and formats for its functionality. The ArcToolbox tools reflect
this, requiring different data types for input depending on the analysis
and data management tasks at hand. This situation can be a little confusing,
but once you gain some exposure to how the tools are categorized and the
patterns of the ways they ask for inputs and function settings, you will
quickly be able to navigate through their use.
To familiarize yourself with ArcToolbox
and the input formats various tools require, find each tool listed below
and figure out what kind of input file(s) it supports (e.g., coverage,
geodatabase feature class, grid, TIN, etc.)
Finding and Examining Tools
- Again, recall that you can open
ArcToolbox by clicking on the ArcToolbox
button in ArcCatalog or ArcMap.
- If you can't find a particular
tool in ArcToolbox, use the Search tab at the bottom of the ArcToolbox
window to search by name or description.
- Important: If you
click on a tool and get an error message referring to a license
problem, you need to switch to ArcMap to be able to open it. Some
tools are unfortunately module-specific and cannot be opened from
every ArcGIS module. In some cases, you may just need to
enable ArcGIS Desktop extensions if they have not been enabled
already. From the Tools drop down menu near the top of
ArcMap or ArcCatalog, choose Extensions and check all of the
boxes except for the Data Interoperability extension. UCSB
has licenses for all of the extensions except for this one.
- Every time you click on a tool
name, a short description displays in the bottom of the Toolbox
window. For more information on a tool, double-click it to open
it and click Show Help near the bottom right of the dialog
box if it is not already showing. You can now click your
mouse in any of the input boxes to read a description of what
the tool is looking for.
- From ArcToolbox, you can also
right-click any tool and choose Help from the context menu to
open up the ArcGIS Desktop help system for that particular tool.
Question 9:
Find each of these tools and determine what data model type(s) (or
perhaps other file types) it takes as input:
a) Clip, Select, Intersect, &
most other Analysis Tools (all the same answer)
b) Viewshed
c) Buffer
d) Add Spatial Index
e) Float to Raster
f) Feature Class to Geodatabase
g) Raster to Other Format
h) Export to Interchange File
i) Join Info Tables
j) Create Labels
4.4 AATs & PATs
As discussed above, coverages have been
the standard vector data model for previous releases of Arc/INFO. With
the release of ArcGIS 8, all of the modules of Arc/INFO (Arc, ArcEdit,
Grid, Tables, ArcPlot, INFO etc) were fully integrated, and the new geodatabase
model was launched and promoted. However, coverages have been in use for
such a long time that you will undoubtedly encounter them.
Recall that coverages employ the georelational
database model. The INFO part of Arc/INFO was the relational database
manager for earlier versions of the Arc software (Arc was the name given
to the mapping component). An INFO file is a table that stores the information
associated with the geographic features of a spatially referenced dataset.
This gives a GIS the ability to manipulate information both spatially
and via standard tabular database functions. In a relational model,
two tables can be related when they share a common column. In a georelational model the
individual records in two or more tables are related through their location
in space. The polygon coverage below serves as a simple example of this
concept. The common column is often called the KEY (or ID) column and
is used to relate the tables and features.
Figure 4. Diagram showing the coverage
data structure for storing vector data.
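The role of the key column can be sketched in a few lines of Python. The two tables, their column names (`ID`, `AREA`, `LANDUSE`), and the `relate` helper below are invented for illustration; they are not taken from the lab data or from INFO itself.

```python
# Feature attribute table: one record per polygon, keyed by an ID column.
polygon_table = [
    {"ID": 1, "AREA": 12.5},
    {"ID": 2, "AREA": 7.0},
    {"ID": 3, "AREA": 3.2},
]

# A second table that shares the same key column.
landuse_table = [
    {"ID": 1, "LANDUSE": "orchard"},
    {"ID": 2, "LANDUSE": "residential"},
]

def relate(left, right, key="ID"):
    """Attach matching right-hand fields to each left-hand record via the key column."""
    lookup = {rec[key]: rec for rec in right}
    return [{**rec, **lookup.get(rec[key], {})} for rec in left]

joined = relate(polygon_table, landuse_table)
```

Polygons 1 and 2 pick up a LANDUSE value because their IDs appear in both tables. Polygon 3 has no match, so its record is returned unchanged.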
Let's explore an attribute table that is
part of the roads coverage. Go to ArcCatalog and Preview
the data.
Previewing Tables
- Below the preview map, locate
the Preview box:

- Change the preview option from
Geography to Table.
- You are now looking at the arc
attribute table (AAT).
Answer the question below.
Question 10:
a) How many records are there?
b) What do FNODE# and TNODE# mean?
c) What other attribute information can you recognize or guess at
in the table (pick 3 columns)?
For a look at polygons and Polygon Attribute
Tables (PATs), open cacounty. Explore the tables for the
arc, polygon, and region.cty coverage feature classes.
Sorting a Column in Table Preview
- Click on any column heading in
the table you wish to sort. This should highlight the column.
- Right-click and choose Sort Ascending
or Sort Descending (as appropriate).
Question 11:
a) How many counties are there in California?
b) Why do the AAT, PAT, and RAT have different numbers of records?
c) Explain the relationship between arc, polygon, and
region.cty in this coverage.
d) What are the label and tic feature classes for?
Hints: To figure out
the answers, you will need to examine several of the tables. In
addition, you might want to use the Identify Tool in the Geography
Preview to query a few of the features. Also read Help.
Map for Lab 2:
Make a map of the greater Santa Barbara metropolitan
region with the roads coverage overlaid on the contour coverage.
You will have to choose appropriate properties for the two themes
so that map readers can tell them apart. Also,
make sure you follow the basic
principles of cartography outlined in Lab 1. Export your
map to a PDF, JPEG, GIF, or BMP file.
5.0 Conclusion
In this lab, you have gained a basic understanding
of geographic data models and data modeling, and the primary data models
used in ESRI's ArcGIS 9.2 software. You have seen how the ESRI data
models are similar to and different from each other, and how each has advantages
and disadvantages for certain purposes. You have gained further
experience with some basic ArcGIS 9 skills, such as changing properties
and using the help functions.
6.0 Additional Reading
Bernhardsen, Tor. Geographic Information Systems: An Introduction. New York: John Wiley & Sons, Inc., 1999, pp. 37-99.
Booth, Bob. Getting Started with ArcInfo. Redlands, CA: ESRI Press, 1999, pp. 45-56.
Minami, Michael, Sakala, Michelle, and Wrightsell, Jennifer. Table: "Comparing the structure of vector datasets." In Using ArcMap. Redlands, CA: ESRI Press, 1999, p. 403.
Zeiler, Michael. Modeling Our World: The ESRI Guide to Geodatabase Design. Redlands, CA: ESRI Press, 1999, pp. 1-199.
Online Sources:
W01 Lecture 1: OVERVIEW OF ARCINFO 8 (Arc8 Data models)
LECTURE2: REPRESENTATION
LECTURE3: STRUCTURE OF ARCINFO 8
AGI dictionary definition of data model
FOLDOC definition of data model
7.0 To turn in
- The question sheet, with typed answers
(Word document)
- One map of Santa Barbara roads and
contours
Created by Sean Benison, Sunhui Sim, and Jordan Hastings
Based on previous lab by Nicholas Matzke, Sarah Battersby and Jeff Hemphill
UC Santa Barbara, Department of Geography
© 2000-2007 Regents of the University of California
This page was last modified on January 21, 2008 by Indy Hurt