Lab 3.1 - GIS Data Models
Time for completion: 2 hours
Outline
1 - Background
2 - Introduction to Geographic Data Models
Figure 1: The modeling process
Figure 2: Hierarchy of ESRI's data models
2.1 - Data Structures vs. Data Models
2.2 - Data Models, Datasets, and Feature Classes
Figure 3: Icons and hierarchy
3 - Understanding data models
3.1 - Get the data models data
3.2 - ArcGIS Help
4 - Mystery Models
5 - Data models and ArcToolbox
6 - Coverage AATs and PATs
Figure 4: The Coverage
Lab 3.1 - Data Models Write-Up
1 - Background
There are three basic spatial data types used with GIS (points, lines, and areas):
*Points represent anything that can be described as a discrete x, y location
*Lines represent anything having a length
*Areas, or polygons, describe anything having boundaries
These data types comprise the vector model, which is the model you will deal with most often in GIS.
Vector data model:
Discrete features, such as customer locations, are usually represented using
the vector model. Features can be discrete locations or events, lines, or areas.
Lines, such as streams or roads, are represented as a series of coordinate pairs.
Areas are defined by borders, and are represented by closed polygons. When you
analyze vector data, much of your analysis involves working with (summarizing)
the attributes in the layer's data table.
Raster data model:
Continuous numeric values, such as elevation, and continuous categories, such
as vegetation types, are represented using the raster model. The raster data
model represents features as a matrix/lattice of cells in continuous space.
A point is one cell, a line is a connected run of cells, and an area is
a group of contiguous (touching) cells.
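To make the raster description concrete, here is a minimal sketch in plain Python (no GIS library). The grid size, the cell codes, and the feature positions are all made up for illustration; a real raster would also carry a cell size and a georeferenced origin.

```python
# Sketch: a point, a line, and an area stored as cells in a small raster grid.
# Grid size and cell codes are arbitrary choices made for this example.
ROWS, COLS = 6, 8
EMPTY, POINT, LINE, AREA = ".", "P", "L", "A"

grid = [[EMPTY for _ in range(COLS)] for _ in range(ROWS)]

# A point is a single cell, addressed by (row, column).
grid[0][6] = POINT

# A line is a connected run of cells; here, six cells along row 2.
for col in range(1, 7):
    grid[2][col] = LINE

# An area is a group of touching cells; here, a 2 x 3 block.
for row in range(4, 6):
    for col in range(0, 3):
        grid[row][col] = AREA


def count_cells(cells, value):
    """Count how many cells in the grid hold the given code."""
    return sum(cell == value for row in cells for cell in row)


# Print the grid one row per line so the three features are visible.
for row in grid:
    print("".join(row))
```

Note that the raster stores a value for every cell, including the empty ones, whereas a vector layer would store only the coordinates of the three features.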
Tabular data:
Tabular data contain information describing a map feature in the form of a table or spreadsheet.
For example, a GIS database of customer locations may be linked to address and
personnel information. GIS links this tabular data to associated spatial data.
| Question: 1. Give an example of how a continuous phenomenon can be represented using the vector data model. |
2 - Introduction to Geographic Data Models
Data Model - An abstraction of the real world which incorporates only those properties thought to be relevant to the application at hand, and which defines specific groups of entities, their attributes, and the relationships between these entities. A data model is independent of a computer system [Association for Geographic Information]. See also: FOLDOC definition of data model.
Any time you wish to deal with geographic data, you must choose a geographic data model by which to do it. The choice of data model will yield benefits by simplifying real-world features enough to deal with them easily, but will also incur costs by oversimplifying or misrepresenting some aspects of those features.
A paper map is an example of an analog data model -- it is a formalized framework that cartographers use to capture and represent information on a sheet of paper. The same sort of thing is also needed to capture and represent geographic information when the medium is digital rather than ink-and-paper. In a GIS, abstractions of real-world features must therefore be formalized into a data model that defines how the computer will represent and manage the geographic information (geometry and attributes).
Bernhardsen (1999) diagrams the data model formalization process along these lines:

Figure 1: The modeling process (after Bernhardsen 1999, p. 39; map graphics from www.gis.com).
Most of the confusion about data models arises from their diversity. Some data
models are more abstract/theoretical while others are made with specific database
types in mind. For example, the vector data model and the raster data model
are very general, whereas the georelational data model and geodatabase data
model are made to fit specific categories of database software. Furthermore,
a given data model may belong to more than one category: a coverage is both
a vector data model (general) and a georelational data model (database specific).
The many types of data models are easier to think
about if one pictures them as being part of a general hierarchy. Below is
a figure showing the hierarchy of ArcGIS's data models:

Figure 2: Hierarchy of ESRI's ArcGIS data models.
The data models go from most general at the top level (vector, raster, TIN)
to most specific at the bottom level (shapefile, coverage, geodatabase). It
is important to note that a geodatabase can handle all three general models,
not just the vector model. Geographic data models have evolved under the influences
of technology (e.g., increasing storage space and processing power, networking,
or software evolution) and even history (e.g., ESRI introduced the "coverage"
data model in 1980).
Every GIS software package will be capable of supporting a number of data models, but will also have its own proprietary format (that none of the others will read). The capabilities of the data models may change with new versions of the software, and compatibility issues may arise between different GIS software, and even between different versions of the same software. Certain functions will be accessible using data in the form of one data model but not another.
2.1 - Data Structures vs. Data Models
The specific format in which the data are stored on the computer is known as the data structure. To illustrate, consider a basic vector data model. The vector model represents features as lines, each of which links together a start node, the vertices in between, and an end node. To draw and analyze features represented this way, the computer needs the location of each node and vertex of each line. This could be provided in the form of a table listing the coordinates of these points and indicating which line(s) pass through them. This table would be the basic data structure. Coverages and shapefiles use this type of structure.
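The table-of-coordinates idea can be sketched in a few lines of plain Python. This is only an illustration of the concept, not the actual file layout of a coverage or shapefile; the table names, field names (from_node, vertices, to_node), and coordinates are invented for the example.

```python
from math import hypot

# Coordinate table: one record per node or vertex, keyed by a point ID.
# IDs and coordinates are hypothetical.
points = {
    1: (0.0, 0.0),
    2: (3.0, 4.0),
    3: (6.0, 4.0),
    4: (6.0, 0.0),
}

# Line table: each line links a start node, in-between vertices, and an end node.
lines = {
    "A": {"from_node": 1, "vertices": [2], "to_node": 3},
    "B": {"from_node": 3, "vertices": [], "to_node": 4},
}


def line_coords(line_id):
    """Return the ordered coordinate pairs that make up one line."""
    line = lines[line_id]
    ids = [line["from_node"], *line["vertices"], line["to_node"]]
    return [points[i] for i in ids]


def line_length(line_id):
    """Sum the straight-line distances between consecutive coordinate pairs."""
    coords = line_coords(line_id)
    return sum(
        hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(coords, coords[1:])
    )


def lines_through(point_id):
    """List the lines that start at, end at, or pass through a given point."""
    return sorted(
        line_id
        for line_id, line in lines.items()
        if point_id in (line["from_node"], *line["vertices"], line["to_node"])
    )
```

With these two tables, drawing a line means looking up its coordinates in order, and a simple analysis such as measuring length needs nothing more than the same lookup. Point 3 is shared by both lines, which is how the structure records that they connect.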
In Figure 1 above, the lower left box titled "DATABASE (relational tables)" represents the data structure. In it you can see numbered rows and columns with labels; this is the 'structure' of the data. Some columns have only numbers, some have only text, and some have both.
Several different types of data structures can be used to represent the same data model. For example, you could represent a vector data model using coverages, shapefiles, or geodatabases. Although these all take the same basic approach to representing the model, there are still significant differences between them. Two points follow: 1) a data model does not necessarily imply any particular data structure; and 2) data structures can represent the same data model while still being very different from one another.
Resources relevant to Data Structures and Data Models: Fundamentals of Data Storage, Information Organization and Data Structure, and Non-spatial Database Models.
| Questions: 2.1 Which data models can be stored in a Geodatabase? 2.2 Do CAD data use a different kind of data structure? |
2.2 - Data Models, Datasets, and Feature Classes
In ArcCatalog the geometry and data model of each item are identified by an icon. Only file formats recognized by ArcCatalog as geographic data will be displayed. The handy table from Lab #1 shows the icons and their associated "Type"; the screenshot below shows ArcCatalog's view of a folder that contains GIS data.

The folders and files that make up shapefiles, coverages, geodatabase feature classes, rasters, and TINs fall into an organizational hierarchy. This is a different type of hierarchy from the one shown in Figure 2 (which shows the theoretical/conceptual relationship of the geographic data model). With ArcCatalog you are looking directly at the particular data model and the specific data structure of each file type or file format. Figure 3, below, shows this hierarchy of folders, data models, datasets, and feature classes as displayed in ArcCatalog. Feature classes are the lowest level that the user accesses.
Figure 3: Icons and hierarchy
Some file format basics:
- Shapefiles: A single geographic feature type (counties, roads, capitals, etc.) will be contained in a shapefile, and each shapefile corresponds to a feature class. The geometric information (stored in hidden binary files) will be displayed in ArcCatalog's "Preview" and the attribute information (stored in dBASE tables) will be displayed in the "Table Preview". This linkage of geometric files to separate attribute tables is common to shapefiles and coverages and is called a georelational data model by ESRI.
- Coverages: Multiple geographic feature types will be contained in a coverage, and each of these types corresponds to a feature class. The folder that contains all of these feature classes is the actual coverage. Within it, the geometric and attribute information (again stored in hidden binary files) can be displayed using ArcCatalog's "Preview" and "Table Preview", respectively. Like shapefiles, coverages employ a georelational data model.
- Geodatabases: A single geographic feature type corresponds to a feature class, as with shapefiles. Multiple feature classes can be grouped into a feature dataset (symbolized as a folder), which specifies a common geographic framework for all its constituent feature classes. (In Figure 3, for example, the "USA container" contains information about the USA, capitals, counties, etc.) Unlike shapefiles and coverages, geodatabases employ a geodatabase data model that stores each feature, complete with its geometry, as a row in a relational database table. A number of feature datasets can be stored in a geodatabase.
- Look at Figure 3. You will notice that the geodatabase, the coverages, and the shapefiles are all contained within the folder named 'Some-Data.' The little blue symbol on the folder indicates that it contains recognizable geographic data in the first level beneath 'Some-Data.' In the context of coverages, this folder is referred to as a workspace.
- Certain ArcGIS software components seem to need an "unbroken path" to function correctly -- if you use spaces in file or folder names, you may run into problems.
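A quick way to check a path for spaces before pointing ArcGIS at it is sketched below in plain Python. The function name and the example paths are hypothetical; the check only reports which folder or file names contain a space and says nothing about why the software objects to them.

```python
from pathlib import PureWindowsPath


def parts_with_spaces(path):
    """Return the folder or file names in a Windows path that contain a space."""
    return [part for part in PureWindowsPath(path).parts if " " in part]


# An "unbroken" path reports nothing; a path with spaces reports each offender.
print(parts_with_spaces(r"C:\Workspace\data_models_lab"))
print(parts_with_spaces(r"C:\My Data\data models lab\roads"))
```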
| Question: 3.1 The fact that you can't use spaces in file names or folders has to do with what? (the software, data model, data structure, or something else) 3.2 What is a "feature class"? What is a "feature type"? (hint: use the Help) 3.3 What is an ArcGIS "coverage" and how is it different from a shapefile? 3.4 What is the main difference between the geodatabase data model and the other data models? |
3 - Understanding data models
3.1 - Get the data models data
Go to C:\Workspace and create a folder called "data_models_lab"
Right-click and save data models data.zip to C:\Workspace\data_models_lab (or equivalent). Using Windows Explorer, find the .zip file and extract it. The data_models_lab folder should now contain:
/mystery -- folder contains 8 data layers of several features using different data models. You will be figuring out what these are.
/sb
roads -- Santa Barbara county roads coverage, clipped to the Goleta-Santa Barbara region
sbdem -- digital elevation model of Santa Barbara county
sbtin -- TIN derived from sbdem
sbcontour -- Contour coverage derived from sbdem
cacounties -- counties of California, from the GDT dataset
The street data we are using in this class was provided by GDT
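If you prefer to script the folder creation and extraction rather than use Windows Explorer, the following is a minimal sketch using only the Python standard library. The function name is hypothetical, and the archive's actual file name on disk is an assumption (the paths in the usage comment simply mirror the ones in the steps above).

```python
from pathlib import Path
import zipfile


def extract_lab_data(zip_path, dest_dir):
    """Create dest_dir if needed, extract the archive into it, and
    return the names of the top-level items now in dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return sorted(item.name for item in dest.iterdir())


# Example call (paths are assumptions based on the steps above):
# extract_lab_data(r"C:\Workspace\data_models_lab\data models data.zip",
#                  r"C:\Workspace\data_models_lab")
```

If the .zip file itself sits inside the destination folder, it will appear in the returned list alongside the extracted mystery and sb folders.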
| Question: 4. Fill out Tables A and B below based on information from the lab introduction, exercises, course text, and lecture. Although it would be easiest to do this now, as you've just read about it, you can do this later. |
Table A.
| | Vector | Raster | TIN |
| Briefly describe the essential characteristics. | | | |
| Include the types of data generally represented (i.e. continuous or discontinuous) | | | |
| Give an example of a likely geographic feature that would be represented. | | | |
Table B.
| Historic Software Origin: | ArcInfo8 | | |
| How the data is stored in the computer (i.e. does the data need to be in a special folder? What files are required for the data model?) | No special folder for storage. Three files containing spatial and attribute data are required; there may be other files with index information | | |
| In what type of files are the attributes stored? | INFO files (tables) | | |
| Describe the topological features in each data model | Allows for topological feature classes, geometric networks. Polygon topology implemented through on-the-fly topological editing | | |
| What type of data can be created in each data model? | Points, arcs, linear/aerial measurements, polygon, regions, tics, nodes, annotation | | |
3.2 - ArcGIS Help
ArcGIS Help works like any Windows program help section. This is THE MOST important resource you will have; read it and learn how to use it.
Go to Menu Bar -> Help -> ArcGIS Help.

When you're looking for something in ArcGIS Help, make sure to search in both the Index tab and the Search tab. Trying the search with different terms (e.g., data models, or coverage, or geodatabase) increases the odds of finding something useful. ArcOnline is an excellent resource as well.

| Question: 5.1 Use ArcGIS Help to find "coverages, described" to answer the following questions. a) List the feature classes that a coverage can contain. b) What is the purpose of an INFO table? c) What are tic points? d) What is planar topology? 5.2 Use ArcGIS Help to find "shapefiles, described" to answer the following questions. |
4 - Mystery Models
Using ArcCatalog, connect to the mystery folder you extracted into the data_models_lab folder.
Look in the mystery7 folder. This is your hassle of the day: figure out what GIS data format this is and import it (hint: it's a DEM).
| Question: 6. What are the data models for each of the layers? What geographic feature does each layer seem to represent? (be as specific as possible) mystery1 -- |
Once you have identified the layers and their data models, convert mystery5 into the same data model as mystery2. You will have to figure out how to do this yourself. Find the toolbox menu that would contain the appropriate tools, find the appropriate submenu for converting data in mystery5's data model, and find the tool that will let you convert to mystery2's data model. You should be able to figure out which layer to use as input.
After you get it converted, display the converted layer in ArcMap, along with mystery5.
| Question: 7.1 How similar are mystery5 and your converted layer? 7.2 Briefly describe the major differences between the two. 7.3 What source data was used to make mystery5? |
Start a new empty ArcMap Layout. Go to the directory (folder) sb and add sbcontour, sbtin, and sbdem (in that order, with sbdem on the bottom) to the empty ArcMap layout. Add the imported DEM from the mystery7 folder so it is on top of sbdem. To make the display intelligible you will have to change the Display properties for the layers; change the transparency of sbtin so that the DEM rasters can be seen underneath it.

| Question: 8. Which of the three layers (sbdem, sbtin, sbcontour) do you think was the original data layer? Which is "second generation" and which is "third generation"? Why do you think this? |
5 - Data Models and ArcToolbox
Coverages were the data structure used in ArcInfo7 to store vector data. Many of the ArcToolbox tools use a wizard to interactively let you specify the parameters of commands, which are then used to create and execute a series of command line functions in the background. Many of the tools only support coverages, although some of the newer tools are designed for geodatabases.
To familiarize yourself with ArcToolbox and the input formats required, find each tool listed below and figure out what kind of input file(s) it supports (e.g., coverage, geodatabase feature class, grid, TIN, etc.)
| Question: 9. Find each of these tools and what data model type(s) (or perhaps other file types) it takes as input: a) Clip, Select, Intersect, Buffer, & most other Analysis Tools (all the same answer) |
6 - Coverage AATs and PATs
As discussed above, coverages have been the standard vector data model for previous releases of Arc/INFO. With the release of ArcGIS8, all of the modules of Arc/INFO (Arc, ArcEdit, Grid, Tables, ArcPlot, INFO, etc.) were integrated, and the new geodatabase model was launched and promoted. However, coverages are still fairly common, so we need to know something about their structure.
Recall that coverages employ the georelational
database model. The INFO part of Arc/INFO was the relational database manager
for earlier versions of the Arc software (Arc was the name given to the mapping
component). An INFO file is a table that stores the information associated with
the geographic features of a spatially referenced dataset. This gives a GIS
the ability to manipulate information both spatially and via standard tabular
database functions. In a relational model, two tables are related when they share
a common column. In a georelational model the individual records in two or more tables
are related through their location in space. The polygon coverage below serves
as a simple example of this concept. The common column is often called the KEY
(or ID) column and is used to relate the tables and features.
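The idea of relating two tables through a shared KEY column can be sketched in plain Python. The table contents and column names below (POLY_ID, AREA, LANDUSE) are invented for the example and are not the actual fields of an INFO table.

```python
# Feature attribute table: one record per polygon (hypothetical values).
polygon_table = [
    {"POLY_ID": 1, "AREA": 12.5},
    {"POLY_ID": 2, "AREA": 7.0},
    {"POLY_ID": 3, "AREA": 3.2},
]

# A separate table of descriptive attributes that shares the POLY_ID column.
landuse_table = [
    {"POLY_ID": 1, "LANDUSE": "forest"},
    {"POLY_ID": 2, "LANDUSE": "urban"},
]


def relate(left, right, key):
    """Attach to each left record the fields of the right record
    that has the same value in the key column, if one exists."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]


joined = relate(polygon_table, landuse_table, "POLY_ID")
for record in joined:
    print(record)
```

Polygon 3 has no matching record in the second table, so it keeps only its own fields; the key column is what makes the match (or the lack of one) unambiguous.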

Figure 4: Diagram showing the coverage data structure for storing vector data.
Let's explore an attribute table that is part of the roads coverage. Go to ArcCatalog and preview the data (change the preview option in ArcCatalog from Geography to Table).
| Question: 10.1 How many records are there? 10.2 What do FNODE# and TNODE# mean? 10.3 What other attribute information can you recognize or guess at in the table (pick 3 columns)? |
For a look at polygons and Polygon Attribute Tables (PATs), open cacounty.
| Question: 11.1 How many counties are there in California? 11.2 Why do the AAT, PAT, and RAT have different numbers of records? 11.3 Explain the relationship between arc, polygon, and region.cty in this coverage. 11.4 What are the label and tic feature classes for? Hints: To figure out the answers, you will need to examine several of the tables. In addition, you might want to use the Identify Tool in the Geography Preview. Read the Help. |
Lab 3.1 - Data Models Write-up
The End