Calibrating a cellular automaton model of urban growth in a timely manner.

GIS/EM4

Jeannette Candau

Abstract

Previous research developed a model of urban growth using a self-modifying cellular automaton (CA) that can be calibrated with historical data layers as controls. The growth model is calibrated by predicting the present urban extent from the past. Calibration seeks to identify the best initial values of the CA coefficients for a given data set. Although highly effective, the chosen method of calibration requires numerous data layers and is computationally intensive, possibly taking weeks of CPU time to complete a calibration for a study region. In previous model applications, it has been assumed that the more historical data layers used, the more accurate the calibration. This research challenges that assumption and questions the importance of temporal intervals in the calibration data. Multiple calibrations were completed by varying the number, temporal interval, and temporal extent of the urban calibration data. The effectiveness of long-term versus short-term data for short-term prediction is also tested. Lessons learned from these calibrations are of use not only for future applications of this model, but also in the more general context of calibrating urban models.

Keywords

Urban growth modeling, dynamic systems modeling, computational modeling, cellular automata, Urban Dynamics, calibration, urban growth, complex systems.

Introduction

Cellular automata have proven a useful tool for modeling dynamic urban systems (White and Engelen 1994; Papini and Rabino 1997), yet literature regarding the calibration of these models remains sparse. Mathian et al. (2000) point out that various techniques for modeling dynamic spatial processes exist, but that methods of calibration have lagged behind in development. Our previous research developed a model of urban growth using a self-modifying cellular automaton (CA) that can be calibrated with historical data layers as controls (Clarke 1996). Clarke et al. (1996) described a methodology for rigorous calibration of this urban growth model (UGM) by way of visualization, as well as robust testing of a set of alternative coefficient solutions. In previous UGM applications it has been assumed that the more historical data layers used, the better the calibration. This research, sponsored by the U.S. Geological Survey (USGS) Urban Dynamics project, challenges that assumption and questions the importance of temporal sensitivity of calibration data.

Although highly effective, this method of calibration requires numerous data layers and is computationally intensive, possibly taking weeks of CPU time for an entire calibration. In this study, the temporal sensitivity of the urban growth model was tested in an effort to discover a relation between urban model calibration and the number and spacing of input-data time steps. The effectiveness of long-term versus short-term data for short-term prediction was also tested. Lessons learned from these calibration tests may be of use not only for future applications of the CA model, but also in the more general context of calibrating urban models.

The urban growth model

The UGM is a complex systems model: beginning at the earliest data time period, growth commences from an initial set of conditions for the study region. The data-input component of these conditions consists of urban extent at different time periods, a transportation network, topographic slope, area that is resistant to urbanization (referred to as "excluded"), and a hillshade layer that is used for visualization purposes. Five coefficients (dispersion, called diffusion in previous literature; breed; spread; slope resistance; and road gravity) govern the behavioral element of the initial conditions (Clarke et al. 1997).

During a model run, one growth cycle is equivalent to 1 year. A growth cycle consists of four phases of growth. The first phase is spontaneous neighborhood growth (figure 1), which simulates the ability of urban settlements to develop anywhere on a landscape without any specific dependencies on existing infrastructure. The second phase, new spreading center creation (figure 2), simulates the tendency of some percentage of the new urban settlements to attract continued growth, while others remain isolated. Growth from urban edges and growth from urban infilling are the most common types of newly urban land. Such growth is represented in the third phase, organic growth (figure 3). The tendency of urbanization to be drawn toward and along lines of transportation is represented by the fourth phase, road-influenced growth (figure 4). The five behavior-rule coefficients affect how these growth phases are executed. At the completion of a growth cycle, the emergent rate of growth of the system as a whole is measured. If a critically high or critically low growth rate occurs, the coefficients may be adjusted (Clarke et al. 1997).
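As a rough illustration of how a growth cycle chains its phases, the following Python sketch implements simplified stand-ins for two of the four phases (spontaneous growth and organic growth) on a binary grid. It is not the UGM source code: the function names, the neighbour threshold `min_neighbours`, and the way `dispersion` and `spread` are turned into counts and probabilities are assumptions made for this example, and the spreading-center and road-influenced phases, slope resistance, and self-modification are omitted.

```python
import random


def urban_neighbours(grid, r, c):
    """Count urban cells (value 1) among the eight neighbours of (r, c)."""
    rows, cols = len(grid), len(grid[0])
    count = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0) and 0 <= r + dr < rows and 0 <= c + dc < cols:
                count += grid[r + dr][c + dc]
    return count


def spontaneous_growth(grid, excluded, dispersion, rng):
    """Simplified phase 1: try to urbanize `dispersion` randomly chosen cells."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for _ in range(dispersion):
        r, c = rng.randrange(rows), rng.randrange(cols)
        if not excluded[r][c]:
            new[r][c] = 1
    return new


def organic_growth(grid, excluded, spread, rng, min_neighbours=2):
    """Simplified phase 3: grow outward from urban edges and fill in gaps."""
    new = [row[:] for row in grid]
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell or excluded[r][c]:
                continue
            enough = urban_neighbours(grid, r, c) >= min_neighbours
            # `spread` is treated as a percentage chance (an assumption).
            if enough and rng.random() * 100 < spread:
                new[r][c] = 1
    return new


def growth_cycle(grid, excluded, dispersion, spread, rng):
    """One cycle (one model year) of the two simplified phases."""
    grid = spontaneous_growth(grid, excluded, dispersion, rng)
    return organic_growth(grid, excluded, spread, rng)
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which is convenient when the same coefficient set has to be compared against several control years.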

Santa Barbara study area

The Santa Barbara study area is located in coastal southern California about ninety miles northwest of Los Angeles. It extends from Rincon Point westward to Elwood. It is bounded by the Santa Ynez Mountains to the north and the Pacific Ocean to the south. The nature of the area, in which "the mountains meet the sea", has encouraged human development to occur laterally between steep mountain slopes and coastal bluffs and beaches. Cities have grown on the more gently sloping areas between the two, along Highway 101, which runs centrally through the area. A large portion of the study area is part of the Los Padres National Forest and is protected from urban development. Historically, the limited availability of water put a cap on urban development. However, in the last decade Santa Barbara County communities passed a measure to bring state water to the area. This action has made the issue of water use in this semi-arid environment seemingly irrelevant. The proximity of Los Angeles and Ventura is putting new and increased pressure for urban growth on the Santa Barbara area. Both commuter residents and industries seeking to avoid the congestion of Los Angeles and the San Francisco Bay Area are considering the Santa Barbara area as a viable option. As the Los Angeles megalopolis continues to grow, its edge seems ever closer to the once isolated Santa Barbara.

A temporal database was created for the study region using ESRI ArcView and ArcInfo Geographic Information Systems (GIS). The GIS was used to store, edit, and process historical data. Final grid data types created were: urban extent, roads, slope, excluded areas where urbanization cannot occur, and a hillshaded background that was used for visualization. Full size grid resolution was 1751 columns by 428 rows of 30 m pixels representing approximately 675 km2.

Urban data were collected for the years 1929, 1943, 1954, 1967, 1976, 1986, and 1997. Air photography flown in the area during the years indicated above was used as source data for urban extent. The photos were scanned and registered. All recognizable buildings were digitized in ArcView. These digitized coverages of structures were then converted to 30-m binary grids and clipped to the study area extent. The resulting grids were classified as urban (where built structures existed) or non-urban.

U.S. Census Bureau TIGER road data representing the year 1997 were downloaded via anonymous ftp from the census server (http://www.census.gov/geo/www/tiger/). Primary roads were selected and clipped to the study area extent. The 1997 road coverage served as a base map for creating historical road layers. Hard copy road maps produced by the American Automobile Association were used as ancillary data. For each historical map, the roads classified as "primary roads" or of greater importance were selected and all others were deleted. In this way, registered transportation coverages for the years 1929, 1942, 1952, 1967, 1975, and 1986 were created. These were then converted to 30-m binary grids.

USGS 30-m digital elevation models (DEMs) of the Santa Barbara area were brought into ArcInfo, merged together and clipped to the study area. The elevation data were then transformed into percent slope values. The hillshade was derived from the same clipped DEM used for slope. This layer forms the backdrop for animated sequences of urban growth. All parks and government lands were selected from the Santa Barbara County parcel map and were excluded from modeled urban growth. These include the Los Padres National Forest and smaller, block size city parks. Additionally, the Pacific Ocean and a small portion of Ventura County that was included in the study area extent were removed from possible urbanization.
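The conversion from elevation to percent slope can be illustrated with a short Python sketch. It uses simple finite differences on a grid of elevations in metres; this is an assumption made for illustration and is not necessarily the algorithm ArcInfo applied. The function name `percent_slope` is hypothetical.

```python
import math


def percent_slope(dem, cell_size_m):
    """Percent slope per cell: 100 * rise / run.

    `dem` is a list of rows of elevations in metres and needs at least
    two rows and two columns. Interior cells use central differences,
    edge cells use one-sided differences.
    """
    rows, cols = len(dem), len(dem[0])
    slope = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            dz_dy = (dem[r1][c] - dem[r0][c]) / ((r1 - r0) * cell_size_m)
            dz_dx = (dem[r][c1] - dem[r][c0]) / ((c1 - c0) * cell_size_m)
            slope[r][c] = 100.0 * math.hypot(dz_dx, dz_dy)
    return slope


# Made-up surface rising 3 m per 30-m cell toward the east: a 10 percent slope.
dem = [[0, 3, 6], [0, 3, 6], [0, 3, 6]]
print(percent_slope(dem, 30)[1][1])
```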

Rescaling Data

Calibration of the scale-independent UGM occurs in three phases and requires all input images to be rescaled to half and one quarter of their original resolution. A special C program, halfgif, was used to perform this task. In addition to creating half size image output, halfgif buffers the road data to twice its original size before the resolution is changed. This buffering helps preserve the linear continuity of the transportation network through the resizing process.
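The value of buffering roads before resizing can be seen in a small Python sketch. The source does not describe how halfgif resamples, so the rules used here (widening roads by one pixel in the four cardinal directions, then keeping the top-left pixel of each 2 x 2 block) are assumptions, and `buffer_roads` and `halve` are hypothetical names, not halfgif functions.

```python
def buffer_roads(roads):
    """Widen roads by one pixel in the four cardinal directions."""
    rows, cols = len(roads), len(roads[0])
    out = [row[:] for row in roads]
    for r in range(rows):
        for c in range(cols):
            if roads[r][c]:
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = 1
    return out


def halve(grid):
    """Keep the top-left pixel of every 2 x 2 block.

    A trailing odd row or column is dropped, so 1751 columns become 875.
    """
    rows, cols = len(grid) // 2, len(grid[0]) // 2
    return [[grid[2 * r][2 * c] for c in range(cols)] for r in range(rows)]


# A one-pixel-wide north-south road in column 1 of a 4 x 6 grid.
roads = [[0, 1, 0, 0, 0, 0] for _ in range(4)]
print(halve(roads))                # road lost: [[0, 0, 0], [0, 0, 0]]
print(halve(buffer_roads(roads)))  # road kept: [[1, 1, 0], [1, 1, 0]]
```

Without the buffer, a thin road that falls between sampled pixels disappears from the half-size grid; with it, the road survives as a continuous line.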

Rescaling created two new versions of the Santa Barbara study area dataset.

	Full Resolution	Half Resolution	Quarter Resolution
Calibration Phase	Final	Fine	Coarse
Row x Col	428 x 1751	214 x 875	107 x 437
Pixel Count	749,428	187,250	46,759
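The dimensions and pixel counts in the table can be reproduced with integer (floor) division of the full-resolution grid size. That halving drops a trailing odd row or column is an assumption, but it matches every value in the table; `rescaled_shape` is a hypothetical helper.

```python
FULL_ROWS, FULL_COLS = 428, 1751  # full-resolution grid from the text


def rescaled_shape(rows, cols, factor):
    """Rows and columns after shrinking by an integer factor (floor division)."""
    return rows // factor, cols // factor


for phase, factor in (("final", 1), ("fine", 2), ("coarse", 4)):
    rows, cols = rescaled_shape(FULL_ROWS, FULL_COLS, factor)
    print(f"{phase:>6}: {rows} x {cols} = {rows * cols:,} pixels")
```

The coarse grid holds roughly one sixteenth of the pixels of the full grid, so a coefficient combination is correspondingly cheaper to evaluate in the coarse calibration phase.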

Temporal Input Testing

The growth model was calibrated by predicting the present urban extent from the past, using historical data layers to test how well a given set of coefficient values represents the data. Six measures were emphasized through all phases of calibration (Clarke 1997). The first (compare) is a final-year comparison of the total number of urban pixels. For checks against all historical data, the square of the Pearson product-moment correlation coefficient (r2) was used as a measure of fit between modeled and observed values. Four measures are of this type: a score of modeled number of urban pixels compared to actual urban count (population_r2), a score of modeled urban edge count compared to actual urban edge count (edge_r2), a score of modeled urban clustering compared to known urban clustering (cluster_r2), and a score of modeled average urban cluster size compared to known mean urban cluster size (mean_cluster_size_r2). The sixth measure (lee_sallee) is a shape index measurement of spatial fit between the model's growth and the known urban extent.
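Three of these measures can be sketched in a few lines of Python. The sketch makes several assumptions: the compare score is taken as the ratio of the smaller to the larger final-year pixel count, the Lee-Sallee index is taken as the ratio of the intersection to the union of modeled and observed urban cells, and the function names and all numbers are invented for illustration. It is not the UGM implementation.

```python
def compare_score(modeled_final, observed_final):
    """Ratio of the smaller to the larger final-year urban pixel count."""
    low, high = sorted((modeled_final, observed_final))
    return low / high


def r_squared(modeled, observed):
    """Squared Pearson correlation between two equally long series."""
    n = len(modeled)
    mean_m = sum(modeled) / n
    mean_o = sum(observed) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(modeled, observed))
    var_m = sum((m - mean_m) ** 2 for m in modeled)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_m * var_o)


def lee_sallee(modeled_grid, observed_grid):
    """Urban intersection divided by urban union of two binary grids."""
    inter = union = 0
    for m_row, o_row in zip(modeled_grid, observed_grid):
        for m, o in zip(m_row, o_row):
            inter += 1 if (m and o) else 0
            union += 1 if (m or o) else 0
    return inter / union if union else 1.0


# Made-up urban pixel counts at four control years.
modeled = [120, 260, 410, 640]
observed = [100, 250, 430, 600]
print(round(r_squared(modeled, observed), 3))
print(compare_score(modeled[-1], observed[-1]))
print(lee_sallee([[1, 1], [0, 0]], [[1, 0], [1, 0]]))
```

The same `r_squared` function would serve for the edge, cluster, and mean cluster size scores once those quantities have been counted for each control year.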

To derive a statistical best fit requires a minimum of four urban data layers. A total of seven control years were gathered to calibrate the study area of Santa Barbara, California. Multiple calibrations were completed by varying the number, temporal interval, and temporal extent of the urban data. The first calibration (Cal1) used all seven data layers: 1929, 1943, 1954, 1967, 1976, 1986, and 1997. The second calibration (Cal2) ranged over the same time period, but applied a minimal number of control years with maximum temporal intervals: 1929, 1954, 1976, and 1997. The third calibration (Cal3) used only the four most recent data layers: 1967, 1976, 1986, and 1997. In this way three different data input scenarios were used to calibrate UGM for Santa Barbara.
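The three scenarios differ in number of layers, temporal extent, and spacing, which the following Python sketch tabulates from the control years listed above. The dictionary and function names are illustrative only.

```python
MIN_LAYERS = 4  # minimum number of urban layers for a statistical best fit

CONTROL_YEARS = {
    "Cal1": [1929, 1943, 1954, 1967, 1976, 1986, 1997],
    "Cal2": [1929, 1954, 1976, 1997],
    "Cal3": [1967, 1976, 1986, 1997],
}


def summarize(years):
    """Layer count, temporal extent, and intervals between control years."""
    intervals = [later - earlier for earlier, later in zip(years, years[1:])]
    return {
        "layers": len(years),
        "extent_years": years[-1] - years[0],
        "intervals": intervals,
    }


for name, years in CONTROL_YEARS.items():
    info = summarize(years)
    assert info["layers"] >= MIN_LAYERS
    print(name, info)
```

Cal1 and Cal2 both span 68 years, with intervals of 9 to 14 years and 21 to 25 years respectively, while Cal3 spans only the most recent 30 years at intervals of 9 to 11 years.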

Discussion/Conclusion

Results from this work will be presented at the Geographic Information Science and Environmental Modeling Conference in Banff, Alberta.

References used

Clarke KC, Hoppen S, Gaydos L. 1997. A self-modifying cellular automaton model of historical urbanization in the San Francisco Bay area. Environment and Planning B 24: 247-261.

Clarke KC, Hoppen S, Gaydos L. 1996. Methods and techniques for rigorous calibration of a cellular automaton model of urban growth. Third International Conference/Workshop on Integrating GIS and Environmental Modeling; 1996 Jan 21-25; Santa Fe, New Mexico.

Papini L, Rabino G. 1997. Urban cellular automata: an evolutionary prototype. In: Bandini S, Mauri G, editors. ACRI '96. Proceedings of the second conference on cellular automata for research and industry; 1996 Oct 16-18; Milan, Italy. Berlin: Springer. p 147-157.

White R, Engelen G. 1993. Cellular automata and fractal urban form: a cellular modeling approach to the evolution of urban landuse patterns. Environment and Planning A 25: 1175-1199.

Author

Jeannette Candau, Physical Scientist, U.S. Geological Survey; Graduate Student, Department of Geography, University of California, Santa Barbara, 3611 Ellison Hall, Santa Barbara, CA 93106. Email: jcandau@usgs.gov, Tel: +1-805-893-5178, Fax: +1-805-893-5178.