Rainfall Interpolation for Santa Barbara County


An original GIS project by Rachel T. Nguyen, Dylan Prentiss, and Jennifer E. Shively at the UCSB Department of Geography. This project was developed in concert with Geography 176C, under the instruction of Violet Gray, during the Spring Quarter 1998. This presentation was given on June 4, 1998.



Overview


This analysis compared two interpolation methods for predicting rainfall in Santa Barbara County proper, i.e., not including the Channel Islands. The methods initially chosen were multiple regression (via map algebra) and kriging (via a GIS package). Kriging requires analysis of a semi-variogram to determine which function should be fitted to the data. After a semi-variogram was produced, no definite form could be recognized, and it was concluded that kriging would be unsuitable for these data. At this point, Inverse Distance Weighting (IDW) was chosen for comparison with multiple regression, since it requires no such preliminary model fitting.

In addition, two separate rainfall seasons were selected for both interpolation methods. An "El Niño" season (1982-83) and an anomalously dry season (1989-90) were chosen for contrast. This allows for two initial hypotheses:

1. Multiple Regression analysis is a better predictor of rainfall than IDW.

i.e., the relative residual error term (X) for multiple regression is less than that for IDW, where:

X = e/(observed),

and,

e = (observed-expected).


2. Rainfall interpolation is generally better for wet seasons than for dry seasons.

i.e., the relative residual error term (X), for 1982-83, is less than that for 1989-90.
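
The error term used in both hypotheses can be sketched in a few lines of Python. This is only an illustration: the function names and the rainfall values are made up, and the source does not say whether signed or absolute residuals were averaged, so taking the absolute value before averaging is an assumption.

```python
def relative_residual(observed, expected):
    """Return X = e / observed, where e = observed - expected."""
    if observed == 0:
        raise ValueError("relative residual is undefined when observed rainfall is zero")
    return (observed - expected) / observed


def mean_abs_relative_residual(observed_values, expected_values):
    """Average |X| over paired observed and predicted values (assumed convention)."""
    residuals = [
        abs(relative_residual(o, e))
        for o, e in zip(observed_values, expected_values)
    ]
    return sum(residuals) / len(residuals)


# Made-up rainfall totals in inches, for illustration only.
observed = [20.0, 10.0, 40.0]
predicted = [18.0, 12.0, 40.0]
print(mean_abs_relative_residual(observed, predicted))
```

Under this convention, a smaller average means the interpolated values sit closer to the gauge readings, which is the comparison both hypotheses rely on.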



Data Description and Collection


Rainfall data was collected from the Santa Barbara Flood Control District for both seasons, 1982-83 and 1989-90.

Rainfall was reported in inches.

Gauge station locations were reported by latitude and longitude, to the thousandth of a degree.






Method of Data Analysis



Data Selection

A total of 60 rainfall gauge sites were selected throughout the county. These sites reported rainfall data for the two seasons in question.

For simplicity, this analysis assumes that precipitation at every point in Santa Barbara County is a function of three factors:
  1. Distance from the west-facing coast
  2. Distance from the south-facing coast
  3. Elevation
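
With these three predictors, a multiple regression model takes the form rain = b0 + b1*(west distance) + b2*(south distance) + b3*(elevation). The sketch below fits such a model by ordinary least squares in plain Python. It is an illustration only: the function names, units (km, m, inches), station values, and coefficients are all hypothetical, and the project itself used a spreadsheet and GIS map algebra rather than this code.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; the system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_rainfall_regression(stations):
    """Least-squares fit of rain = b0 + b1*west + b2*south + b3*elev.

    `stations` is a list of (west_km, south_km, elev_m, rain_in) tuples.
    """
    rows = [[1.0, w, s, z] for w, s, z, _ in stations]
    rain = [r for _, _, _, r in stations]
    p = 4
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(rows, rain)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_rainfall(coeffs, west_km, south_km, elev_m):
    """Evaluate the fitted regression equation at one location."""
    b0, b1, b2, b3 = coeffs
    return b0 + b1 * west_km + b2 * south_km + b3 * elev_m


# Hypothetical coefficients and sites, used only to generate example data.
true_coeffs = (10.0, 0.1, -0.2, 0.01)
sites = [(5, 2, 100), (20, 10, 300), (40, 5, 50),
         (10, 30, 800), (60, 25, 400), (35, 15, 1200)]
stations = [(w, s, z, predict_rainfall(true_coeffs, w, s, z)) for w, s, z in sites]

coeffs = fit_rainfall_regression(stations)
print([round(c, 4) for c in coeffs])
```

Because the example rainfall is generated from an exact linear rule, the fit recovers the generating coefficients; real gauge data would leave residuals at every station.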




Semi-Variogram for Kriging


Here, gamma (the semi-variance) is half the average squared difference in rainfall between pairs of sample points at a given separation distance. No easily recognizable shape appears in the curve for either season, so neither a punctual nor a universal form would be suitable for kriging with these data. This is probably due to large variations in rainfall between sample points, which may stem from extreme elevation differences between the points or from measurement error in the rainfall data itself. Thus, it was concluded that kriging should not be performed on these data sets.
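
For readers unfamiliar with the semi-variogram, the following Python sketch shows one common way to compute an empirical one: pairs of points are grouped into distance bins, and gamma for each bin is the sum of squared value differences divided by twice the number of pairs. The function name, the binning scheme, and the sample points are assumptions for illustration; the source does not describe how its semi-variogram was binned or which software produced it.

```python
import math
from collections import defaultdict


def empirical_semivariogram(points, lag_width):
    """Return {lag_bin_index: gamma}, with gamma = sum((zi - zj)^2) / (2 * N).

    `points` is a list of (x, y, value) tuples in projected units;
    bin k covers separation distances in [k * lag_width, (k + 1) * lag_width).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            distance = math.hypot(xi - xj, yi - yj)
            bin_index = int(distance // lag_width)
            sums[bin_index] += (zi - zj) ** 2
            counts[bin_index] += 1
    return {b: sums[b] / (2 * counts[b]) for b in sorted(counts)}


# Three made-up points on a line, with rainfall rising to the east.
sample = [(0.0, 0.0, 10.0), (1.0, 0.0, 12.0), (2.0, 0.0, 16.0)]
print(empirical_semivariogram(sample, lag_width=1.5))
```

A semi-variogram suited to kriging typically rises with distance and then levels off; a scatter of gamma values with no such trend is the situation described above.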



Data, Grid and Coverage Preparation

Data Model: Vector
Data Model: Raster



Data Analysis

Multiple Regression

Cross-validation of the multiple regression model is straightforward and easily carried out in a spreadsheet program. The distances of a gauge station from the west and south coasts, and the elevation of the station, are entered into the regression equation for the appropriate season. The returned value is the predicted rainfall at that point.
Regression grids for each of the seasons are shown below. (The pound signs represent one of the sub-sampled gauges, Santa Barbara Sewage Station.)






Inverse Distance Weighting (IDW)

Cross-validation of the IDW model is more complicated and time-consuming, but necessary. One of the 60 sample points is removed, and an IDW interpolation of the remaining 59 station points is calculated. The removed point's predicted rainfall is then read from the interpolated grid cell at that location. Ideally, this would be repeated for every one of the 60 points, for both seasons, resulting in 120 individual grids. However, due to limitations in computer time and storage, a random sub-sample of 20 of the 60 points was analyzed. Thus, a total of 40 different IDW interpolations were produced. Regression values for the same 20 points were also calculated. A second-order, radial IDW was performed with a radius of 25.0 km.
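
The leave-one-out procedure described above can be sketched in Python as follows. Several things here are assumptions rather than the project's actual method: "second-order" is read as a distance exponent of 2, the 25.0 km radius is treated as a hard search cutoff, station coordinates are taken to be in projected kilometres (the source reports latitude and longitude), and the estimate is computed directly at the removed station instead of being read from an interpolated grid cell. The function names and station values are hypothetical.

```python
import math


def idw_estimate(target_xy, stations, power=2.0, radius_km=25.0):
    """Inverse-distance-weighted estimate at target_xy from (x, y, rain) stations."""
    tx, ty = target_xy
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, rain in stations:
        distance = math.hypot(x - tx, y - ty)
        if distance == 0.0:
            return rain  # target coincides with a station
        if distance <= radius_km:
            weight = 1.0 / distance ** power
            weighted_sum += weight * rain
            weight_total += weight
    if weight_total == 0.0:
        return None  # no station inside the search radius
    return weighted_sum / weight_total


def leave_one_out(stations, power=2.0, radius_km=25.0):
    """Predict each station from all the others; return (observed, predicted) pairs."""
    results = []
    for i, (x, y, rain) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        results.append((rain, idw_estimate((x, y), others, power, radius_km)))
    return results


# Three made-up stations 10 km apart along a line (coordinates in km, rain in inches).
example_stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (20.0, 0.0, 30.0)]
for observed, predicted in leave_one_out(example_stations):
    print(observed, predicted)
```

Each (observed, predicted) pair can then be turned into a relative residual. Running the loop over a random sub-sample of stations instead of all of them mirrors the 20-point shortcut taken in the project.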

General IDW grids for each season are shown below. NOTE: These are NOT the exact grids from which values were derived for the 20 points. Each individual interpolated grid will appear slightly different.





Results of Data Analysis


Statistical analyses were performed by looking at the values returned for each of the 20 sub-sample stations (for each season).

For these stations, the relative residuals were calculated. The average relative residual (X) was then derived for each of the four cases (two methods, two seasons):



It can be seen that the average relative residual error term, X, for multiple regression is less than that for IDW for each corresponding season:

X_MR < X_IDW

And that X for 1982-83 is less than that for 1989-90 for each interpolation method:

X_1982-83 < X_1989-90




Conclusion

A reliable predictor returns a relative residual near zero for each input. Multiplied by 100, an X value can also be read as the percentage error relative to the observed value.

Therefore, these data, analyses, and procedures suggest that:

The multiple regression method appears to interpolate rainfall values better than the Inverse Distance Weighting method (with the specified parameters); and

Spatial interpolation of rainfall in Santa Barbara County proper appears to work better for significantly wetter winter seasons than significantly drier ones.


