Spatial Smoothing
Locally weighted regression smoothing ("loess") in two spatial dimensions is analogous to linear regression on a single independent variable. In single-variable linear regression, the best-fitting straight line is found by minimizing the sum of squared residuals. This produces a curve of known functional form, y = a + bx, which is linear in the parameters a and b. Loess, by contrast, computes the fitted values by local regression: each fitted value comes from a weighted regression on nearby observations. The fitted values form a surface z = a + f(latitude, longitude), where z is a transformation of y. Unlike the regression line, the fitted surface cannot be described by a known functional form; that is, it is a non-parametric surface.
The transformation applied to the loss costs is the logit, z = log(y/(1-y)). Since the MPCI loss costs are bounded between 0 and 1 and tend to lie near the low end of that range, this transformation produces a less skewed dependent variable, which improves the quality of the fit. In addition, because the inverse transformation y = 1/(1 + exp(-z)) maps any fitted z to a value between 0 and 1, the fitted loss costs are guaranteed to be non-negative (and less than 1).
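To make the two steps concrete, the following is a minimal sketch, not the procedure actually used for the results described here. It assumes a local linear fit with tricube weights, a hypothetical `span` parameter giving the fraction of locations in each neighbourhood, and plain Euclidean distance in latitude/longitude degrees. The function names `logit`, `inv_logit`, and `loess_2d` are illustrative only.

```python
import numpy as np

def logit(y):
    """Transform a loss cost in (0, 1) to z = log(y / (1 - y))."""
    y = np.asarray(y, dtype=float)
    return np.log(y / (1.0 - y))

def inv_logit(z):
    """Back-transform z to the loss-cost scale; the result lies in (0, 1)."""
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))

def loess_2d(lat, lon, z, span=0.5):
    """Local linear fit at each location (assumed variant, tricube weights).

    span is a hypothetical smoothing parameter: the fraction of locations
    used in each local regression.
    """
    lat, lon, z = (np.asarray(a, dtype=float) for a in (lat, lon, z))
    n = len(z)
    k = min(n, max(4, int(np.ceil(span * n))))  # neighbours per local fit
    fitted = np.empty(n)
    for i in range(n):
        # Assumption: Euclidean distance in degrees, not great-circle distance.
        d = np.hypot(lat - lat[i], lon - lon[i])
        idx = np.argsort(d)[:k]
        h = max(d[idx].max(), 1e-12)            # neighbourhood radius
        w = (1.0 - (d[idx] / h) ** 3) ** 3      # tricube weights
        # Design matrix centred on location i, so the intercept is the fit there.
        X = np.column_stack([np.ones(k), lat[idx] - lat[i], lon[idx] - lon[i]])
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(X * sw[:, None], z[idx] * sw, rcond=None)
        fitted[i] = beta[0]
    return fitted

def smooth_loss_costs(lat, lon, y, span=0.5):
    """Smooth on the logit scale, then return to the loss-cost scale."""
    return inv_logit(loess_2d(lat, lon, logit(y), span=span))
```

Because the smoothing is done on z and the result is passed back through `inv_logit`, every smoothed loss cost falls strictly between 0 and 1 regardless of how the local fits behave.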
The data used in the loess procedure is a single value for each location. In this example, the data is the average yield for Iowa corn by county over the period from 1981 through 1997. For ratemaking, the average loss costs over the experience period would be used instead. Using the average for each county removes all intertemporal correlation from the analysis. Even though the experience in adjacent counties may be correlated over time, this correlation is not considered essential in estimating the expected values. Instead, spatial smoothing is concerned only with the spatial correlation of the data. The underlying concept is that the yields or loss costs change smoothly over space, and that knowledge of the yields in nearby counties provides redundant information which can be used to produce a better estimate of the expected value of the variable in each county.
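The reduction to one value per county can be sketched as follows. This is an illustration under assumed inputs: the record layout `(county, year, value)` and the function name `county_means` are hypothetical, and the numbers in the usage lines are made-up placeholders, not actual Iowa yields.

```python
from collections import defaultdict

def county_means(records):
    """Average the yearly values for each county.

    records: iterable of (county, year, value) tuples (assumed layout).
    Returns a dict mapping county to its mean over all years present.
    """
    totals = defaultdict(lambda: [0.0, 0])  # county -> [running sum, count]
    for county, _year, value in records:
        totals[county][0] += value
        totals[county][1] += 1
    return {county: total / count for county, (total, count) in totals.items()}

# Placeholder records for two hypothetical counties "A" and "B".
sample = [("A", 1981, 100.0), ("A", 1982, 120.0), ("B", 1981, 90.0)]
averages = county_means(sample)
```

The year is read but deliberately discarded, which is how the time dimension drops out and only the spatial pattern remains for the smoother.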