SpAM

SpAM (Spatial Analysis and Methods) presents short articles on the use of spatial statistical techniques for housing or urban development research. Through this department of Cityscape, the Office of Policy Development and Research introduces readers to the use of emerging spatial data analysis methods or techniques for measuring geographic relationships in research data. Researchers increasingly use these new techniques to enhance their understanding of urban patterns but often do not have access to short demonstration articles for applied guidance. If you have an idea for an article of no more than 3,000 words presenting an applied spatial data analysis method or technique, please send a one-paragraph abstract to ronald.e.wilson@hud.gov for review.

Changing Geographic Units and the Analytical Consequences: An Example of Simpson’s Paradox

Ron Wilson
U.S. Department of Housing and Urban Development
University of Maryland, Baltimore County

The views expressed in this article are those of the author and do not represent the official positions or policies of the Office of Policy Development and Research or the U.S. Department of Housing and Urban Development.

Foreclosures and Crime

The rapidly degrading housing market of the mid-2000s caused local governments to be concerned about the multitude of problems foreclosures could wreak on their jurisdictions (Wilson and Paulsen, 2008). One concern was the escalation of crime and disorder in neighborhoods with concentrated foreclosures. Several researchers who examined the relationship between foreclosure and crime had conflicting results (Arnio and Baumer, 2012; Arnio, Baumer, and Wolff, 2012; Baumer, Wolff, and Arnio, 2012; Cui, 2010; Ellen, Lacoe, and Sharygin, 2011; Goodstein and Lee, 2010; Immergluck and Smith, 2006; Jones and Pridemore, 2012; Katz, Wallace, and Hedberg, 2011; Kirk and Hyra, 2012; Stucky, Ottensmann, and Payton, 2012; Wallace, Hedberg, and Katz, 2012).

The assortment of geographic units used in these studies is extensive, consisting of property locations, block faces, census block groups, census tracts, customized local geographies, grid cells, cities, counties, and metropolitan statistical areas. The variety of factors, constructs, and variables the researchers used in these studies certainly contributed to their conflicting results, but the range of geographies likely played a role in the outcome differences, because the underlying data were aggregated to different geographic scales.

Cityscape: A Journal of Policy Development and Research • Volume 15, Number 2 • 2013
U.S. Department of Housing and Urban Development • Office of Policy Development and Research

Conflicting results that stem from the use of different geographic units of analysis are common in social science research (Coulton et al., 2001; Hipp, 2007; Macintyre, Ellaway, and Cummins, 2002; Rengert and Lockwood, 2009; Taylor, 2012). None of the cited studies, though, included tests of the foreclosure and crime relationship with multiple geographic units to gauge the effect on results. I illustrate in this article how changing geographic units can produce converse results with an example of foreclosure and crime modeling drawn from Wilson and Behlendorf (2013). I also use several spatial analysis techniques to identify which geographic unit is best for modeling foreclosures and crime in the Wilson and Behlendorf (2013) example.

The central finding from Wilson and Behlendorf (2013) was that the rate of foreclosures had a positive and significant association with crime increases in 2006 and 2007, but results differed between geographic units. The full output for the two geographies is shown in exhibits 1 (tracts) and 2 (block groups), but I focus on the residential instability factor and the spatial lag variable for the remainder of this analysis.

Crimes of (1) violence, (2) property, (3) residential burglary, and (4) minor property damage.

Crime data were supplied by the Charlotte-Mecklenburg Police Department; followed the Uniform Crime Report, or UCR, classifications; and were geocoded to the specific street of occurrence.

The Department of Geography and Earth Sciences at the University of North Carolina at Charlotte provided parcel data.

I identified foreclosed properties where the title transfer date indicated bank repossession ending in involuntary vacancy.

The residential stability coefficients changed dramatically between tracts and block groups. Residential stability represents the level of social connections between neighborhood residents. Stable neighborhoods have a constancy of residents who remain in their homes over long periods of time and they know, trust, like, and communicate with their neighbors. Residential stability degrades when residents leave and new ones move into a neighborhood—that is, turnover—and social bonds are broken. Crime can increase if residential turnover is frequent, because social connections are strained and neighbors do not trust or know each other (Garcia, Taylor, and Lawton, 2007; Shaw and McKay, 1942). The residential stability factor was constructed as a scale centered on 0 and includes the percentage of (1) residents who are 5 years of age and older who lived in the same house 5 years earlier, (2) owner-occupied homes, and (3) single-family and multifamily housing units.

The scale was reverse coded to represent instability with positive numbers and stability with negative numbers.

Exhibit 1 shows that residential instability is significantly associated with all crime constructs in both 2006 and 2007 for block groups, but only for violence with tracts (exhibit 2). Not only did the signs change from negative to positive between geographies, but also the coefficients remained statistically significant with large effects. With tracts, the interpretation is that crime decreased as residential instability increased, but, with block groups, the converse was true in that crime increased in less residentially stable neighborhoods; the latter scenario is theoretically expected.

This sign switching between geographies is indicative of local spatial patterns being lost with the use of larger geographic units—the significance change of the spatial lag coefficient between the two geographies highlights this point.

The spatial lag variable represents measures of similarity and dissimilarity with nearby geographic units for a foreclosure contagion effect. The significance level of the spatial lag variable means a spatial effect is present in the relationship and should be modeled. Ignoring the spatial effect can bias parameter estimates and significance levels (Anselin et al., 2000), because existence of a spatial effect is an artifact of the measured relationships. The spatial lag coefficient is significant for block groups for most crime constructs across both years, but it is not significant for tracts. Tract results suggest no spatial contagion effect exists for foreclosures in relation to crime. This finding indicates the inability of census tracts to capture an existing spatial relationship between foreclosures and crime.
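The idea of a spatial lag can be illustrated with a minimal sketch. It assumes row-standardized contiguity weights, so that each unit's lag is simply the average value of its neighbors; the weights specification actually used in Wilson and Behlendorf (2013) is not described here, and the unit names, neighbor lists, and foreclosure rates below are hypothetical.

```python
def spatial_lag(values, neighbors):
    """Return the row-standardized spatial lag of each unit.

    values:    dict mapping unit id -> observed value (e.g., foreclosure rate)
    neighbors: dict mapping unit id -> list of adjacent unit ids
    """
    lag = {}
    for unit, nbrs in neighbors.items():
        # Equal weights that sum to 1 make the lag a plain neighbor average.
        lag[unit] = sum(values[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
    return lag


# Hypothetical foreclosure rates for four made-up units (not study data).
foreclosure_rate = {"A": 2.0, "B": 4.0, "C": 6.0, "D": 0.0}
# Hypothetical contiguity: A touches B and C; C also touches D.
neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A", "D"], "D": ["C"]}

lag = spatial_lag(foreclosure_rate, neighbors)
print(lag)
```

A unit with a high value surrounded by a high lag suggests local clustering, which is the kind of pattern a contagion effect would produce.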

Conflicting Results and Simpson’s Paradox

Coefficient sign reversals, especially when they remain statistically significant, can indicate model misspecification. Sign reversals can also occur when different geographic units are used, however, because the change alters data distribution patterns. In this phenomenon, known as Simpson’s Paradox, repartitioning the underlying data from smaller to larger geographic units can cancel out or reverse patterns present in the smaller units. The paradox is a consequence of the modifiable areal unit problem (MAUP),⁴ in which statistical results are affected by modifications to the geographic unit’s boundary size and/or shape. Aggregated data are uniquely partitioned by their geography and, when geographic units are changed, the data are repartitioned according to the new boundary sizes and shapes. Exhibit 3 depicts how Simpson’s Paradox occurs between census block groups and tracts for residential stability and 2006 violent crime counts for Charlotte and Mecklenburg County.

⁴ For an in-depth technical discussion of MAUP, see Openshaw (1994).

Exhibit 3 shows Simpson’s Paradox: the trend lines in the two scatter plots are the inverse of each other, and the data clouds are practically mirror images. Exhibit 3a shows the block group data pattern, which is interpreted as follows: when residential instability increases, crime also increases. The opposite pattern occurs for tracts (exhibit 3b): when residential instability increases, crime decreases.
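The reversal can be reproduced with a small numerical sketch. The nine "block groups" and three "tracts" below are invented for illustration and are not the Charlotte data; the sketch also assumes that tract values are simple means of their block groups, whereas the study used crime counts.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical data: (tract id, residential instability, violent crime).
block_groups = [
    ("T1", -2.5, 5), ("T1", -0.5, 11), ("T1", 1.5, 17),
    ("T2", -2.0, 4), ("T2", 0.0, 10), ("T2", 2.0, 16),
    ("T3", -1.5, 3), ("T3", 0.5, 9), ("T3", 2.5, 15),
]

# Trend across the small units.
slope_block_groups = ols_slope(
    [x for _, x, _ in block_groups], [y for _, _, y in block_groups]
)

# Repartition the same observations into the larger units.
by_tract = {}
for tract, x, y in block_groups:
    by_tract.setdefault(tract, []).append((x, y))
tract_x = [mean(x for x, _ in pts) for pts in by_tract.values()]
tract_y = [mean(y for _, y in pts) for pts in by_tract.values()]
slope_tracts = ols_slope(tract_x, tract_y)

print(f"block groups: {slope_block_groups:+.2f}, tracts: {slope_tracts:+.2f}")
```

In this toy example the strong positive relationship within each tract dominates at the block group level, but averaging removes that within-tract variation and leaves only the weak negative relationship between tract means.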

To show Simpson’s Paradox geographically, I converted the violent crime and residential instability values into z-scores to categorize their relationship as similar⁵ and dissimilar.⁶ Exhibits 4 (block groups) and 5 (tracts) exemplify Simpson’s Paradox more visually by displaying the similar and dissimilar categories. Stark geographic pattern changes occur for the violent crime and residential instability relationship between the geographic units.
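The categorization can be sketched as follows, assuming population z-scores and treating a z-score of exactly zero as dissimilar; both choices are assumptions, and the input values are hypothetical rather than study data.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def categorize(crime, instability):
    """Label each unit 'similar' when both z-scores share a sign."""
    labels = []
    for zc, zi in zip(z_scores(crime), z_scores(instability)):
        # A positive product means both are above or both are below average.
        labels.append("similar" if zc * zi > 0 else "dissimilar")
    return labels


# Hypothetical values for four geographic units.
violent_crime = [2, 4, 6, 8]
residential_instability = [-1.0, 1.0, -0.5, 0.5]

labels = categorize(violent_crime, residential_instability)
print(labels)
```

Because z-scores are computed relative to the mean of whichever set of units is in use, the same neighborhood can change category when the units change from block groups to tracts.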

The exhibits show several areas across the county that change from similar levels of violent crime and residential instability to dissimilar levels when switching from block groups to tracts. For example, the Providence and Independence Divisions (patrol divisions of the Charlotte-Mecklenburg Police Department jurisdiction), south of the city center, show nearly complete reversal patterns. The Providence Division primarily contained low levels of violent crime with high residential instability at the block group level, indicating dissimilarity between the two variables. With census tracts, the violent crime and residential instability relationship changed from dissimilarity to similarity. In the Independence Division, the opposite was true, in that a similarity existed between high and low levels of violent crime and residential instability within block groups, but changed to a dissimilar relationship within tracts.

⁵ Violent crime and residential instability values both have positive or both have negative z-scores.

⁶ Violent crime and residential instability values have conflicting positive and negative z-scores.

Exhibit 6 shows the violent crime and residential instability scatter plots for the Independence and Providence Divisions. The scatter plots show the same overall trends of violence and residential instability as exhibit 3 shows for the county, but they help explain the pattern changes between exhibits 4 and 5. For example, 30 of the 43 block groups (about 70 percent) within the Providence Division have high residential instability with low violent crime, but 12 of the 15 tracts (80 percent) fall into the low violence and low residential instability category. The similarity and dissimilarity categories were altered as the x and y axes in the scatter plots shifted significantly to contain different observations. Exhibits 4 and 5 also show that category changes significantly alter the trends across the county.

Simpson’s Paradox prompts a dilemma in deciding which geographic unit to use for further analysis. Theoretical or expected results could guide the selection of geography, but they may not solve the paradox. In the next section, I demonstrate how to identify which geographic unit is more appropriate for measuring the relationship of foreclosures and crime in Wilson and Behlendorf (2013).

Examining Local Data To Identify the Spatial Extent of Foreclosures

An important aspect in Wilson and Behlendorf (2013) was the inclusion of the spatial lag (autocorrelation) measure of foreclosures to test for a contagion effect. The spatial lag coefficient in exhibits 1 and 2 was significant for block groups, but not for tracts. If a spatial contagion effect exists among foreclosures, then it is important to identify which geographic unit would best capture the effect, because crimes related to those properties would occur at a similar scale. I used several spatial analysis techniques to measure foreclosure concentration and to determine which geographic unit would be better to model with crime-related factors.

I first conducted a nearest neighbor analysis on foreclosed parcels from 2003 to 2008⁷ to obtain the average distance between the properties. The nearest neighbor index was 0.3835 (z = -136.92), which indicates a strong clustering pattern. The average distance between foreclosed properties is 264.3 feet, with a standard distance of 374.8 feet. These two results indicate that foreclosures were very close to each other and often on the same or adjacent streets.

⁷ I used date ranges beyond our focus years to ensure capture of the long-term distribution patterns and reduce the likelihood of any anomalous cluster patterns that might have occurred at the peak of the housing crisis.
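A nearest neighbor index of this kind can be sketched as follows. The sketch assumes the common Clark-Evans formulation, in which the observed mean nearest neighbor distance is divided by the distance expected under complete spatial randomness, with no edge correction; the software and settings used for the reported figures are not stated, and the points and study area below are hypothetical. If the reported index is this observed-to-expected ratio, 264.3 / 0.3835 implies an expected random spacing of roughly 689 feet.

```python
import math


def nearest_neighbor_index(points, area):
    """Return (index, z-score, observed mean distance) for 2-D points.

    points: list of (x, y) coordinates in a planar unit such as feet
    area:   size of the study area in squared units (must be supplied)
    """
    n = len(points)
    # Brute-force distance from each point to its closest other point.
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    observed = sum(nearest) / n
    # Expected mean distance for a random pattern of the same density.
    expected = 0.5 / math.sqrt(n / area)
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed / expected, (observed - expected) / std_error, observed


# Hypothetical parcels: two tight clusters in a 100 x 100 study area.
parcels = [(10, 10), (10, 11), (11, 10), (11, 11),
           (90, 90), (90, 91), (91, 90), (91, 91)]

nni, z, mean_distance = nearest_neighbor_index(parcels, area=100 * 100)
print(f"index = {nni:.4f}, z = {z:.2f}, mean distance = {mean_distance:.1f}")
```

An index below 1 with a negative z-score indicates clustering, an index near 1 indicates a random pattern, and an index above 1 indicates dispersion. The result is sensitive to the study area supplied, so the area should match the region in which parcels could plausibly occur.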

