«Urban Problems and Spatial Methods, Volume 17, Number 1 • 2015, U.S. Department of Housing and Urban Development | Office of Policy Development and ...»
This general principle can be easily extended to the use of other ancillary datasets besides land use/cover, including road density data (Reibel and Bufalino, 2005), zoning- and parcel-level data (Maantay, Maroko, and Herrmann, 2007), address point data (Zandbergen, 2011), unclassified remotely sensed spectral values (Holt, Lo, and Hodler, 2004), and scanned raster reference maps (Langford, 2007). In addition, as equation 1 indicates, the technique can accommodate not only two ancillary classes with DRc values of zero (uninhabited) and one (inhabited) but also multiple classes with values ranging continuously between zero and one, so that population can be allocated in a more complex manner than by simply distinguishing between inhabited and uninhabited regions.
Demonstration
To demonstrate dasymetric mapping for small-area estimation of urban population, the technique is used to estimate population for the city of Philadelphia at sub-tract spatial units from population data mapped at the census tract level. Tract-level data and dasymetric mapping-level data are then compared in an analysis of the population at risk of air pollution to illustrate how dasymetric mapping can be used for urban analysis. Population data at the census block level are used as a validation dataset against which the analytical results of the tract- and dasymetric-based measurements of population exposure are compared. Note that this analysis is intended strictly to illustrate the dasymetric approach and is not intended to address issues of environmental exposure to air pollutants in Philadelphia, which would require a more thorough analysis.
Data and Implementation
Total population data derived from the 2010 Census at the tract and block levels were acquired from the Census Bureau American FactFinder website. These data include 384 tracts (18,872 blocks) with a total population of 1,526,006 (exhibit 2). Ancillary data related to population distribution in Philadelphia are necessary to facilitate the dasymetric mapping. For this purpose, zoning data for Philadelphia were acquired from the City of Philadelphia. These polygon data encode allowable land uses and building restrictions, coded as the following zoning classes: high-density residential, low-density residential, commercial/residential mixed use, commercial nonresidential, industrial nonresidential, parks and other related nonresidential land uses, and nonresidential transportation infrastructure. These classes were aggregated into three categories: low-density residential, high-density residential (including commercial/residential mixed use), and nonresidential (exhibit 3).
Exhibit 2 Map of Population Density by Tract in Philadelphia
Exhibit 3 Map of Zoning Classes Used As Ancillary Data in the Dasymetric Mapping: Nonresidential, High-Density Residential, and Low-Density Residential
GIS software was used to process the data, perform the dasymetric estimation, and implement equations 1 and 2. The tract layer and the ancillary zoning layer were intersected to produce a target layer composed of 30,271 polygons. Each term in equation 1 was then calculated and stored in a field in the target layer attribute table. The area ratio (ARf) was calculated as the ratio of the target zone area to the area of its host source zone (tract). The density ratio for the spatial ancillary zoning class data (DRc) was set to values of 0.30 for low-density residential areas, 0.35 for both the high-density residential areas and the commercial and residential mixed-use areas (typically downtown apartment buildings with stores located on the first floor), and 0.0 for the other nonresidential zoning classes. Although these values are acknowledged as being somewhat arbitrary, they reflect the exclusion of population from nonresidential areas of the city and also the greater concentration of population in high- versus low-density residential zoning classes.
The product ARf × DRc was calculated and encoded in another field in the target layer attribute table. Using a summarize function in the GIS, the sum of the ARf × DRc values of all target layer polygons was calculated for each individual tract, and these sums were joined onto the target layer attribute table. The TFfc value (that is, the ratio of each target polygon’s ARf × DRc to the sum of all ARf × DRc values in that target polygon’s host tract) was then calculated and stored in another field in the target layer’s attribute table. Finally, the population value for each target polygon was calculated according to equation 2 by multiplying the host tract population by the TFfc for each target polygon.
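The field calculations described above can be sketched outside a GIS as follows. This is a minimal illustration in Python, not the scripts used for the article. The function name `dasymetric_allocate`, the dictionary keys, and the zoning labels are hypothetical. The sketch assumes that the target polygons of a tract together cover the whole tract, so that tract area can be taken as the sum of its polygon areas, and it assumes that a tract with no residentially zoned polygons receives no allocated population, a case the text does not address.

```python
from collections import defaultdict

# Density ratios (DRc) by aggregated zoning class, using the values in the text.
# The class labels themselves are hypothetical.
DENSITY_RATIO = {"low_res": 0.30, "high_res": 0.35, "nonres": 0.0}


def dasymetric_allocate(polygons, tract_pop):
    """Estimate a population for each target polygon.

    polygons:  list of dicts with keys 'tract', 'zone', and 'area'
    tract_pop: dict mapping tract id to total tract population
    Returns a list of population estimates in the same order as polygons.
    """
    # Tract area, assumed here to equal the sum of its target polygon areas.
    tract_area = defaultdict(float)
    for p in polygons:
        tract_area[p["tract"]] += p["area"]

    # ARf * DRc for each target polygon.
    weights = [
        (p["area"] / tract_area[p["tract"]]) * DENSITY_RATIO[p["zone"]]
        for p in polygons
    ]

    # Sum of ARf * DRc over all target polygons in each tract.
    weight_sum = defaultdict(float)
    for p, w in zip(polygons, weights):
        weight_sum[p["tract"]] += w

    estimates = []
    for p, w in zip(polygons, weights):
        total = weight_sum[p["tract"]]
        # TFfc; assumption: zero when the tract has no residential polygons.
        tf = w / total if total > 0 else 0.0
        estimates.append(tract_pop[p["tract"]] * tf)
    return estimates


# Hypothetical tract "A" with 1,000 residents split into three target polygons.
polygons = [
    {"tract": "A", "zone": "low_res", "area": 2.0},
    {"tract": "A", "zone": "high_res", "area": 1.0},
    {"tract": "A", "zone": "nonres", "area": 1.0},
]
estimates = dasymetric_allocate(polygons, {"A": 1000})
for p, est in zip(polygons, estimates):
    print(p["zone"], round(est, 1))
```

In this hypothetical tract, the low-density polygon receives about 632 people and the high-density polygon about 368, because the low-density polygon is twice as large while the density ratios differ only slightly. The nonresidential polygon receives none, and the estimates sum to the tract total.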
The resulting dasymetric map of population density is shown in exhibit 4. The population distribution is depicted with far greater spatial precision in the dasymetric map than in the tract-level map. The tract-level map has only three small tracts that have no residential population, whereas the zoning map clearly shows large areas of the city that are nonresidential. The dasymetric mapping excluded population from the areas zoned nonresidential and allocated the tract population to the residentially zoned areas. Areas zoned high-density residential and commercial/residential mixed use were allocated a greater share of population than low-density residential zoned land, after accounting for differences in the area of the target polygons.
Exhibit 4 Dasymetric Map of Population Density in Philadelphia
Calculating the Population Located Near Air Polluting Facilities
To illustrate the utility of the dasymetric mapping, a relatively simple analysis is conducted of the population located near facilities releasing pollutants to the air. Data on the locations of facilities releasing or disposing of more than 10,000 pounds of hazardous air pollutants on site in 2010 were acquired from the U.S. Environmental Protection Agency’s Toxics Release Inventory Program. Six facilities were mapped (exhibit 3), and population counts were tallied within a series of distances from the facilities, using the tract, block, and dasymetric population maps. First, the total populations of the tracts, blocks, and dasymetric polygons that contained the six facilities were calculated. Then, those tracts, blocks, and dasymetric polygons within 0.25 kilometer of a facility were selected, those within 0.5 kilometer of a facility were selected, and so on up to 1.0 kilometer from a facility.
Two prominent GIS-based methods of selecting polygons within a certain distance of a set of point features were employed (Mennis, 2003). The first method, referred to as the intersect method, selects all those polygons that overlap the distance buffer. So, for example, a tract for which any portion of the tract falls within the specified distance of a facility would be selected as being within that distance of that facility. The second method, called the centroid method, selects only those polygons whose geometric center falls within the buffer distance.
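The difference between the two selection rules can be sketched in plain Python. This is an illustrative sketch only. The function names are hypothetical, polygons are assumed to be simple (non-self-intersecting) rings of planar coordinates in kilometers, and the buffer is assumed to be a circle around a point facility. A GIS would normally provide these operations directly.

```python
import math


def _edges(poly):
    """Yield consecutive vertex pairs of a polygon ring, closing the ring."""
    return zip(poly, poly[1:] + poly[:1])


def centroid(poly):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    a = cx = cy = 0.0
    for (x0, y0), (x1, y1) in _edges(poly):
        cross = x0 * y1 - x1 * y0
        a += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * a), cy / (3.0 * a)


def point_in_polygon(pt, poly):
    """Ray-casting test: True if pt lies inside the polygon."""
    x, y = pt
    inside = False
    for (x0, y0), (x1, y1) in _edges(poly):
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def dist_to_segment(pt, a, b):
    """Shortest distance from pt to the line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = pt, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    t = 0.0
    if seg_len2 > 0:
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def selected_by_intersect(poly, facility, radius):
    """Intersect method: any part of the polygon falls within the buffer."""
    if point_in_polygon(facility, poly):
        return True
    return any(dist_to_segment(facility, a, b) <= radius for a, b in _edges(poly))


def selected_by_centroid(poly, facility, radius):
    """Centroid method: the polygon's geometric center falls within the buffer."""
    cx, cy = centroid(poly)
    return math.hypot(cx - facility[0], cy - facility[1]) <= radius


# Hypothetical facility and polygons (coordinates in kilometers).
facility = (0.0, 0.0)
large_tract = [(0.4, -1.0), (2.4, -1.0), (2.4, 1.0), (0.4, 1.0)]
small_polygon = [(0.1, 0.1), (0.3, 0.1), (0.3, 0.3), (0.1, 0.3)]
for name, poly in [("large tract", large_tract), ("small polygon", small_polygon)]:
    print(name,
          selected_by_intersect(poly, facility, 0.5),
          selected_by_centroid(poly, facility, 0.5))
```

In this hypothetical layout, the large tract is selected by the intersect method because its western edge lies 0.4 kilometer from the facility, but not by the centroid method because its center lies 1.4 kilometers away. The small polygon is selected by both methods.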
Exhibit 5 shows two graphs indicating the differences among the tract-level, block-level, and dasymetric-level population calculations, with total population shown on the y-axis for each measured distance from each facility. The graph at the top shows the results for the intersect method of polygon selection and the bottom graph shows the results using the centroid method.
For both methods of selection, the tract-based calculations clearly tend to overestimate the total population nearby as compared with the dasymetric-based calculations. For the intersect method of selection, the difference between the tract data and the dasymetric data increases with increasing distance. At a distance of 1 kilometer, the tract-level population estimate is nearly three times the dasymetric-level estimate. This pattern occurs because the number of tracts under consideration increases substantially as the distance from a facility increases. Because the tract data are of much coarser resolution than the dasymetric data, the area considered within any given distance of a facility using the intersect method is much larger with the tract data than with the dasymetric data.
For the centroid method, the maximum difference between the tract and dasymetric datasets is observed at distances nearest to the facilities. At a distance of 0.25 kilometer, the tract-level population estimate is nearly 10 times the dasymetric-level estimate. As the distance increases, the population totals estimated from the different datasets tend to converge, at a distance of approximately 1 kilometer. The reason for this can be observed in exhibit 6, which shows a closeup view of the facility in the bolded box in exhibit 3, where the bolded circle and cross-hatch pattern show the area within 0.5 kilometer of the facility. The tract data on the left indicate that the facility lies nearly at the intersection of three separate tracts; thus, the calculation of the population at risk, using data derived from the host tract, is likely to be inherently inaccurate, because air pollutants would likely spread across tract boundaries. The dasymetric data on the right show far greater spatial variation in population distribution, and it is clear that most of the area within 0.5 kilometer of the facility is nonresidential. Indeed, the entire population within the 0.5-kilometer distance is concentrated in the southern portion of the buffer.
Exhibit 6 A Visual Comparison of the Tract (left) and Dasymetric (right) Population Data Within 0.5 Kilometer of the Air Polluting Facility Shown in the Box in Exhibit 3
Importantly, for both the intersect and centroid selection methods, the graph lines for the block-level data closely mirror those of the dasymetric data. Indeed, the graphs make it visually clear that the tract-level data for both methods substantially overestimate the population within a given distance of a facility, whereas the dasymetric data provide a far more accurate depiction.
Conclusion
This article outlines the general principles of dasymetric mapping and offers a demonstration of its efficacy in urban population analyses. The use of coarse resolution population data relative to the scale of analysis, or the use of population data aggregated to spatial units unrelated to the actual distribution of population, can result in inaccurate assessments of urban population exposure and accessibility. Dasymetric mapping offers substantial potential for improving estimates of population exposure and accessibility by estimating population at a much finer scale through the integration of ancillary data that are often publicly available. In addition, the basic principles of dasymetric mapping are relatively easy to implement in many commercial and open-source GIS software packages.
More sophisticated approaches to dasymetric mapping that rely on regression, kriging, and iterative algorithms have also been developed (for example, Leyk, Nagle, and Buttenfield, 2013; Liu, Kyriakidis, and Goodchild, 2008). Research suggests, however, that, although the accuracy of dasymetric mapping is dependent on the nature of the algorithm and ancillary data source, even relatively simple efforts to incorporate ancillary data into dasymetric population estimation typically result in significant improvements in population estimations over areal weighting (Langford, 2013; Zandbergen and Ignizio, 2010). Thus, urban analysts with even basic knowledge of GIS should be able to effectively implement and benefit from dasymetric mapping.¹

Acknowledgments
The author thanks the editors for their helpful comments in improving this article.

Author
Jeremy Mennis is a professor in the Department of Geography and Urban Studies at Temple University.

¹ The dasymetric mapping technique described here was implemented in two publicly available scripts for ArcGIS (Environmental Systems Research Institute, Inc.). The first was implemented by Rachel Sleeter at the U.S. Geological Survey (Sleeter and Gould, 2007) and is available at http://geography.wr.usgs.gov/science/dasymetric/. The second was implemented by Torrin Hultgren (Mennis and Hultgren, 2006) and is available at http://enviroatlas.epa.gov/enviroatlas/Tools/Dasymetrics.html. Responsibility for the use and application of the dasymetric mapping scripts and their products lies with the user.
References
Bhaduri, Budhendra, Edward Bright, Phillip Coleman, and Marie L. Urban. 2007. “LandScan USA: A High-Resolution Geospatial and Temporal Modeling Approach for Population Distribution and Dynamics,” GeoJournal 69 (1–2): 103–117.
Eicher, Cory L., and Cindy A. Brewer. 2001. “Dasymetric Mapping and Areal Interpolation: Implementation and Evaluation,” Cartography and Geographic Information Science 28: 125–138.
Goodchild, Michael F., and Nina S. Lam. 1980. “Areal Interpolation: A Variant of the Traditional Spatial Problem,” Geo-Processing 1: 297–312.
Gregory, Ian N., and Paul S. Ell. 2005. “Breaking the Boundaries: Integrating 200 Years of the Census Using GIS,” Journal of the Royal Statistical Society, Series A 168: 419–437.
Holt, James B., C.P. Lo, and Thomas W. Hodler. 2004. “Dasymetric Estimation of Population Density and Areal Interpolation of Census Data,” Cartography and Geographic Information Science 31: 103–121.
Langford, Mitchel. 2013. “An Evaluation of Small Area Population Estimation Techniques Using Open Access Ancillary Data,” Geographical Analysis 45: 324–344.
———. 2007. “Rapid Facilitation of Dasymetric-Based Population Interpolation by Means of Raster Pixel Maps,” Computers, Environment and Urban Systems 31: 19–32.
Leyk, Stefan, Nicholas N. Nagle, and Barbara P. Buttenfield. 2013. “Maximum Entropy Dasymetric Modeling for Demographic Small Area Estimation,” Geographical Analysis 45: 285–306.
Liu, X.H., Phaedon C. Kyriakidis, and Michael F. Goodchild. 2008. “Population-Density Estimation Using Regression and Area-to-Point Residual Kriging,” International Journal of Geographical Information Science 22 (4): 431–447.