Moving to Opportunity, Volume 14, Number 2 • 2012
U.S. Department of Housing and Urban Development | Office of Policy Development and Research
The weekly newsletter shared production statistics for the overall project with the data collection staff, and the weekly team calls reviewed team production and efficiency statistics. The NBER research team was fully engaged in monitoring production and worked closely with the ISR project managers and field production managers to review progress, set priorities, identify issues, and develop and implement solutions to problems as they arose.
Cityscape • Gebler, Gennetian, Hudson, Ward, and Sciandra

Survey Data Collection Costs

The ISR project managers and NBER research team spent a great deal of time discussing response rate goals and the options for end-game strategies to reach the NBER research team’s ambitious goal of 89- to 90-percent ERRs while staying within budget. These debates often took the form of assessing the tradeoff between the high per-interview cost of achieving the last few response rate percentage points and the benefit to the MTO long-term survey of decreasing nonresponse bias in MTO’s estimated effects. When looking at costs, ISR project managers focused on two components:
(1) the variable costs associated with obtaining each interview (interviewer hours and nonsalary charges such as mileage, respondent incentive payments, and so on) and (2) fixed costs (for example, for the ISR central office staff to support data collection, project management, creating and checking statistical reports, writing progress reports and weekly memos) that were less dependent on the number of interviews being completed each week. Although the central office staff size decreased along with the field staff size as the study neared completion, the fixed cost per interview increased at the end of the study because those costs were amortized across fewer and fewer completed interviews.
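The fixed-cost amortization effect described above can be illustrated numerically. The figures and the helper function below are invented for illustration; they are not the study’s actual accounting:

```python
# Illustrative only: how a roughly constant weekly fixed cost raises the
# average cost per interview as weekly completions fall. All dollar figures
# here are invented; they are not the MTO study's actual cost breakdown.

def avg_cost_per_interview(variable_cost, weekly_fixed_cost, interviews_per_week):
    """Variable cost per interview plus the week's fixed cost spread over completions."""
    return variable_cost + weekly_fixed_cost / interviews_per_week

peak = avg_cost_per_interview(300, 17_000, 100)   # high-volume week
endgame = avg_cost_per_interview(300, 10_000, 10)  # few completions: fixed share balloons
print(peak, endgame)  # fixed share per interview: $170 at peak vs. $1,000 at the end
```

Even with total fixed costs falling (here from $17,000 to $10,000 per week), the fixed cost borne by each interview rises sharply once completions dwindle, which is the pattern the study observed as fielding wound down.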
Exhibits 2 and 3 show relatively distinct fluctuation points when survey data collection costs escalated. Exhibit 2 presents average costs per interview throughout MTO survey fielding. Note that these costs are not necessarily real-time costs, because of the delay between the actual interviews and when those costs were entered from an accounting perspective. Nonetheless, average costs per interview were quite steady at about $470 from July 2008 through July 2009, at which point most of the fresh sample from all three releases had been worked relatively thoroughly. Average interview costs increased substantially in the fall of 2009, to $802 per interview from August through October 2009 and to $1,076 per interview for November and December 2009. These costs are not surprising in light of the anticipated work it would take to complete interviews with
the hardest-to-locate cases. The two-stage subsampling strategy was in place for each site at this point. Average monthly costs per interview escalated starting in December 2009 to about $1,600.
Exhibit 3 presents an alternative but key metric of costs: total interviewer hours and hours spent by travel interviewers (that is, interviewers who were shared across sites or called on to travel to zones outside of the immediate area of the five initial MTO sites) and trackers (that is, those ISR staff who located sample members using a variety of web, in-person, and alternative techniques, as described previously). Notably, December 2009 represented a key point at which ISR project managers, field supervisors of the data collection staff, and the NBER research team very seriously evaluated the costs and benefits of continuing survey data collection. October 2009 represents another cost inflection point, at which time ISR had implemented the two-stage subsampling strategy at many of the sites.
As previously mentioned, the team had to balance meeting the high overall effective response rate with creating a completed survey sample that was balanced by site and by treatment status. Exhibit 4 shows how costs and survey completion rates varied by site. Baltimore reached its ERR targets ahead of schedule, whereas Boston and Los Angeles had slower completion rates, in part because of field staff constraints. This exhibit exemplifies the site-based balancing act that ISR project managers and field supervisors of the data collection staff faced in attempting to meet the overall ERR target. The tension between approaches to meeting a high overall ERR target, whether by triaging the hardest sample within each site or by focusing efforts on one site at a time, was a key balancing act that, as we discuss in the following section, influenced the study’s main findings.
MTO’s Effects Under Varying ERRs

The average cost per interview from the beginning of survey fielding (June 2008) through October 2009 was about $500. As mentioned previously, average interview costs jumped substantially thereafter: to almost $1,100 in November and December 2009 and to more than $1,600 from December 2009 through April 2010, when survey fielding ended. The adult survey ERR increased by 5.3 percentage points (285 interviews) between October 2009 and December 2009 and by 3.2 percentage points (97 interviews) between December 2009 and April 2010. These increases roughly translate to $58,000 per 1-percentage-point gain in ERR (calculated as the cost per interview times the number of interviews completed, divided by the percentage-point gain) between October and December 2009 and $49,000 per 1-percentage-point gain in ERR between December 2009 and April 2010. Did the additional dollars spent toward gaining 1 extra percentage point in the ERR measurably or qualitatively alter the main conclusions in the final impacts evaluation (Sanbonmatsu et al., 2011)? We do not conduct a formal cost-benefit analysis but rather employ a simple back-of-the-envelope comparison of survey data collection costs at various time points during survey fielding, using a small number of metrics to capture the benefit via the MTO demonstration’s contribution to research and policy.
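The back-of-the-envelope arithmetic can be reproduced directly. The per-interview costs, interview counts, and ERR gains are the figures reported in the text; the helper function name is ours:

```python
# Cost per 1-percentage-point gain in ERR:
# (cost per interview x interviews completed) / percentage-point gain in ERR.
# Input figures are those reported in the article for the MTO adult survey.

def cost_per_err_point(cost_per_interview, interviews, err_gain_pp):
    return cost_per_interview * interviews / err_gain_pp

# October-December 2009: ~$1,076 per interview, 285 interviews, +5.3 pp ERR
oct_dec = cost_per_err_point(1076, 285, 5.3)
# December 2009-April 2010: ~$1,600 per interview, 97 interviews, +3.2 pp ERR
dec_apr = cost_per_err_point(1600, 97, 3.2)
print(f"${oct_dec:,.0f} and ${dec_apr:,.0f} per ERR point")  # roughly $58,000 and $49,000
```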
Achieving MTO’s High Effective Response Rates: Strategies and Tradeoffs

For our back-of-the-envelope analysis, we focused on two representative thresholds that align with observable fluctuations in survey data collection costs: October 2009, when the survey achieved an 81-percent ERR for adults (80 percent for youth), and December 2009, when the survey achieved an 86-percent ERR for adults (85 percent for youth). These ERRs, and the dates when they were achieved, also have general appeal. First, we did not want to confound our analyses with the cost efficiencies gained through the two-stage subsampling strategy, triggered at roughly a 75-percent ERR, which allocated more resources per randomly selected hard-to-locate case. Second, the ERRs at these cut points generally represent the range of ERRs normally achieved in a wide variety of survey data collection efforts (80 to 90 percent). We use these cut points as simulated dates at which survey fielding ended to construct new samples, reestimate MTO’s effects, and examine whether a qualitative difference emerged in three factors: the size of the intention-to-treat (ITT) estimate, the precision of that estimate, and the depiction of the control group. These metrics are of scientific and policy interest; that is, they help inform the following questions:
Would our description of the status of the sample have changed had we ended the survey fielding period early? Would our confidence in MTO’s effect have changed? Would our interpretation of the program or policy influence on the outcome of interest have changed?
We reanalyzed MTO’s effects (for more explanation about the ITT and treatment-on-the-treated [TOT] estimates, see Gennetian et al., 2012; Ludwig, 2012; and Sanbonmatsu et al., 2012) under varying ERR scenarios—those mapped with the December 2009 and October 2009 cut points—in the following manner. First, we replicated the MTO ITT and TOT results for the outcome of interest for the completed MTO long-term survey. Recall that the final ERRs were 90 percent for adults and 89 percent for youth. We then compared the full-sample ITT and TOT estimates with ITT and TOT estimates reanalyzed using the following strategies: (1) using data from the completed pooled sample as of December 31, 2009, reflecting an overall 86-percent ERR for adults (85 percent for youth), and (2) using data from the completed overall sample as of either December 31, 2009 (when some sites, such as Baltimore, achieved something greater than 86-percent ERR), or the date at which the site achieved the equivalent ERR of 86 percent for adults (85 percent for youth). Strategy 2 recognizes the heterogeneity in ERR completion rates by site, whereas strategy 1 is relatively agnostic about site and instead focuses on the pooled ERR. The ERR target for the MTO long-term study was set for the entire MTO survey sample, with an important but secondary target to have a relatively representative sample from each site. In reality, ISR’s site-based field staff strategy, coupled with other factors—such as difficulty recruiting or retaining interviewers and the relative ease of finding sample for geographic or comparable reasons—meant that some sites achieved ERR targets faster than others. 
The variation in site-based survey data completion rates also implies that the date of the last completed interview will vary by site.7 We replicated strategies 1 and 2 under a slightly altered assumption of stopping survey fielding as of October 31, 2009, reflecting an overall 81-percent ERR for adults (80 percent for youth). Note in this latter case that Boston, New York, and Los Angeles were just shy of achieving the 81-percent ERR by October 2009.

7 Additional analyses that created a sample based on these ERR targets within treatment or control group did not uncover qualitative differences from the final sample results. This result was expected, in part, because by construction the NBER research team and ISR project managers carefully monitored temporal balance by treatment or control group; that is, they checked that roughly equivalent numbers of interviews were being completed for experimental, Section 8, and control group members in any one week or month and, if that was not the case, made adjustments in real time by flagging and prioritizing work on selected respondents.
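The reanalysis logic can be sketched generically: with random assignment, the ITT effect is the regression coefficient on the treatment-group indicator, and the TOT estimate rescales it by the treatment take-up rate (a Bloom adjustment). The code below uses simulated data and assumed variable names; it is not the MTO estimation code, which also includes baseline covariates and survey weights:

```python
import numpy as np

# Generic ITT/TOT sketch on simulated data (not the MTO estimation code).
# The 47-percent lease-up rate and the -0.2 take-up effect are assumptions.
rng = np.random.default_rng(42)
n = 5000
treat = rng.integers(0, 2, n)                  # 1 = offered a voucher
takeup = treat * (rng.random(n) < 0.47)        # lease-up only among the offered
outcome = -0.2 * takeup + rng.normal(size=n)   # effect operates only through take-up

# ITT: mean difference between assigned groups, via OLS on the assignment dummy.
X = np.column_stack([np.ones(n), treat])
(_, itt), *_ = np.linalg.lstsq(X, outcome, rcond=None)

# TOT (Bloom adjustment): ITT divided by the compliance (lease-up) rate.
tot = itt / takeup[treat == 1].mean()
print(itt, tot)
```

Truncating the sample to interviews completed by a simulated stop date (strategy 1, one pooled cutoff; strategy 2, per-site cutoffs) amounts to filtering the rows entering `X` and `outcome` before running the same regression.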
Exhibits 5, 6, and 7 visually present results for a few of the outcomes and are the focus of our discussion. (Exhibits 8, 9, and 10 at the end of the article provide more detail.) The results shown in the exhibits suggest that the final MTO findings would have differed qualitatively for some of the important outcomes if survey fielding had stopped earlier. For example, if survey fielding had stopped at 81 percent for adults—either pooled or by site—the NBER research team would have falsely concluded that MTO had no effect on two of the four health outcomes. Of the four survey outcome measures for female youth, we would have falsely concluded that MTO had no effect on female youth mental health and that MTO increased female youth idleness (neither employed nor in school).
When examining MTO’s effects on neighborhood poverty, exhibit 5 (and the first outcome in the more detailed exhibit 8) suggests little qualitative difference in MTO’s effects across the various ERR assumptions, either through the size of the effect, the precision of the effect, or the description of the control group. Turning to MTO effects on other outcomes, exhibits 6 and 7 illustrate a slightly different pattern of results. MTO effects on adult psychological distress are very slightly larger (that is, larger reductions in psychological distress) at an 86-percent ERR. The differences are magnified when comparing MTO’s effects for the final sample (90-percent ERR) with an 81-percent ERR.
The difference between the analyses at the 86-percent ERR and at the 81-percent ERR is especially pronounced for the within-site ERR adjustments; if survey fielding had stopped when each site reached an 81-percent ERR, MTO’s effect on adult psychological distress, at -0.128, would have been 21 percent larger than the effect for the final sample, at -0.107. On the other hand, if fielding had stopped at an overall ERR of 81 percent, with variation in ERR by site, MTO effects would have been qualitatively very similar to those estimated at the final 90-percent ERR. Exhibit 7 (and the fifth outcome in the more detailed exhibit 9) suggests a similar, yet even more striking, pattern for female youth: MTO’s effects on female youth psychological distress, at -0.116 for the full sample (89-percent ERR), is qualitatively larger and more precisely measured than estimates measured at overall ERRs of 85 percent (-0.084 ITT) or 80 percent (-0.050 ITT).
Thus, taking the female youth psychological distress outcome as a starting place, the roughly $460,000 expended to achieve the last 8.7 percentage points in ERR (between November 2009 and April 2010) translated to a 43-percent difference in the effect estimate.
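The relative comparisons above can be checked against the reported point estimates. The values below are copied from the text; small discrepancies with the published percentages presumably reflect rounding of the estimates:

```python
# Adult psychological distress ITT estimates reported in the text.
final_sample = -0.107    # final sample, 90-percent ERR
within_site_81 = -0.128  # stopping when each site reached an 81-percent ERR

pct_larger = (abs(within_site_81) - abs(final_sample)) / abs(final_sample) * 100
print(f"{pct_larger:.0f}% larger")  # ~20%; the text reports 21%, likely from unrounded estimates
```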
Discussion and Conclusion

Several strategies contributed to achieving the high response rate goals set for the MTO long-term survey, including selecting and training a data collection team that was well equipped to work in a challenging environment and having staff who understood (and were motivated by) the importance of the MTO demonstration. Starting with a small team and bringing on additional staff after the demonstration started, although not in the original plan, turned out to be very beneficial for a demonstration as complex and difficult as MTO. The close collaboration between the ISR and NBER teams, effective communication between and across the ISR data collection staff, and a solid management structure were also keys to the success of the field effort.