
Approved for Public Release; Distribution Unlimited

Case #04-0985

MTR 04B0000017


Confirmation Bias in Complex Analyses

October 2004

Brant A. Cheikes†

Mark J. Brown

Paul E. Lehner

Leonard Adelman‡

Sponsor: MITRE Sponsored Research

Dept. No.: G062 Project No.: 51MSR114-A4

The views, opinions and/or findings contained in this report are those of The MITRE Corporation and should not be construed as an official Government position, policy, or decision, unless designated by other documentation.

©2004 The MITRE Corporation. All Rights Reserved.

Center for Integrated Intelligence Systems
Bedford, Massachusetts

† E-mail: bcheikes@mitre.org
‡ George Mason University

Abstract

Most research investigating the confirmation bias involves experimental tasks where subjects draw inferences from just a few items of evidence. These tasks are not representative of complex analysis tasks characteristic of law enforcement investigations, financial analysis and intelligence analysis. This study examines the confirmation bias in a more complex analysis task and evaluates a recommended procedure, called Analysis of Competing Hypotheses (ACH), designed to mitigate confirmation bias. Results indicate that participant assessment of new evidence was significantly impacted by beliefs they held at the time evidence was received. Evidence confirming current beliefs was given more “weight” than disconfirming evidence. However, current beliefs did not influence the assessment of whether an evidence item was confirming or disconfirming. ACH did reduce confirmation bias, but the effect was limited to participants without professional analysis experience.

Table of Contents

1 Introduction
2 Method
2.1 Participants
2.2 Procedures
2.2.1 Procedures for ACH Group
2.2.2 Procedures for the Non-ACH Group
2.3 Information Manipulation
3 Results
3.1 Anchoring Effect
3.2 Confirmation Bias

1 Introduction

Wickens and Hollands (2000, p. 312) define the confirmation bias as a tendency “for people to seek information and cues that confirm the tentatively held hypothesis or belief, and not seek (or discount) those that support an opposite conclusion or belief.” Klayman and Ha (1987) have correctly pointed out that “positive testing” may be the only strategy to obtain critical falsifications, for example, in cases where the hypothesis is not yet defined specifically enough to be falsified by one instance. However, the concern is that in cases where the hypotheses are well defined, the tendency for people to seek confirming information might result in “cognitive tunnel vision … in which operators fail to encode or process information that is contradictory to or inconsistent with the initially formulated hypothesis” (Wickens & Hollands, 2000, p. 312). This “… may be dangerous because potential risks and warning signals may be overlooked and, thus, decision fiascos may be the consequence” (Jonas, Schulz-Hardt, Frey, & Thelen, 2001, p. 557). Indeed, anecdotal evidence (e.g., the Senate Intelligence Committee Report, 2004) suggests that recent intelligence analysis failures may be due in part to confirmation bias.¹

¹ Although we note that such “anecdotal evidence” may itself be an instance of confirmation bias.

The concept of a confirmation bias was introduced by Wason (1960), who used a “rule identification task” such as the following (from Bazerman, 2002, p. 34):

Imagine that the sequence of three numbers (e.g., 2-4-6) follows a rule. Your task is to diagnose that rule by writing down another sequence of 3 numbers. Your instructor will tell you whether or not your sequence follows the correct rule.

The typical result in such tasks is that people tend to generate number sequences that are consistent with (or confirm) the rule that they think is the correct rule, such as 1-3-5 if one believes the rule is “numbers that go up by two.” People seldom generate sequences that try to disconfirm the rule, which in this study was “any three ascending numbers.”

In addition to rule identification tasks, three other types of conceptual tasks have routinely been used to study the confirmation bias. One has been the “trait hypothesis-testing paradigm” (Galinsky & Moskowitz, 2000, p. 398). In this paradigm, participants are given a narrative describing a person and then asked to decide whether the person described in the narrative possesses one or more traits (e.g., self-control) by either (a) selecting information that confirms or disconfirms the focal hypothesis, or (b) remembering previous information, some of which confirms or disconfirms the focal hypothesis. Another conceptual task is the “pseudo-diagnosticity” task (Evans et al., 2002, p. 32), originally proposed by Doherty, Mynatt, Tweney, and Schiavo (1979). In this task, participants are typically asked to indicate which of three pieces of data about two hypotheses they would select to answer a particular question, with the typical finding being that they request data about the focal hypothesis (confirmation bias) because it seems (incorrectly) most diagnostic in answering the question. The last type of conceptual task is the “scientific inquiry” task (Koslowski & Maqueda, 1993, p. 105), proposed by Mynatt, Doherty, and Tweney (1977). In this task, participants select a small number of science tests designed to generate data confirming or disconfirming their hypothesis.
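As an illustration of the rule identification task described above, the following Python sketch shows why sequences that confirm the believed rule cannot expose it as wrong. The two rules come from the text; the test triples and the helper name is_informative are assumptions made up for this illustration, not materials from any of the cited studies.

```python
from typing import Tuple

Triple = Tuple[int, int, int]


def true_rule(t: Triple) -> bool:
    """The experimenter's rule from the text: any three ascending numbers."""
    a, b, c = t
    return a < b < c


def believed_rule(t: Triple) -> bool:
    """The participant's tentative hypothesis: numbers that go up by two."""
    a, b, c = t
    return b - a == 2 and c - b == 2


def is_informative(t: Triple) -> bool:
    """A test can falsify the belief only if feedback contradicts its prediction."""
    return true_rule(t) != believed_rule(t)


# Hypothetical test sequences (assumptions for illustration only).
positive_tests = [(1, 3, 5), (10, 12, 14), (20, 22, 24)]  # fit the believed rule
negative_tests = [(1, 2, 3), (5, 10, 20), (6, 4, 2)]      # violate the believed rule

for t in positive_tests + negative_tests:
    feedback = "follows rule" if true_rule(t) else "breaks rule"
    print(t, feedback, "| can falsify belief:", is_informative(t))
```

In this sketch every sequence that fits “numbers that go up by two” also fits “any three ascending numbers,” so positive tests always receive confirming feedback. Only a sequence the participant expects to fail, such as 1-2-3, can reveal that the believed rule is too narrow.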

Research with all four types of tasks has used minimal data (e.g., less than 10 data items) that did not vary in interpretability or reliability, and sometimes not even in diagnosticity. This level of artificiality raises concerns as to whether confirmation bias is characteristic of more complex analysis tasks where there is substantial evidence and the evidence items vary greatly in interpretability, reliability and diagnosticity.

One effort to experimentally investigate confirmation bias in a more representative setting was Tolcott, Marvin, and Lehner (1989), who worked with Army tactical intelligence analysts. Working in teams of two, analysts were given an initial battlefield scenario and then asked to estimate the most likely avenue of approach (of three possible) for the enemy’s attack and their degree of confidence for it on a 0 to 100 scale. They were then given three rounds of incoming intelligence data. Each round contained 15 pieces of data, three supporting each of the two most likely avenues of approach, and nine being neutral. The analysts provided a new estimate and confidence level after each round. After the third round, they rated the degree to which each of the 45 pieces of intelligence data (presented during the three rounds) supported or contradicted the avenue of approach (hypothesis) they considered most likely, on a -2 to +2 scale. Tolcott et al. (1989, p. 606) found that “Regardless of initial hypothesis, confidence was generally high and tended to increase as the situation evolved. Confirming evidence was sought, and weighted significantly higher than disconfirming evidence. Contradictory evidence was usually recognized as disconfirming, but was weighted lower than supportive evidence, was often regarded as neutral, and sometimes as deliberatively deceptive.” Consistent with Wickens & Hollands (2000, pp. 311…), we refer to the Tolcott et al. findings that participants did not change their confidence in the initial hypothesis (even given evidence inconsistent with it) as representing an anchoring effect (or heuristic) and the greater weighting of confirming evidence as representing the confirmation bias.

This paper describes an experiment (1) to replicate the Tolcott et al. result that confirmation bias is manifest in complex analysis tasks, and if so, (2) to determine whether a procedure recommended for use in the intelligence analysis community (Heuer, 1999; Jones, 1998) successfully mitigates it.

The first goal of the experiment was to see if we could replicate the Tolcott et al. findings. Although confident, we were not certain that we would do so since (1) most confirmation-bias studies have used tasks that were conceptually different from the Tolcott et al. task and (2) there are a number of studies that failed to obtain (or mitigated) the supposedly ubiquitous “confirmation bias” implied in introductory texts (e.g., Bazerman, 2002). For example, Ayton (1992) reviewed studies mitigating the confirmation bias for a rule identification task; Galinsky & Moskowitz (2000) did so for a “trait hypothesis-testing” task; and Evans et al. (2002) did so for a “pseudo-diagnosticity” task.

Moreover, it was not clear if the Tolcott et al. differential-weight findings were at odds with recent research on “predecision distortions” in jury decision making by Carlson and Russo (2001). The latter used the same approach as Tolcott et al. of (1) presenting initial background information and obtaining a confidence rating for the most likely hypothesis (in their case, between plaintiff and defendant), then (2) presenting rounds of new evidence (three in favor of the plaintiff and three in favor of the defendant), and (3) obtaining participants’ ratings of the degree to which each evidence item supported the hypothesis (plaintiff or defendant). Carlson and Russo also found a significant relationship between participants’ initial confidence rating (“predecision”) and subsequent coding of new evidence (“distortion”). However, Carlson and Russo only measured distortion as the difference between a participant’s rating of evidence and an unbiased, mean rating of the evidence. Consequently, one cannot tell from their paper if participants’ initial confidence rating caused them to (a) completely reinterpret subsequent, disconfirming evidence (e.g., participants with a confidence rating favoring the plaintiff rated subsequent evidence favoring the defendant as actually favoring the plaintiff) or (b) simply give the evidence a lower rating (e.g., one still favoring the defendant), thereby giving it less weight, as Tolcott et al. found, before making their final decision. The current study distinguishes between evidence reinterpretation and weighting.
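The difference between reinterpretation and weighting can be made concrete with a small Python sketch. It assumes a signed rating scale on which positive values favor the participant's currently preferred hypothesis and negative values oppose it. The Rating class, the classify function, and all numeric values are hypothetical illustrations, not data or procedures from Carlson and Russo or from this study.

```python
from dataclasses import dataclass


@dataclass
class Rating:
    item: str
    baseline: float  # assumed unbiased mean rating; > 0 favors the preferred hypothesis
    observed: float  # the participant's own rating on the same signed scale


def distortion(r: Rating) -> float:
    """Distortion as a single difference score: observed minus unbiased rating."""
    return r.observed - r.baseline


def classify(r: Rating) -> str:
    """Separate the two mechanisms that a difference score alone conflates."""
    if r.baseline >= 0:
        return "not disconfirming"
    if r.observed > 0:
        return "reinterpreted"   # sign flipped: now read as supporting
    if r.observed > r.baseline:
        return "down-weighted"   # still opposing (or neutral), but weaker
    return "no distortion"


# Hypothetical ratings (assumptions for illustration only).
ratings = [
    Rating("E1", baseline=-1.5, observed=1.0),
    Rating("E2", baseline=-1.5, observed=-0.5),
    Rating("E3", baseline=-1.5, observed=-1.5),
    Rating("E4", baseline=1.0, observed=1.5),
]

for r in ratings:
    print(r.item, "distortion =", distortion(r), "->", classify(r))
```

Under these assumptions, items E1 and E2 both show a positive distortion score, yet only E1 has changed sign. A study that reports the difference score alone cannot tell these two cases apart.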

The second goal of the study was to test the effectiveness of a procedure, called Analysis of Competing Hypotheses (ACH), proposed by Heuer (1999) and Jones (1998) to minimize or eliminate characteristics of the confirmation bias. Although ACH has eight steps, the approach revolves around developing a “hypothesis testing matrix,” where the rows represent the evidence, the columns the hypotheses under consideration, and the cells the extent to which each piece of evidence is consistent or inconsistent with each hypothesis. The goals of the ACH matrix are to overcome the memory limitations affecting one’s ability to keep multiple data and hypotheses in mind, and to break the tendency to focus on developing a single coherent story for explaining the evidence, a tendency which Carlson & Russo (2001) hypothesized creates predecision distortions (and presumably the confirmation bias). ACH is hypothesized to offset confirmation bias by ensuring that analysts actively rate evidence against multiple hypotheses and reminding analysts to focus on disconfirming evidence. However, the only experiment testing the effectiveness of ACH found mixed results (Folker, 1999): it helped intelligence analysts identify the correct answer to one problem, but not another. (We note that both problems had fewer than 20 evidence items.) No experiment has directly tested ACH’s ability to mitigate the confirmation bias.
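A hypothesis testing matrix of the kind described above can be sketched as a small data structure. The following Python example is a minimal illustration under stated assumptions: the evidence and hypothesis labels, the three-level coding (+1 consistent, 0 neutral, -1 inconsistent), and the summary by counting inconsistent items are all choices made for this sketch, not the eight-step procedure itself or the materials used in this experiment.

```python
from typing import Dict, List

# Columns of the matrix: the hypotheses under consideration (hypothetical labels).
HYPOTHESES: List[str] = ["H1", "H2", "H3"]

# Rows of the matrix: one entry per evidence item (hypothetical labels).
# Cell coding is an assumption: +1 consistent, 0 neutral, -1 inconsistent.
matrix: Dict[str, Dict[str, int]] = {
    "E1": {"H1": +1, "H2": +1, "H3": -1},
    "E2": {"H1": +1, "H2": -1, "H3": -1},
    "E3": {"H1": 0, "H2": +1, "H3": -1},
    "E4": {"H1": -1, "H2": +1, "H3": 0},
    "E5": {"H1": +1, "H2": +1, "H3": +1},
}


def inconsistency_counts(m: Dict[str, Dict[str, int]],
                         hypotheses: List[str]) -> Dict[str, int]:
    """Count, per hypothesis, the evidence items rated as inconsistent with it."""
    return {h: sum(1 for row in m.values() if row[h] < 0) for h in hypotheses}


def diagnostic_items(m: Dict[str, Dict[str, int]]) -> List[str]:
    """Keep only evidence whose rating differs across hypotheses."""
    return [e for e, row in m.items() if len(set(row.values())) > 1]


print("Inconsistent items per hypothesis:", inconsistency_counts(matrix, HYPOTHESES))
print("Diagnostic evidence:", diagnostic_items(matrix))
```

Filling in every cell forces each evidence item to be rated against each hypothesis rather than only the favored one. Summarizing by inconsistent items keeps attention on disconfirming evidence, and an item such as E5, which is equally consistent with every hypothesis, drops out as non-diagnostic.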

2 Method

This section describes the participants, procedures, and information manipulation used to conduct the experiment.

2.1 Participants

Twenty-four (24) employees of a large research and development corporation volunteered to participate in an experiment evaluating structured argumentation methods. Twenty were male, four female. All participants were interested in intelligence analysis, with 12 of the participants having intelligence analysis experience (ranging from 1 to 18 years, with a median of 9.5 years). Participants’ ages ranged from 27 to 63, with a median of 47.5. Of the 23 participants who indicated their education, 22 had completed college, with 12 having a master’s degree, three a Ph.D., and one an M.D. Sixteen of the participants majored in math, physics, computer science or engineering.

2.2 Procedures

The entire experiment was conducted via email, and all data was collected within a two-month period. Participants were randomly assigned so that there were 12 in the ACH condition and 12 in the non-ACH condition. All participants started with an email providing a general description of what they would do and a request to complete all materials within one sitting, which was estimated to be (and was) two hours or less. The specific procedures for the ACH and non-ACH groups are described next.
