FREE ELECTRONIC LIBRARY - Theses, dissertations, documentation


«Test-task validation has been an important strand in recent revision projects for University of Cambridge Local Examinations Syndicate (UCLES) ...»


There are still some problems in items such as ‘staging’ and ‘describing’, and feedback from participants suggests that this may be due to misunderstandings or misinterpretations of the gloss and examples used. In addition, there are some similar difficulties with the initial three items in the interactional functions checklist, in which the greatest difficulties in applying the checklists appear to lie.

VII Discussion and initial conclusions

The results of this study appear to substantiate our belief that, although still under development for use with the UCLES Main Suite examinations, an operational version of these checklists is certainly feasible, and has potentially wider application, mutatis mutandis, to the content validation of other spoken language tests. Further refinement of the checklists is clearly required, although the developmental process adopted here appears to have borne positive results.

1 Validities

We would not wish to claim that the checklists on their own offer a satisfactory demonstration of the construct validity of a spoken language test, for, as Messick argues (1989: 16): ‘the varieties of evidence supporting validity are not alternatives but rather supplements to one another.’ We recognize the necessity for a broad view of ‘the evidential basis for test interpretation’ (Messick, 1989: 20). Bachman (1990: 237) similarly concludes: ‘it is important to recognise that none of these [evidences of validity] by itself is sufficient to demonstrate the validity of a particular interpretation or use of test scores’ (see also Bachman, 1990: 243). Fulcher (1999: 224) adds a further caveat against an overly narrow interpretation of content validity when he quotes Messick (1989: 41):

the major problem is that so-called content validity is focused upon test forms rather than test scores, upon instruments rather than measurements... selecting content is an act of classification, which is in itself a hypothesis that needs to be confirmed empirically.

46 Validating speaking-test tasks

Like these authors, we regard as inadequate any conceptualization of validity that does not involve the provision of evidence on a number of levels, but would argue strongly that without a clear idea of the match between intended content and actual content, any comprehensive investigation of the construct validity of a test is built on sand.

Defining the construct is, in our view, underpinned by establishing the nature of the actual performances elicited by test tasks, i.e. the true content of tasks.

2 Present and future applications of observational checklists

Versions of the checklists require a degree of training and practice similar to that given to raters if a reliable and consistent outcome is to be expected. This requires that standardized training materials be developed alongside the checklists. In the case of these checklists, this process has already begun with the initial versions piloted during Phase 3 of the project.

The checklists have great potential as an evaluative tool and can provide comprehensive insight into various issues. It is hoped that, amongst other issues, the checklists will provide insights into the following:

· the language functions that the different task-types (and different sub-tasks within these) employed in the UCLES Main Suite Paper 5 (Speaking) Tests typically elicit;

· the language that the pair-format elicits, and how it differs in nature and quality from that elicited by interlocutor-single candidate testing;

· the extent to which there is functional variation across the top four levels of the UCLES Main Suite Spoken Language Test.

In addition to these issues, the way in which the checklists can be applied may allow for other important questions to be answered. For example, by allowing the evaluator multiple observations (stopping and starting a recording of a test at will), it will be possible to establish whether there are quantifiable differences in the language functions generated by the different tasks; i.e., the evaluators will have the time they need to make frequency counts of the functions.
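The frequency counts described above could be tallied along the following lines. This is only a minimal sketch under stated assumptions: the observation log, the task labels and the function labels are hypothetical illustrations, not data or categories taken from the study.

```python
from collections import Counter

# Hypothetical observation log: one (task, function) pair for every time
# an evaluator ticks a function on the checklist while replaying a recording.
# Task and function labels are illustrative assumptions only.
observations = [
    ("Task 1", "describing"),
    ("Task 1", "describing"),
    ("Task 1", "staging"),
    ("Task 4", "agreeing"),
    ("Task 4", "describing"),
]


def function_counts_by_task(obs):
    """Return a dict mapping each task to a Counter of observed functions."""
    counts = {}
    for task, function in obs:
        counts.setdefault(task, Counter())[function] += 1
    return counts


counts = function_counts_by_task(observations)

# Print a simple per-task frequency listing.
for task, counter in sorted(counts.items()):
    for function, n in sorted(counter.items()):
        print(f"{task}\t{function}\t{n}")
```

Because a Counter returns zero for anything never observed, functions that a task failed to elicit can be queried directly without special handling.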

While the results to date have focused on a posteriori validation procedures, these checklists are also relevant to task design. By taking into account the expected response of a task (and by describing that response in terms of these functions) it will be possible to explore predicted and actual test task outcome. It will also be a useful guide for item writers in taking a priori decisions about content coverage.

Barry O’Sullivan, Cyril J. Weir and Nick Saville

Through this approach it should be possible to predict more accurately linguistic response (in terms of the elements of the checklists) and to apply this to the design of test tasks – and of course to evaluate the success of the prediction later on. In the longer term this will lead to a greater understanding of how tasks and task formats can be manipulated to result in specific language use. We are not claiming that it is possible to predict language use at a micro level (grammatical form or lexis), but that it is possible to predict informational and interactional functions and features of interaction management – a notion supported by Bygate (1999).
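One way to picture the comparison of predicted and actual task outcome is as a set comparison over checklist items. The sketch below is an illustrative assumption rather than the authors' procedure: the function name compare_prediction and all the function labels are hypothetical.

```python
def compare_prediction(predicted, observed):
    """Compare the functions a task designer expected a task to elicit
    with the functions an evaluator actually ticked on the checklist."""
    predicted, observed = set(predicted), set(observed)
    return {
        # Predicted by the designer and seen in performance.
        "confirmed": sorted(predicted & observed),
        # Predicted by the designer but never observed.
        "not_elicited": sorted(predicted - observed),
        # Observed in performance but not anticipated.
        "unpredicted": sorted(observed - predicted),
    }


# Hypothetical labels, for illustration only.
result = compare_prediction(
    predicted=["describing", "agreeing", "staging"],
    observed=["describing", "staging", "speculating"],
)
print(result)
```

Keeping the three categories separate lets an item writer see both where a task under-delivered and where it elicited language that was not planned for.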

The checklists should also enable us to explore how systematic variation in such areas as interviewer questioning behaviour (and interlocutor frame adherence) affects the language produced in this type of test. In the interview transcribed for this study, for example, the examiner directed his questions very deliberately (systematically aiming the questions at one participant and then the other). This tended to stifle any spontaneity in the intended three-way discussion (Task 4), so occurrences of Interactional and Discourse Management Functions did not materialize to the extent intended by the task designers. It is possible that a less deliberate (unscripted) questioning technique would lead to a less interviewer-oriented interaction pattern and allow for the more genuine interactive communication envisaged in the task design.

Perhaps the most valuable contribution that this type of validation procedure offers is its potential to improve the quality of oral assessment in both low-stakes and high-stakes contexts. By offering the investigator an instrument that can be used in real time, the checklists broaden the scope of investigation from limited case study analysis of small numbers of test transcripts to large-scale field studies across a wide range of testing contexts.


We would like to thank Don Porter and Rita Green for their early input into the first version of the checklist. In addition, help was received from members of the ELT division in UCLES, in particular from Angela ffrench, Lynda Taylor and Christina Rimini, from a group of UCLES Senior Team Leaders and from MA TEFL students at the University of Reading. Finally, we would like to thank the editors and anonymous reviewers of Language Testing for their insightful comments and helpful suggestions for its improvement. The faults that remain are, as ever, ours.


VIII References

Anastasi, A. 1988: Psychological testing. 6th edition. New York: Macmillan.

Bachman, L.F. 1990: Fundamental considerations in language testing. Oxford: Oxford University Press.

Bachman, L.F. and Palmer, A.S. 1981: The construct validation of the FSI oral interview. Language Learning 31, 67–86.

—— 1996: Language testing in practice. Oxford: Oxford University Press.

Ballman, T.L. 1991: The oral task of picture description: similarities and differences in native and nonnative speakers of Spanish. In Teschner, R.V., editor, Assessing foreign language proficiency of undergraduates. AAUSC Issues in Language Program Direction. Boston: Heinle and Heinle, 221–31.

Brown, A. 1998: Interviewer style and candidate performance in the IELTS oral interview. Paper presented at the Language Testing Research Colloquium, Monterey, CA.

Bygate, M. 1988: Speaking. Oxford: Oxford University Press.

—— 1999: Quality of language and purpose of task: patterns of learners’ language on two oral communication tasks. Language Teaching Research 3, 185–214.

Chalhoub-Deville, M. 1995a: Deriving oral assessment scales across different tests and rater groups. Language Testing 12, 16–33.

—— 1995b: A contextualized approach to describing oral language proficiency. Language Learning 45, 251–81.

Clark, J.L.D. 1979: Direct vs. semi-direct tests of speaking ability. In Briere, E.J. and Hinofotis, F.B., editors, Concepts in language testing: some recent studies. Washington DC: TESOL.

—— 1988: Validation of a tape-mediated ACTFL/ILR scale based test of Chinese speaking proficiency. Language Testing 5, 187–205.

Clark, J.L.D. and Hooshmand, D. 1992: ‘Screen to Screen’ testing: an exploratory study of oral proficiency interviewing using video teleconferencing. System 20, 293–304.

Cronbach, L.J. 1971: Validity. In Thorndike, R.L., editor, Educational measurement. 2nd edition. Washington DC: American Council on Education, 443–597.

—— 1990: Essentials of psychological testing. 5th edition. New York: Harper & Row.

Davies, A. 1977: The construction of language tests. In Allen, J.P.B. and Davies, A., editors, Testing and experimental methods. The Edinburgh Course in Applied Linguistics, Volume 4. London: Oxford University Press, 38–194.

—— 1990: Principles of language testing. Oxford: Blackwell.

Ellerton, A.W. 1997: Considerations in the validation of semi-direct oral testing. Unpublished PhD thesis, CALS, University of Reading.

ffrench, A. 1999: Language functions and UCLES speaking tests. Seminar in Athens, Greece. October 1999.

Foster, P. and Skehan, P. 1996: The influence of planning and task type on second language performance. Studies in Second Language Acquisition 18, 299–323.

—— 1999: The influence of source of planning and focus of planning on task-based performance. Language Teaching Research 3, 215–47.

Fulcher, G. 1994: Some priority areas for oral language testing. Language Testing Update 15, 39–47.

—— 1996: Testing tasks: issues in task design and the group oral. Language Testing 13, 23–51.

—— 1999: Assessment in English for academic purposes: putting content validity in its place. Applied Linguistics 20, 221–36.

Hayashi, M. 1995: Conversational repair: a contrastive study of Japanese and English. MA Project Report, University of Canberra.

Henning, G. 1983: Oral proficiency testing: comparative validities of interview, imitation, and completion methods. Language Learning 33, 315–32.

—— 1987: A guide to language testing. Cambridge, MA: Newbury House.

Kelly, R. 1978: On the construct validation of comprehension tests: an exercise in applied linguistics. Unpublished PhD thesis, University of Queensland.

Kenyon, D. 1995: An investigation of the validity of task demands on performance-based tests of oral proficiency. In Kunnan, A.J., editor, Validation in language assessment: selected papers from the 17th Language Testing Research Colloquium, Long Beach. Mahwah, NJ: Lawrence Erlbaum, 19–40.

Kormos, J. 1999: Simulating conversations in oral-proficiency assessment: a conversation analysis of role plays and non-scripted interviews in language exams. Language Testing 16, 163–88.

Lazaraton, A. 1992: The structural organisation of a language interview: a conversational analytic perspective. System 20, 373–86.

—— 1996: A qualitative approach to monitoring examiner conduct in the Cambridge assessment of spoken English (CASE). In Milanovic, M. and Saville, N., editors, Performance testing, cognition and assessment: selected papers from the 15th Language Testing Research Colloquium, Cambridge and Arnhem. Studies in Language Testing 3. Cambridge: University of Cambridge Local Examinations Syndicate, 18–33.

—— 2000: A qualitative approach to the validation of oral language tests. Studies in Language Testing, Volume 14. Cambridge: Cambridge University Press.

Lumley, T. and O’Sullivan, B. 2000: The effect of speaker and topic variables on task performance in a tape-mediated assessment of speaking. Paper presented at the 2nd Annual Asian Language Assessment Research Forum, The Hong Kong Polytechnic University.

Luoma, S. 1997: Comparability of a tape-mediated and a face-to-face test of speaking: a triangulation study. Unpublished Licentiate Thesis, Centre for Applied Language Studies, Jyväskylä University, Finland.

McNamara, T. 1996: Measuring second language performance. London:


Mehnert, U. 1998: The effects of different lengths of time for planning on second language performance. Studies in Second Language Acquisition 20, 83–108.

Messick, S. 1975: The standard problem: meaning and values in measurement and evaluation. American Psychologist 30, 955–66.

—— 1989: Validity. In Linn, R.L., editor, Educational measurement. 3rd edition. New York: Macmillan.

Milanovic, M. and Saville, N. 1996: Introduction. Performance testing, cognition and assessment. Studies in Language Testing, Volume 3. Cambridge: University of Cambridge Local Examinations Syndicate, 1–17.

Moller, A.D. 1982: A study in the validation of proficiency tests of English as a Foreign Language. Unpublished PhD thesis, University of Edinburgh.

Norris, J., Brown, J.D., Hudson, T. and Yoshioka, J. 1998: Designing second language performance assessments. Technical Report 18. Honolulu, HI: University of Hawaii Press.

O’Loughlin, K. 1995: Lexical density in candidate output on direct and semi-direct versions of an oral proficiency test. Language Testing 12, 217–37.

—— 1997: The comparability of direct and semi-direct speaking tests: a case study. Unpublished PhD Thesis, University of Melbourne, Melbourne.

—— 2001: An investigatory study of the equivalence of direct and semi-direct speaking skills. Studies in Language Testing 13. Cambridge: Cambridge University Press/UCLES.

Ortega, L. 1999: Planning and focus on form in L2 oral performance. Studies in Second Language Acquisition 21, 109–48.
