«Test-task validation has been an important strand in recent revision projects for University of Cambridge Local Examinations Syndicate (UCLES) ...»
O’Sullivan, B. 2000: Towards a model of performance in oral language testing. Unpublished PhD dissertation, CALS, University of Reading.
Robinson, P. 1995: Task complexity and second language narrative discourse. Language Learning 45, 99–140.
Ross, S. and Berwick, R. 1992: The discourse of accommodation in oral proficiency interviews. Studies in Second Language Acquisition 14, 159–76.
Saville, N. and Hargreaves, P. 1999: Assessing speaking in the revised FCE. ELT Journal 53, 42–51.
Schegloff, E., Jefferson, G. and Sacks, H. 1977: The preference for self-correction in the organisation of repair in conversation. Language 53, 361–82.
Schwartz, J. 1980: The negotiation for meaning: repair in conversations between second language learners of English. In Larsen-Freeman, D.,
editor, Discourse analysis in second language research. Rowley, MA:
Shohamy, E. 1983: The stability of oral language proficiency assessment in the oral interview testing procedure. Language Learning 33, 527–40.
—— 1988: A proposed framework for testing the oral language of second/foreign language learners. Studies in Second Language Acquisition 10, 165–79.
Barry O’Sullivan, Cyril J. Weir and Nick Saville
—— 1994: The validity of direct versus semi-direct oral tests. Language Testing 11, 99–123.
Shohamy, E., Reves, T. and Bejarano, Y. 1986: Introducing a new comprehensive test of oral proficiency. ELT Journal 40, 212–20.
Skehan, P. 1996: A framework for the implementation of task-based instruction. Applied Linguistics 17, 38–62.
—— 1998: A cognitive approach to language learning. Oxford: Oxford University Press.
Stansfield, C.W. and Kenyon, D.M. 1992: Research on the comparability of the oral proficiency interview and the simulated oral proficiency interview. System 20, 347–64.
Stenström, A. 1994: An introduction to spoken interaction. London: Longman.
Suhua, H. 1998: A communicative test of spoken English for the CET 6. Unpublished PhD thesis, Shanghai Jiao Tong University, Shanghai.
Upshur, J.A. and Turner, C. 1999: Systematic effects in the rating of second-language speaking ability: test method and learner discourse. Language Testing 16, 82–111.
van Ek, J.A. and Trim, J.L.M., editors, 1984: Across the threshold.
van Lier, L. 1989: Reeling, writhing, drawling, stretching, and fainting in coils: oral proficiency interviews as conversation. TESOL Quarterly 23, 489–508.
Walker, C. 1990: Large-scale oral testing. Applied Linguistics 11, 200–19.
Weir, C.J. 1983: Identifying the language needs of overseas students in tertiary education in the United Kingdom. Unpublished PhD thesis, University of London.
—— 1993: Understanding and developing language tests. Hemel Hempstead: Prentice Hall.
Wigglesworth, G. 1997: An investigation of planning time and proficiency level on oral test discourse. Language Testing 14, 85–106.
Wigglesworth, G. and O’Loughlin, K. 1993: An investigation into the comparability of direct and semi-direct versions of an oral interaction test in English. Melbourne Papers in Language Testing 2, 56–67.
Young, R. 1995: Conversational styles in language proficiency interviews. Language Learning 45, 3–42.
Young, R. and Milanovic, M. 1992: Discourse variation in oral proficiency interviews. Studies in Second Language Acquisition 14, 403–24.
Validating speaking-test tasks
Appendix 1 Items included in initial draft checklists (with short gloss)
Participants; Make excuses; Terminate; Conversational repair; Summarize; Complain; Paraphrase; Persuade; Change topic; Challenge; Qualify; Ask for info; Suggest; Narrate; Reciprocate; Analyse; Elaborate; Initiate; Provide nonpersonal information; Explain; Justify opinions; Negotiate meaning; Decide; (Dis)agree; Justify/Support; Ask for opinions; Express preferences; Speculate; Compare
Provide nonpersonal information; Express opinion
Appendix 3 Operational checklist (used in Phase 3)
Notes: The figures indicate the number of students that complete the task in each case. L: Little agreement; S: Some agreement; G: Good agreement. For Tasks 3 and 4 in the first tape observed, the maximum was 9; for all others the maximum was 12. This is because 3 of the 12 MA students did not complete these last two tasks. This was not a problem during the observation of the second tape, so all the maximum figures for that tape are 12.
Appendix 5 Transcript results and observation checklist results
Notes: T indicates that this function has been identified as occurring in the transcript of the interaction. L, S and G indicate the degree of agreement among the raters using the checklists in real time (L: Little agreement; S: Some agreement; G: Good agreement).