A corpus-based study of speech, thought and writing presentation in contemporary spoken British English

The Lancaster Speech, Writing and Thought Presentation Spoken Corpus was built as part of an AHRB-funded project to investigate the nature of speech, writing and thought presentation (SW&TP) in contemporary spoken British English. The corpus contains approximately 260,000 words and was tagged manually using a speech, writing and thought presentation tag set. The tag set is based on the one first put forward in Leech and Short (1981) 'Style in Fiction' (Longman), further developed in our project to annotate a corpus of written British English (1995-9), and described in Semino and Short (2004) 'Corpus Stylistics: Speech, Writing and Thought Presentation in a Corpus of English Writing' (Routledge).

The corpus consists of 120 samples of discourse-presentation-rich data, drawn in equal numbers from the spoken part of the British National Corpus (free conversation) and from a set of interviews held in the oral history archive at the Centre for North West Regional Studies at Lancaster University. The extracts are balanced for the sex and age of the speakers.

The corpus has been deposited with the Oxford Text Archive, and we are now beginning analysis to compare the findings with those for the written corpus.

Principal investigator
Professor Mick Short
Principal project staff
Dr Elena Semino; Professor Mick Short; Professor Tony McEnery
Start date
Monday, October 1, 2001
Completion date
Saturday, February 1, 2003
Source material
Digitised sound files from the spoken part of the BNC (60 extracts of free conversation, normalised for sex and demographics) and digitised sound files from the oral history archive in Lancaster University's Centre for North West Regional Studies (CNWRS; 60 extracts, normalised for sex). The sound files came with transcriptions, which we made more accurate before annotation began. The transcriptions were then annotated for categories of speech, thought and writing presentation.
Publications

Most of the publications below arise from the written corpus project. The only publication so far produced from the spoken corpus project is McIntyre, D., Bellard-Thomson, C., Heywood, J., McEnery, A., Semino, E. and Short, M. (2004).

Authored books

Semino, E. and Short, M. (2004) Corpus Stylistics: Speech, Writing and Thought Presentation in a Corpus of English Writing. London: Routledge.

Journal articles

McIntyre, D., Bellard-Thomson, C., Heywood, J., McEnery, A., Semino, E. and Short, M. (2004) 'Investigating the presentation of speech, thought and writing in spoken British English: a corpus-based approach.' International Journal of Corpus Linguistics.

Semino, E., Short, M. and Culpeper, J. (1997) ‘Using a corpus to test and refine a model of speech and thought presentation.’ Poetics 25. 17-43.

Semino, E., Short, M. and Wynne, M. (1999) ‘Hypothetical words and thoughts in contemporary British narratives.’ Narrative 7:3. 307-34.

Short, M., Semino, E. and Wynne, M. (1997) ‘A (free direct) reply to Paul Simpson’s discourse.’ Journal of Literary Semantics 26:3. 219-28.

Short, M., Semino, E. and Wynne, M. (2002) ‘Revisiting the notion of faithfulness in discourse report/(re)presentation using a corpus approach.’ Language and Literature 11:4. 325-55.

Short, M., Wynne, M. and Semino, E. (1998) ‘Reading reports: discourse presentation in a corpus of narratives, with special reference to news reports.’ Anglistik & Englischunterricht. 39-65.

Book chapters

Leech, G., McEnery, A. and Wynne, M. (1997) ‘Further levels of annotation.’ in Garside, R., Leech, G. and McEnery, A. (eds) Corpus Annotation. London: Longman. 85-101.

Short, M. (forthcoming) ‘A corpus-based approach to speech, thought and writing presentation.’ in Wilson, A., Rayson, P. and McEnery, A. (eds) Corpus Linguistics by the Lune: A Festschrift for Geoffrey Leech. Frankfurt/Main: Peter Lang.

Short, M. (2001) ‘Revisiting the notion of faithfulness in discourse: report/(re)presentation using a corpus approach.’ in Biermann, I. and Combrink, A. (eds) Poetics, Linguistics and History: Discourses of War and Conflict. Potchefstroom University Press: Potchefstroom, South Africa.

Short, M., Semino, E. and Culpeper, J. (1996) ‘Using a corpus for stylistics research: speech and thought presentation.’ in Short, M. and Thomas, J. (eds) Using Corpora in Language Research. London: Longman. 110-31.

Short, M., Wynne, M. and Semino, E. (1999) ‘Reading reports: discourse presentation in a corpus of narratives, with special reference to news reports.’ in Diller, H. J., Otto, E. and Stratmann, G. (eds) English via Various Media. Heidelberg: Universitätsverlag C. Winter. 39-66.

Wynne, M., Short, M. and Semino, E. (1998) ‘A corpus-based investigation of speech, thought and writing presentation in English narrative texts.’ in Renouf, A. (ed.) Explorations in Corpus Linguistics. Amsterdam: Rodopi. 231-45.

Conference proceedings

McIntyre, D., Bellard-Thomson, C., Heywood, J., McEnery, A., Semino, E. and Short, M. (2003) 'The construction of a corpus to investigate the presentation of speech, thought and writing in written and spoken British English.' in Archer, D., Rayson, P., Wilson, A. and McEnery, A. (eds) Proceedings of the Corpus Linguistics 2003 Conference. Lancaster University: UCREL Technical Papers 16. 513-23.