Linguistics

British Academic Spoken English (BASE) corpus
The project enhances the British Academic Spoken English (BASE) corpus, which functions as a companion to the Michigan Corpus of Spoken Academic English (MICASE), a record of North American academic speech.

The Leeds Archive of Vernacular Culture
The aim of the Leeds Archive of Vernacular Culture (LAVC) project is to unlock the full research potential of the large and varied archive of the University's former Institute of Dialect and Folk Life Studies, now stored in the Brotherton Library’s Special Collections, by creating an innovative electronic resource.

The St Albans Psalter on the Web
The project aims to digitise the St Albans Psalter and place it on the web. The images are accompanied by a complete transcription and translation (Latin into both English and German). Each image has a page-by-page commentary, and the manuscript is amplified by about 40,000 words of accompanying essays.
Aims: to make the psalter available in colour.
Research questions: to understand how the manuscript was made, when, for whom, and why the range of images was chosen.

A corpus-based study of speech, thought and writing presentation in contemporary spoken British English
The Lancaster Speech, Writing and Thought Presentation Spoken Corpus has been built as part of an AHRB-funded project to investigate the nature of speech, writing and thought presentation (SW&TP) in contemporary spoken British English.

Dictionary of the Scots Language
The aim of this project was to create the Dictionary of the Scots Language, an electronic scholarly dictionary covering the Scots language from 1200 to the present. This was successfully completed and published on-line, and serves students of Scottish language, literature and culture around the world. With limited resources and in the short time-scale of three years, the project undertook to digitise and publish in searchable form on the Internet all 11 volumes of the Dictionary of the Older Scottish Tongue and the 10 volumes of the Scottish National Dictionary.

French interlanguage oral corpora
First language (L1) acquisition research has used digital technologies for over 20 years, in the shape of the CHILDES system, a powerful suite of software tools for the transcription, analysis and storage of L1 oral learner data that is now used as standard. The field of second language (L2) acquisition research, by contrast, has been very slow to take advantage of the computerised technologies now available.

The Parsed Corpus of Early English Correspondence
The Parsed Corpus of Early English Correspondence is a syntactically annotated version of 2.2 million words of the Corpus of Early English Correspondence (created by the Sociolinguistics and Language History project team at the Department of English, University of Helsinki). It includes 84 letter collections, consisting of 4790 letters dating from 1410 to 1695. The corpus is annotated with the grammatical and sociolinguistic information necessary for extensive (socio-)linguistic analysis.

ICTGuides
The ICTGuides project is now incorporated within this project (arts-humanities.net).
Two developments gave birth to the ICTGuides database: an increase in the use of ICT in arts and humanities research, and an awareness that information on how ICT is used in that research is not readily available online. The resulting disparity was largely seen to have detrimental effects on ICT-based scholarship, since sharing computational expertise among scholars is a precursor to promoting innovation within the field.

Scottish Corpus of Texts and Speech (SCOTS)
SCOTS uses computer technology and the web to bring a unique electronic collection of Scots and Scottish English texts to scholars and the public. The resource contains written and spoken material, the latter with online audio/video clips, stored in a database along with extensive metadata. Linguists can investigate where particular words and phrases are used, and by whom. Displayed alongside the texts is a range of information about authors and speakers, so that it is possible to search for, e.g., “audio clips featuring Ayrshire women under 40”.

Generic tools for linguistic annotation and web-based analysis of literary Sumerian
The GATE/ETCSL project is based at the University of Sheffield and involves collaboration with the Oriental Institute at the University of Oxford.