The Parsed Corpus of Early English Correspondence
The Parsed Corpus of Early English Correspondence is a syntactically annotated version of 2.2 million words of the Corpus of Early English Correspondence (created by the Sociolinguistics and Language History project team at the Department of English, University of Helsinki). It includes 84 letter collections, consisting of 4790 letters dating from 1410 to 1695. The corpus is annotated with the grammatical and sociolinguistic information necessary for extensive (socio-)linguistic analysis. It can be searched automatically for abstract grammatical structures (such as relative clauses, subject-verb inversion, and expletive subjects) as well as for words and strings of words, allowing quick and easy access to the data needed to investigate virtually any aspect of the language of the period. Each sentence is accompanied by searchable information on the writer and recipient (name, gender, relationship to sender/receiver, date of birth, age at time of writing) and on the letter (date, authenticity), allowing sociolinguistic investigations of the type commonly carried out on modern languages. Moreover, the genre of the corpus, personal letters, yields language closer to the spoken idiom and thus supplies a valuable corrective to work based on the more usual literary data. As part of a series of annotated corpora that together cover the entire history of English, the corpus can also be used to study long-term changes in the language.