Text Mining of Television Transcripts

I am looking for a collaborator with computational linguistics skills for a project mining the dialogue of the U.S. television program Supernatural (CW Network, 2005-present). My goal is to demonstrate, through textual analysis, the originality of the dialogue, the breadth of words and phrases used by the writers, the way language is used to distinguish characters and reveal character traits, etc. The product of this project will be an article for publication in a peer-reviewed venue.

A chapter that I've written about my exploration of this project thus far is forthcoming in Digital Humanities in the Library: Challenges and Opportunities for Subject Specialists (Chicago: Association of College and Research Libraries), March 2015. That chapter documents my process of creating the corpora from fan-created transcripts and of testing and selecting concordance tools, and it gives examples of the type of results these efforts will produce. It also discusses the limitations of examining only the dialogue in a visual medium and my own limitations as a non-linguist.
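
For illustration only, the following is a minimal Python sketch of two measures of the kind a concordance tool produces: a per-character type-token ratio (one rough indicator of vocabulary breadth) and a keyword-in-context listing. It is not the method used in the chapter. It assumes a hypothetical transcript layout in which each line of dialogue begins with the speaker's name in capitals followed by a colon; the actual fan-created transcripts may be laid out differently. The function names and the sample lines are invented for this example.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) dialogue layout: "SPEAKER: words spoken".
# Lines that do not match, such as stage directions, are skipped.
SPEAKER_LINE = re.compile(r"^([A-Z][A-Z .'-]*):\s*(.+)$")
TOKEN = re.compile(r"[a-z']+")


def tokens_by_speaker(lines):
    """Group lowercased word tokens by speaker name."""
    words = defaultdict(list)
    for line in lines:
        match = SPEAKER_LINE.match(line.strip())
        if match:
            speaker, speech = match.groups()
            words[speaker].extend(TOKEN.findall(speech.lower()))
    return words


def type_token_ratio(tokens):
    """Distinct words divided by total words (0.0 for an empty list)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def kwic(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# Invented sample lines, not quotations from the program.
sample = [
    "SAM: I read the book and the book was old.",
    "[They enter the library.]",
    "DEAN: Nice book.",
]

by_speaker = tokens_by_speaker(sample)
for speaker, toks in by_speaker.items():
    print(speaker, len(toks), round(type_token_ratio(toks), 2))
print(kwic(by_speaker["SAM"], "book", width=2))
```

The type-token ratio falls as a text gets longer, so comparing characters who speak very different amounts would call for equal-sized samples or a length-corrected measure.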

My hope is that a partner with the skills I lack will be able to help me with linguistic concepts as well as determine (1) whether there is a way to codify non-verbal action and communication for analysis and (2) whether it would be useful to encode the text for analysis. Interest in or familiarity with Supernatural is a plus.

I am an academic librarian and Associate Professor at the University of Oklahoma with a long history of publishing scholarly work. My CV can be found at https://ou.academia.edu/LiorahGolomb.

Collaboration

Kinds of collaborators: Individual/small group, Faculty, Librarians
Help description: Text mining, computational analysis of text
Help needed: Yes

arts-humanities.net

Source material: UTF-8 files created from transcripts
