I can probably be most useful to Romantic-era scholars who are getting started in DH and then discover that the OCR quality of most texts is bad enough to make DH work very difficult. I have a Python script that is pretty good at cleaning up 19th-century OCR, and I could also share a collection of about 1,000 19th-century texts that I have already cleaned up (in collaboration with Jordan Sellers). These might provide a starting point for your project.