Thursday, January 05, 2012

Idea for a Corpus-Like Assignment investigating variations in English(es)

1. Start with Paul Brians' classic website "Common Errors in English."

2. Identify some potential "semantic minimal pairs" (a term I thought I just invented but apparently did not) that seem interesting, like "good-by/good-bye" or "gray/grey," or some more obscure and/or clearly "errory" ones like "easedrop" vs. "eavesdrop."

(Note: this is pretty subjective and you have to already know a lot about English to really choose interesting ones.)

3. Find a corpus, courtesy of Brian Lee's list of "Freely Accessible Online Corpora of English."

4. Learn how to search it, use it, etc.

5. Search for the stuff you found in #2.

6. Do some kind of cool analysis of what you find, such as comparing how often each variant in a pair shows up.

(#6 needs some work.)
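
One possible starting point for #6, as a rough sketch only: if the corpus (or a slice of it) can be saved as plain text, a few lines of Python could count each variant and report what share of a pair's total each spelling takes. Everything here is an assumption for illustration: the names `PAIRS`, `count_variants`, and `variant_shares` are made up, the pairs are just the examples from #2, and the matching is exact and whole-word, so inflected forms like "eavesdropping" are not counted.

```python
import re
from collections import Counter

# Hypothetical variant pairs, taken from the examples in step 2.
PAIRS = [("gray", "grey"), ("good-by", "good-bye"), ("easedrop", "eavesdrop")]


def count_variants(text, pairs=PAIRS):
    """Count whole-word, case-insensitive occurrences of each variant."""
    # A token is a run of letters, optionally joined by internal
    # hyphens or apostrophes, so "good-bye" stays one token.
    tokens = re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower())
    counts = Counter(tokens)
    return {pair: tuple(counts[word] for word in pair) for pair in pairs}


def variant_shares(text, pairs=PAIRS):
    """Return each variant's share of its pair's total, or None if neither occurs."""
    shares = {}
    for pair, (first, second) in count_variants(text, pairs).items():
        total = first + second
        shares[pair] = None if total == 0 else (first / total, second / total)
    return shares


if __name__ == "__main__":
    # Stand-in text; in practice this would be read from a corpus file.
    sample = "The grey cat and the gray dog said good-bye. Grey skies."
    for pair, counts in count_variants(sample).items():
        print(pair, counts)
```

Running the same counts over two corpora (say, one British and one American) and comparing the shares would be one simple way to see variation across Englishes. Raw counts from small samples should be read with caution.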

