Let's say you've found a new word. You've looked it up in a dictionary and found an example sentence or two. Perhaps you've also found what other words it collocates with. (I'll be looking at some online tools for finding collocates in another post.) What you'd like to do now is to see how it's used in context.
One possibility is to use a corpus ("a collection of samples of real-world texts stored on computer. Plural - corpora" - Leoxicon), but these can sometimes be difficult to use, and when they include spoken language, the grammar is occasionally "non-standard", let's say. The British National Corpus is easy to use, but be careful with examples from spoken language.
Another way is to do a simple Google (or other) search. The Internet is one enormous corpus if you think about it, although no linguist has "collected" these examples. But a simple search can bring up a lot of irrelevant material, and again you're not really assured of grammatical correctness.
What I like to do is a Google site search of trusted newspapers and other websites, which are in effect small corpora, or look in Google Books, where the material has been edited and proofread, so is likely to be grammatically correct.
With a Google site search, you put in your search term (which I like to put in inverted commas so that it only finds these words when they appear together), followed by site: and the address of the website (without http://). So if I wanted to find examples of "highly unlikely" on the Guardian website, I'd enter:
- "highly unlikely" site:www.guardian.co.uk
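If you'd rather build these searches in a script than type them by hand, here is a minimal Python sketch of the same idea. The function name `site_search_url` is my own invention for illustration, and I'm assuming the usual `https://www.google.com/search?q=...` address for a Google search; it simply wraps the phrase in inverted commas, adds site: and the address, and encodes the result as a link.

```python
from urllib.parse import urlencode


def site_search_url(phrase: str, site: str) -> str:
    """Build a Google search link for an exact phrase on a single website.

    Hypothetical helper for illustration only; it is not part of any tool
    mentioned in this post.
    """
    # The site: operator takes the bare address, so drop any http(s):// prefix.
    for prefix in ("https://", "http://"):
        if site.startswith(prefix):
            site = site[len(prefix):]
    site = site.rstrip("/")

    # Inverted commas make Google look for the words together, in this order.
    query = f'"{phrase}" site:{site}'

    # urlencode escapes the quotes, spaces and colon for use in a URL.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(site_search_url("highly unlikely", "www.guardian.co.uk"))
# https://www.google.com/search?q=%22highly+unlikely%22+site%3Awww.guardian.co.uk
```

Pasting the printed link into a browser should give the same results as typing the search into Google yourself.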
To make things easier, I've put together a simple tool to look up words and expressions on various newspaper sites, etc. Just enter a word or expression into the Entry Box and click on one of the links. (Try it with the examples). I'll probably add some more sites to it later.
A note on books: clicking on Google Books searches all books digitised by Google. There is also a facility for doing an advanced search of Google Books here. You can, for example, choose to search only modern books. Project Gutenberg is a digital collection of out-of-copyright books, so it has all the classics but few modern books.
British quality press | British tabloids | American press |
The Guardian | The Daily Mail | New York Times |
The Independent | The Daily Express | Washington Post |
The Telegraph | The Mirror | Herald-Tribune |
The Times | The Sun | Chicago Tribune |
The Financial Times | | San Francisco Chronicle |
The Economist | | Los Angeles Times |
| | Miami Herald |
| | Wall Street Journal |
| | Huffington Post |
| | Time Magazine |
Enter expression:
Broadcasters
BBC | BBC News |
Magazines
National Geographic | Geographical (UK) | Discover Wildlife (BBC) |
History Today | History Extra (BBC) | History Channel |
New Scientist | Scientific American | Science Focus (BBC) |
Books
Project Gutenberg | Google Books |
Links
- Leoxicon - a blog dedicated to the use of corpora in the teaching of English to foreign students.