Google Ngrams

Reblogged from: https://tekhnologic.wordpress.com/2014/11/06/google-ngrams-the-highs-and-the-lows/

“Google Ngram Viewer” can be used to check word frequency, look at parts of speech and collocations, and compare American and British English usage.

I enjoy using the Ngram viewer and I think it is a useful tool for teachers and students. It is a site that I have bookmarked for those occasions when I am not sure about a word.

Interesting Charts for IELTS or Business classes

If I type two words separated by a comma, for example:

love,hate

The Google Ngram Viewer produces a chart.

‘Love’ is the blue line, and ‘hate’ is the red line. Now we have an interesting chart we can examine and use to practice the language of explaining charts and graphs.

Source: http://books.google.com/ngrams
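
If you want to share a comparison like this with a class, a link can be handier than asking everyone to retype the search. Here is a minimal Python sketch that builds one. Please treat the details as assumptions: the parameter names (content, year_start, year_end, smoothing) are simply what appears in the browser address bar after a search, not a documented API, and the function name ngram_url is made up for this example.

```python
from urllib.parse import urlencode

# Address of the chart page; the query parameters below are assumptions
# taken from the links the viewer itself produces, not from official docs.
BASE_URL = "https://books.google.com/ngrams/graph"


def ngram_url(terms, year_start=1800, year_end=2000, smoothing=3):
    """Return a link that compares the given words or phrases."""
    params = {
        "content": ",".join(terms),  # e.g. "love,hate"
        "year_start": year_start,
        "year_end": year_end,
        "smoothing": smoothing,
    }
    # urlencode percent-encodes the commas and spaces for us.
    return f"{BASE_URL}?{urlencode(params)}"


print(ngram_url(["love", "hate"]))
```

Changing the list to ["global warming", "climate change"] gives a link for a phrase comparison in the same way.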

Students can discuss steady increases and rapid declines, a sharp rise and a dramatic fall. More importantly, however, we can use Ngrams to practice inference.

There are three things we can infer from this graph.

  1. People write about love much more than hate, which gives me hope.
  2. People wrote more about love in the past than they do today, though this may prove to be a false conclusion.
  3. There was a marked decrease in the number of times the word ‘love’ appeared in the written record in 1918 and 1940. A sobering thought as we approach Remembrance Day.

It is, however, an example of how the Ngram viewer can sometimes provide cultural and historical insights.

Comparing Word Frequencies

The Ngram viewer is designed to compare words and their frequency. This is useful for helping us to determine which word or phrase has become more common.

Let’s type in the following words:

global warming,climate change

The Google Ngram Viewer produces a chart.

Source: http://books.google.com/ngrams

Both terms seem to appear around 1985, and there doesn’t appear to be much difference until the mid-nineties, when there is a marked increase in the use of ‘climate change’.

The phrase ‘global warming’ always suggested an increase in temperature, whereas ‘climate change’ could include unusual weather patterns.

Teachers and students now have the tools to check which word is in common usage.

British vs. American English

Try typing:

colour,color

The Google Ngram Viewer produces a chart.

We can now see how the American English spelling came from almost nothing to dominate over the British spelling in terms of frequency.

However, we can also see in more detail which phrases are more popular in each variety of English.

Try typing:

at school, in school

Then change the ‘from the corpus’ box from ‘English’ to either ‘British English’ or ‘American English’. Running the search once with each corpus produces two charts.

We can see that both ‘at school’ and ‘in school’ are used in both varieties of English, but ‘at’ is more frequent in Britain and ‘in’ is more common in North America.

Parts of Speech

Words don’t always have the same job. ‘Love’ is both a noun and a verb. The Ngram viewer will count all instances of the word ‘love’ unless we tell it to specifically search for nouns or verbs.

Let’s type in the following:

effect_NOUN,effect_VERB,affect_VERB,affect_NOUN

The Google Ngram Viewer produces a chart.

Source: http://books.google.com/ngrams

By typing underscore + part of speech (_NOUN), we are able to separate words by their different functions. A complete list of tags is available on the Ngram viewer’s information page.
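
If you build these searches often, the "word + underscore + tag" pattern is easy to put together in a few lines of Python. This is only a sketch: the helper names tagged and pos_query are my own, and the tag list is what I understand the information page to show, so it is worth checking against that page before relying on it.

```python
# Part-of-speech tags as I understand them from the Ngram viewer's
# information page (an assumption; re-check the page for the current list).
KNOWN_TAGS = {"NOUN", "VERB", "ADJ", "ADV", "PRON", "DET", "ADP", "NUM", "CONJ", "PRT"}


def tagged(word, tag):
    """Join a word and a tag with an underscore, e.g. effect_NOUN."""
    tag = tag.upper()
    if tag not in KNOWN_TAGS:
        raise ValueError(f"unknown part-of-speech tag: {tag}")
    return f"{word}_{tag}"


def pos_query(words, tags):
    """Build a comma-separated search with every word/tag combination."""
    return ",".join(tagged(word, tag) for word in words for tag in tags)


print(pos_query(["effect", "affect"], ["NOUN", "VERB"]))
# effect_NOUN,effect_VERB,affect_NOUN,affect_VERB
```

The printed line can be pasted straight into the search box.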

The chart shows that ‘effect’ is usually used as a noun and ‘affect’ is usually used as a verb, and it shows how often each occurs in the written record.

However, the Ngram viewer doesn’t always account for human error. It’s important to be aware that the Ngram viewer is an analytical tool, not an intuitive one. Accuracy is discussed on the Ngram viewer’s information page.

Collocations

Let’s type in the following:

a bottle of *

The asterisk (*) represents a word that follows the phrase, and the Google Ngram Viewer produces a chart of the most common words associated with the phrase ‘a bottle of’.

Source: http://books.google.com/ngrams

‘A bottle of wine’ was the most common by far, but other drinks such as champagne, water, rum and whiskey are shown on the chart.

By searching for collocations we are able to put the phrases into more context than if we just searched for the word ‘wine’.
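
To give students a feel for what the asterisk is doing, the same idea can be tried on a small text of your own. The Python sketch below counts the words that directly follow a phrase and lists the most common ones. It is not how Google does it, only an illustration: the function name following_words and the sample sentences are made up for this example.

```python
import re
from collections import Counter


def following_words(text, phrase, top=10):
    """Count the words that come directly after `phrase` in `text`."""
    # \b stops the phrase from matching in the middle of a longer word.
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\s+([a-z']+)", re.IGNORECASE)
    counts = Counter(match.group(1).lower() for match in pattern.finditer(text))
    return counts.most_common(top)


# Made-up sentences, standing in for a real collection of texts.
sample = (
    "She ordered a bottle of wine. He brought a bottle of water. "
    "They shared a bottle of wine and later a bottle of champagne. "
    "A bottle of wine sat on the table."
)

print(following_words(sample, "a bottle of"))
# [('wine', 3), ('water', 1), ('champagne', 1)]
```

With a longer text, such as a graded reader, students can compare their own top ten with the chart.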

Combined Searches

The asterisk can be combined with parts of speech, too, so “*_NOUN” will find only the nouns that could appear in the sequence of words you’re searching on.

Now if you type “*_NOUN 's theorem” into the Ngram Viewer, you will see a graph with the ten most common names (which count as nouns) that have spawned eponymous theorems: names like Gödel, Bayes, and Euler.

A Final Thought

The Ngram viewer can be fun, it can be informative, and it can encourage you to think critically about vocabulary. It does have some limitations, but overall I think it is a useful tool to be able to refer to.

Thanks for taking the time to read this.

Take care!


Google’s Ngram viewer is best explained in a great TEDx video by two of its creators, Jean-Baptiste Michel and Erez Lieberman Aiden. Subtitles are available in over 30 languages if you download the video from TED.com. You can also read Google’s information page about the Ngram viewer.

Other Links

This isn’t the first post written about Google Ngram Viewer, and it probably won’t be the last. Here are some links you might be interested in.

Larry Ferlazzo talks about Chronicle, the NY Times’ version of Google’s Ngram Viewer. (24/6/2014)

NOTE: I didn’t realise it at the time, but Larry also produced a love/hate chart using Chronicle. It is interesting to compare the differences in the data representation.

Larry Ferlazzo’s collection of posts that discuss Google’s Ngram viewer. (17/12/2010)

____________________________________________

Reblogged from: http://www.theatlantic.com/technology/archive/2013/10/googles-ngram-viewer-goes-wild/280601/
