Tofu vs. Hotdog: Charting the Evolution of the American Diet Over Time With Google's Latest Gadget

Google Labs has just released the Google Books N-Gram Viewer, and it is fun!

I have to confess to not really understanding the fine mathematical detail behind the term "n-gram," but the resulting graphs simply show how frequently words occur within the Google Books database over time. You can see when and how fast individual words spike and subside in popularity, and compare how different words fare against each other, as a reflection of larger linguistic and social changes.
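For the curious, an n-gram is just a run of n consecutive words: "tofu" is a 1-gram, "hot dog" a 2-gram. The Python sketch below shows one plausible way to turn that idea into a frequency-over-time figure, by dividing the number of times a phrase appears in a given year by the total number of same-length n-grams from that year. It is a rough illustration only, not Google's actual pipeline; the function names and the tiny corpus are invented for the example.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive words in text, lowercased."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def yearly_frequency(corpus, phrase):
    """Map each year to the share of that year's n-grams that equal phrase.

    corpus is a dict of {year: [text, text, ...]}.
    """
    phrase = phrase.lower()
    n = len(phrase.split())  # "hot dog" -> 2, so we count 2-grams
    result = {}
    for year, texts in sorted(corpus.items()):
        counts = Counter()
        for text in texts:
            counts.update(ngrams(text, n))
        total = sum(counts.values())
        result[year] = counts[phrase] / total if total else 0.0
    return result


# Hypothetical toy corpus, made up purely for illustration.
TOY_CORPUS = {
    1950: ["grill a hot dog for lunch", "a hot dog and a soda"],
    1990: ["fry the tofu gently", "a hot dog or tofu"],
}

if __name__ == "__main__":
    for word in ("hot dog", "tofu"):
        print(word, yearly_frequency(TOY_CORPUS, word))
```

Plotting those yearly shares for two phrases on the same axes gives the kind of comparison chart the Viewer draws.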

The San Jose Mercury News rounds up some examples of Google Books N-Gram Viewer uses and findings, from detecting trends in censorship to the pace at which the past is being forgotten:

Trends in censorship can be identified through counts of names. For example, Jewish artist Marc Chagall was mentioned just once in the entire German body of literature from 1936 to 1944, even as his prominence in English-language books grew roughly fivefold. Similar suppression is seen in Russian with regard to Leon Trotsky; in Chinese with regard to Tiananmen Square; and in the U.S. with regard to the "Hollywood Ten," a group of entertainers blacklisted in 1947.

Books forget our past faster with each passing year. The Harvard-Google team tracked the frequency with which each year from 1875 to 1975 appeared and found that references to the past decrease much more rapidly now than in the 19th century. References to "1880" didn't fall by half until 1912—a lag of 32 years—but references to "1973" reached half their peak just a decade later, in 1983.


The Google engineers and Harvard linguists behind the project have described their approach and the resulting insights as a new science of "culturomics," which they define as "the application of high-throughput data collection and analysis to the study of human culture." As an example, the N-Gram Viewer landing page shows a pre-generated chart comparing "tofu" and "hot dog" occurrences in books published since 1920 (both increase, but tofu races past hot dog in the early 1980s).

Thus encouraged, I compared all sorts of food words to see what culturomics has to teach us about the American diet in print. The slideshow above is just the tip of the iceberg. Be warned: the N-Gram Viewer is extremely addictive.