Superlinguo

For those who like and use language

Fun times are on the up.

I’m not a corpus linguist, but I love playing with different corpora when they’re presented in accessible and fun ways - so I was thrilled when Claire Hardaker tweeted about the NYT Chronicle, a way to visualise the language used across the newspaper’s history.

Like Google’s n-gram corpus, it presents a nice clear chart. It has some advantages over n-gram: for example, the NYT corpus is completely up to date, while Google’s gets sketchy for contemporary references. Compare NYT drone to n-gram drone and you’ll see the NYT data kicks up swiftly just where the Google data ends.

There are obviously biases in this data too. For one, there’s a bias towards American spelling that isn’t as pronounced in the Google Books corpus. The genre represented is also fairly narrow.

I found a nice use for it the other day while listening to a This American Life podcast that talked about “the meat question”: a period in the late 19th and early 20th century when the USA was unsure it would have enough viable agriculture to feed its population and looked at alternative sources of meat (including, most famously, hippopotamus). The NYT Chronicle has a nice couple of spikes in usage of this phrase when the issue was most pressing (and therefore made it into the news), while the Google Books usage is more diffuse, as people wrote books in the aftermath; books are a less immediate corpus than newspapers.

This may not become my default go-to tool, but it’s nice and simple and makes a great point of comparison to n-gram. Thanks Claire for sharing!
