Google Labs: word frequency in books over the last 200 years (ngrams.googlelabs.com)
I was surprised to see the high popularity of the word "fuck" prior to 1820
Even if we assume every pre-1800ish mention of 'fuck' is really a misread 'suck', the word still features much more prominently in the corpus before then than after.
Any ideas why that might be? E.g. certain types of text that were more common before that era, or other (less, er, 'suck'y) types of text that came after, 'diluting' the corpus?
Potentially even more awesome is that they have the entire dataset available for download o_O
edit: case sensitivity is more fun than insensitivity: http://ngrams.googlelabs.com/graph?content=Star+Trek%2Cstar+... vs http://ngrams.googlelabs.com/graph?content=star+trek%2CStar+...
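For anyone poking at the downloadable data, here is a rough sketch of why the two graphs differ. The rows are made up for illustration and the (ngram, year, count) layout is an assumption, not the dataset's actual format; `totals` is a hypothetical helper.

```python
from collections import Counter

# Made-up rows for illustration only: (ngram, year, match_count).
# The real dataset's layout and numbers are not reproduced here.
ROWS = [
    ("Star Trek", 1990, 120),
    ("star trek", 1990, 15),
    ("Star Trek", 1991, 140),
    ("star trek", 1991, 30),
]

def totals(rows, fold_case=False):
    """Sum match counts per n-gram, optionally folding case first."""
    out = Counter()
    for ngram, _year, count in rows:
        key = ngram.casefold() if fold_case else ngram
        out[key] += count
    return out

sensitive = totals(ROWS)                    # "Star Trek" and "star trek" stay separate
insensitive = totals(ROWS, fold_case=True)  # both spellings merge into one key

print(sensitive)
print(insensitive)
```

Case-sensitive counting keeps the title-cased proper noun apart from the lowercase phrase, which is what makes the two graphs look so different.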
edit2: there are a whole bunch of geek-term bumps around and just after 1900. Anyone know why? E.g.: http://ngrams.googlelabs.com/graph?content=Star+Wars&yea...
http://ngrams.googlelabs.com/graph?content=smartphone&ye...
(Actually, "internet" also has a similar spike. I suspect some books are mislabeled in their dates.)
Perhaps weirder, "Woot": http://ngrams.googlelabs.com/graph?content=Woot&year_sta...
http://ngrams.googlelabs.com/graph?content=liberty&year_...
The long s survives in elongated form, and with an italic-style curled descender, as the integral symbol ∫ used in calculus; Gottfried Wilhelm von Leibniz based the character on the Latin word summa ("sum"), which he wrote ſumma. This use first appeared publicly in his paper De Geometria, published in Acta Eruditorum of June 1686,[2] but he had been using it in private manuscripts at least since 1675.[3]