GeekCLE Science Monday is a regular feature exploring the relationship between science and geek culture.
Words are a funny thing. They are powerful enough to drag men to war or rebuild communities. They create faraway worlds from nothing and help us define who we were, who we are and who we wish we could be. However, say the same word over and over and you have a phenomenon called “semantic satiation” in which the repeated word loses all meaning. It’s been said that the only universal language is mathematics, and perhaps that’s true for those who are fluent in it.
Without a doubt, numbers, whether they be data points, binary code or equations that attempt to capture how time and space interact, make it easier for us to communicate concepts, or at least that theory seems to hold up for the scientific community. Before there was statistical analysis of semantic relationships, there was Linnaeus’ binomial classification system, and before that the best classification systems relied on impossibly long polynomial Latin names. Before that, it was mostly just a hot mess, with the Greeks and Romans trying to make sense of local dialects and traditions.
Take something as simple as the tomato. In Spanish and French that little round red fruit is called “tomate”, in German it is “Tomate” and if you’re fluent in Swahili you call it “nyanya”. Imagine being a budding taxonomist: which word becomes the universal representation for this hardy little plant in your classification system? Sometime in the late sixteenth century, naturalists settled on Latin as the universal language of scholarship and scientific endeavors and classified the tomato as Solanum caule inermi herbaceo, foliis pinnatis incisis or “the solanum with the smooth stem, which is herbaceous and has incised leaves” for short.
Linnaeus would later come along and blow minds with his binomial classification system, which shortened the scientific names of organisms to just genus and species, making our favorite red fruit Solanum lycopersicum. Luckily, modern classification systems have stuck to Linnaeus’ two-name system, even as debates continue to rage over which categories these two-named organisms belong in. So, what can science do when there is debate over the meaning of words like “geek” and “nerd”? How do we classify a geek versus a nerd? Create a scatterplot, of course.
And that’s exactly what software engineer Burr Settles did one Saturday. Settles defined “geek” as applying to individuals who are “enthusiasts about a particular topic” and are “collection oriented”. In contrast, a “nerd” is an “intellectual” particularly enthusiastic about a single field or topic but much more “achievement oriented”. Curious to see if he could find support for his definitions of what constitutes a geek vs. a nerd, he combed Twitter looking for the semantic relationships between words. That old adage “guilt by association” or “you’re known by the company you keep” is what’s important here. If part of Settles’ definition of “geek” includes references to comic book collections or cosplay, hypothetically those words would more often be paired with “geek” than with “nerd”. Using a pointwise mutual information (PMI) model, Settles found some pretty interesting correlations between word pairs, indicating that there could be some weight to his theory. For those who are not totally riveted by statistical models, PMI measures how much more often two words show up together than you would expect if they appeared independently of each other. A high score means the words keep close company; a score near zero means they are more or less strangers. (For all the nitty-gritty details please read his witty and engaging post here: http://slackprop.wordpress.com/2013/06/03/on-geek-versus-nerd/).
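For those who are riveted by statistical models, here is a minimal Python sketch of the PMI idea. It is not Settles’ code, and his post should be consulted for what he actually did. The function name `pmi_scores`, the choice to count co-occurrence per tweet, and the four “tweets” are all assumptions made up for illustration. The score used is the standard definition, the log of p(word, target) divided by p(word) times p(target).

```python
import math
from collections import Counter


def pmi_scores(documents, target):
    """Return {word: PMI(word, target)} in bits.

    Assumption: two words "co-occur" when they appear in the same document
    (here, the same tweet). Words that never appear alongside the target are
    left out, since their PMI would be negative infinity.
    """
    n_docs = len(documents)
    # Treat each document as a set of lowercase words (presence, not frequency).
    doc_sets = [set(doc.lower().split()) for doc in documents]

    # Number of documents containing each word.
    word_counts = Counter(w for words in doc_sets for w in words)
    # Number of documents containing both the word and the target.
    joint_counts = Counter(
        w for words in doc_sets if target in words for w in words if w != target
    )

    p_target = word_counts[target] / n_docs
    scores = {}
    for word, joint in joint_counts.items():
        p_word = word_counts[word] / n_docs
        p_joint = joint / n_docs
        scores[word] = math.log2(p_joint / (p_word * p_target))
    return scores


# Hypothetical toy data, invented for this example (not real tweets).
toy_tweets = [
    "geek comics collection",
    "geek cosplay gaming",
    "nerd physics homework",
    "nerd chess gaming",
]

geek = pmi_scores(toy_tweets, "geek")
nerd = pmi_scores(toy_tweets, "nerd")

for word in sorted(set(geek) | set(nerd)):
    print(word, geek.get(word), nerd.get(word))
```

In this toy data, “comics” only ever appears with “geek”, so it scores above zero for “geek” and gets no score at all for “nerd”, while “gaming” appears equally with both and scores zero for each. Plotting the “geek” score against the “nerd” score for every word is roughly the scatterplot idea described above.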
So, long story short, Settles found some statistically significant data that suggests there may be something to his definitions of the words “geek” and “nerd”. That being said, one of the primary difficulties in any classification system is that life is really unbelievably messy at times. Living organisms rarely fall neatly into classifications that sometimes hinge on completely arbitrary assignments. Geeks can be nerds, nerds can be geeks and there’s a whole spectrum in between. Borders are constantly being redrawn and it’s not uncommon for definitions to change. The word “geek” used to refer to sideshow freaks, but several John Hughes movies later, the word was given new life. With every new piece of data, we get a glimpse of an evolution in progress. And there is almost always room for argument.
So, what do you think? Does Settles’ data support his definition of what constitutes a geek?