I recently stumbled across the lab page of a faculty member who used a word cloud built from his research papers to display the key words that come up most often in his own work. I thought this was a rather interesting way to get an objective view of one's research, so I started playing with the idea for my own work. Unfortunately, this turned out to be more difficult than I expected. Building a word cloud is not hard (there are plenty of available tools for that), but extracting the text from the PDFs of all of my publications led to a lot of weird biases and errors (this is a PDF issue), and it’ll take a lot more effort to dig up the original raw text documents than I’m willing to go through right now.
It then occurred to me, however, that we could use the same approach on the fiddler name data to visualize how often each name appears in the literature.
First we have the occurrences of binomial/compound names in the literature. The frequencies are based on the number of publications in which each name is used as a valid name (thus, a paper which states that name A is a junior synonym of name B would count only the senior synonym B and not the junior synonym A). No matter how many times a name is used within a paper, it counts as only one occurrence for this exercise.
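The counting rule above can be sketched in a few lines of Python. This is only an illustration, not the code behind the site: the record layout, the `publication_counts` function, and the sample data are all made up for the example.

```python
from collections import Counter

# Hypothetical records: (publication id, name as printed, used as a valid name?)
records = [
    ("pub1", "Name B", True),
    ("pub1", "Name B", True),   # repeated use within the same paper
    ("pub1", "Name A", False),  # listed only as a junior synonym of Name B
    ("pub2", "Name B", True),
    ("pub2", "Name C", True),
]

def publication_counts(records):
    """Count, for each name, the publications that use it as a valid name."""
    # A set of (publication, name) pairs makes each paper count once per name.
    seen = {(pub, name) for pub, name, valid in records if valid}
    return Counter(name for _, name in seen)

counts = publication_counts(records)
```

The resulting name-to-count mapping is the kind of frequency table that most word cloud tools accept as input.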
The results are pretty much what one would expect, but the cloud does provide a somewhat interesting (if not particularly statistical) rendering of the relative use of each name.
As with other parts of the site, we can do the same thing with the specific names only (ignoring genera and lumping alternate spellings and misspellings). Again, the major names are what one would expect.
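A rough sketch of the lumping step, again in Python and again only an assumption about how it might be done: the `VARIANTS` table, the `specific_name_counts` function, and the placeholder names are hypothetical, and the sketch assumes that a paper using two spellings of the same specific name still counts once.

```python
from collections import Counter

# Hypothetical table mapping alternate spellings/misspellings to one form
VARIANTS = {"albus": "alba"}

# Hypothetical valid uses: (publication id, binomial as printed)
uses = [
    ("pub1", "Genusone alba"),
    ("pub1", "Genustwo alba"),   # same specific name under a different genus
    ("pub2", "Genusone albus"),  # alternate spelling, lumped with "alba"
    ("pub3", "Genusone nigra"),
]

def specific_name_counts(uses, variants):
    """Count publications per specific name, ignoring the genus."""
    seen = set()
    for pub, binomial in uses:
        # Treat the last word of the binomial as the specific name.
        specific = binomial.split()[-1].lower()
        seen.add((pub, variants.get(specific, specific)))
    return Counter(name for _, name in seen)

specific_counts = specific_name_counts(uses, VARIANTS)
```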
I may need to play with the visualization a bit (color schemes, shape, etc.), but these images will be added to the name summary part of the website in the next monthly release.