In the rap community, being the best wordsmith in a technical sense is a point of pride. As Busta bragged way back in the day, “Vocabulary’s necessary / when digging into my library.” The programmer Matt Daniels has explored this in detail in his own article here, where he calculated the vocabulary size for dozens of different rappers. Inspired by his approach, I’ve built on his idea of measuring the size of a rapper’s unique vocabulary and tried to extend it logically to measure a rapper’s skill in a more meaningful, direct way.
This post is not meant to point out what Daniels missed, or ways that his search could be improved. His investigation set out to measure a rapper’s vocabulary size, and did so both admirably and expertly. As Daniels writes:
“In short, take all of this shit with a grain of salt. Think of this as a data-point that sparks interesting discussion about hip hop and word-usage, and absolutely not a conclusive argument for rapper x is better than rapper y.”
2. Their average word length in characters;
3. Their unique vocabulary size.
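For the curious, here is a minimal sketch of how the last two quantities could be computed from a block of lyrics. It is not the exact code behind my numbers: the function names (`tokenize`, `vocab_stats`), the tokenizing rule, and the choice to average word length over every word rather than over unique words are all assumptions made for illustration.

```python
import re

def tokenize(lyrics):
    """Split lyrics into lowercase words (hypothetical tokenizing rule)."""
    # Normalize curly apostrophes so "Vocabulary’s" stays a single token.
    text = lyrics.lower().replace("’", "'")
    raw = re.findall(r"[a-z']+", text)
    # Drop stray quote marks around words, and skip tokens that were only quotes.
    return [w.strip("'") for w in raw if w.strip("'")]

def vocab_stats(lyrics):
    """Return unique vocabulary size and average word length in characters."""
    words = tokenize(lyrics)
    if not words:
        return {"unique_vocabulary": 0, "avg_word_length": 0.0}
    # Assumption: the average is taken over all words, repeats included.
    avg_len = sum(len(w) for w in words) / len(words)
    return {"unique_vocabulary": len(set(words)), "avg_word_length": avg_len}

stats = vocab_stats("The cat, the hat")
print(stats)
```

Averaging over unique words instead would weight a rare long word the same as a common short one, so the two choices can rank artists differently.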
I’ve also included Wale, because he’s right around the average for all 193 of these artists. Chief Keef is there because he’s at the very low end of the spectrum.
Comparing the two, we can see that this list differs from Daniels’ own ranking of who has the biggest vocabulary. For instance, Nas, number 17 on Daniels’ list, is no longer inside the top 20; he’s fallen to number 46. Meanwhile, Ma$e, who doesn’t appear on Daniels’ graph at all, is now number 20.
Why is there this difference, and why do I think it matters?
To answer this, let’s look at how a rapper who ranks highly in Daniels’ article fares in my own analysis. Nas, for example, has a unique vocabulary of 5,096 words according to Daniels’ chart. Being very familiar with this NYC MC, I find that this largely matches what I would expect of him. However, I would say that the quality of Nas’ 5,000 or so words is very different from the quality of Wu-Tang’s own body of 5,895 words. The Staten Island Clan, dealing as they do with subjects that border on the occult and its initiates, have a much wider range of expressive material, both musically and poetically, available to them. This is in contrast to a once-gangsta rapper like Nas, whose eye level, even having transformed between Illmatic and Life Is Good, has never been far from the cloistered street life.
What separates Nas from Wu-Tang is just how complex the words in their respective vocabularies are. I’m taking “complex” here to refer largely to the length of a word in characters and to the semantic meaning of that word. I consider those two qualities, length and definition, to be related, since, up to a certain point, the longer a word is, the more complex it is. (This can only remain at the level of induction by enumeration, but consider any 2-letter word: am, is, it, lo, etc. Those words are short, and exist for almost purely syntactic reasons. Meanwhile, the longest reasonable word to be found in dictionaries, deinstitutionalisation, has a definition consisting of 26 words.) It is this gap between vocab size and true mastery of words, then, that what I’m calling a “wordiness” metric tries to account for.
And voila! There he is at the top of the list.