Friday, December 14, 2012

12-plus pluralizations

kw: words, grammar

At a time I can no longer recall, I developed a love for words, as words. One fruit of this is a collection of lists of words by parts of speech. Beyond that, I have classified those words that change form with use. In this post I will put on record how nouns are pluralized, and the various patterns used. The table shows my analysis of 31,619 nouns. An explanation follows.

  • Adding an "s": three-fifths of nouns work this way. It is the first spelling rule most children learn. Dog → dogs and cloud → clouds and shoe → shoes and monkey → monkeys work this way.
  • Nouns that have no plural, or are their own plural: This plus the "add s" group make up 89% of all nouns. If you think of "water" as two words, one of them adds s, the other has no plural. That is, "waters" as a reference to numerous bodies of water is a +s plural. "Water" as a substance has no plural. "Lore" is another example; it is a kind of virtual substance also. Chemical terms (listed separately below) are also not pluralized: A word like "oxygens" is meaningless except to chemists using shorthand for "oxygen atoms" in a chemical structure.
  • Changing final "y" to "ies": This is for -y words that have a consonant before the "y", such as curry → curries and ruby → rubies. I find it interesting that these outnumber the -es plurals.
  • Adding "es": In singular form most of these nouns end in a sibilant sound. Examples are wish → wishes and paradox → paradoxes. A smaller number end in "o": potato → potatoes, although an increasing number of writers—currently 0.1%—add only "s".
  • Changing final "man" to "men": The 205 words in this category are all compounds with -man, such as "spokesman" or "workman". Many of these are being superseded by words ending in "person".
  • Irregular plurals: The 141 I have collected make up a pretty comprehensive list. They include child → children and beau → beaux and foot → feet. The "x" ones are from French, and the others came to English from a variety of languages.
  • Changing final "um" to "a": This is one of those famous Latin rules. Certain common words are now almost unknown in singular form, such as agendum → agenda and datum → data. Few people know that "agenda" and "data" are plural. More people know cranium → crania and spectrum → spectra, though you do find "craniums" and "spectrums" in print these days.
  • Changing final "us" to "i": Who doesn't know octopus → octopi? This is the best known Latin rule, although "octopus" itself comes from Greek, so "octopi" is formed by analogy. Other examples are bacillus → bacilli and cactus → cacti and stimulus → stimuli.
  • Adding a final "e": Another Latin rule, only for nouns of feminine gender in Latin, such as larva → larvae and nebula → nebulae.
  • Changing final "is" to "es": A more obscure Latin rule. Examples are crisis → crises and analysis → analyses. The knowledge of oasis → oases and neurosis → neuroses persists mainly because other attempts to form a plural are so clumsy.
  • Changing final "f" or "fe" to "ves": An example of the first is wolf → wolves; of the second, life → lives.
  • Changing final "ix" or "ex" to "ices": Examples of the first case are helix → helices and matrix → matrices; of the second, index → indices and codex → codices. However, many writers now simply add "es" after the "x", and these forms are likely to die out.
  • The last two cases, with a total of five words among them, ought actually to be included in the irregular plurals. Pluralizing the rare noun "do" to "do's" is probably not done any more, while using "go's" as a noun plural for "go" (think, "Let's have a go at it" – could you logically pluralize that?) is so rare I have seen it only once (and not in my own writing!). Finally, the consonant doubling rule applies only to quiz → quizzes and whiz → whizzes.
If you didn't know I was a nerd before, you know it now! I know only a couple of people who know more than five of these rules. Hardly anybody cares. Lists of such facts are primarily useful for people making software to recognize and process "natural language", particularly if they need to parse older texts.
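
For anyone making such software, the patterns above might be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name `pluralize` and the small word sets are hypothetical, seeded with nothing but the examples quoted in this post, and the order in which the rules are tried is one plausible choice rather than a tested design. A real system would need the full word lists, since most of these patterns cannot be recognized from spelling alone ("human" does not become "humen").

```python
import re

# Hypothetical lookup tables, seeded only with examples quoted in the post.
# Real lists would be far longer (e.g. 205 "-man" compounds, 141 irregulars).
IRREGULAR = {"child": "children", "beau": "beaux", "foot": "feet"}
NO_PLURAL = {"lore", "oxygen"}
MAN_TO_MEN = {"spokesman", "workman"}
UM_TO_A = {"agendum", "datum", "cranium", "spectrum"}
US_TO_I = {"octopus", "bacillus", "cactus", "stimulus"}
A_TO_AE = {"larva", "nebula"}
IS_TO_ES = {"crisis", "analysis", "oasis", "neurosis"}
F_TO_VES = {"wolf", "life"}
X_TO_ICES = {"helix", "matrix", "index", "codex"}
O_ADDS_ES = {"potato"}
DOUBLE_Z = {"quiz", "whiz"}


def pluralize(noun: str) -> str:
    """Return a plural for `noun` using the list-driven patterns first,
    then the spelling-driven ones (-ies, -es, -s)."""
    if noun in IRREGULAR:
        return IRREGULAR[noun]
    if noun in NO_PLURAL:
        return noun                      # the word is its own plural
    if noun in MAN_TO_MEN:
        return noun[:-3] + "men"
    if noun in UM_TO_A:
        return noun[:-2] + "a"
    if noun in US_TO_I:
        return noun[:-2] + "i"
    if noun in A_TO_AE:
        return noun + "e"
    if noun in IS_TO_ES:
        return noun[:-2] + "es"
    if noun in F_TO_VES:
        return re.sub(r"fe?$", "ves", noun)
    if noun in X_TO_ICES:
        return noun[:-2] + "ices"        # drops "ix" or "ex"
    if noun in DOUBLE_Z:
        return noun + "zes"              # quiz -> quizzes
    if noun in O_ADDS_ES:
        return noun + "es"
    # Spelling-driven patterns need no word list.
    if re.search(r"[^aeiou]y$", noun):   # consonant before the final "y"
        return noun[:-1] + "ies"
    if re.search(r"(s|x|z|ch|sh)$", noun):  # sibilant ending
        return noun + "es"
    return noun + "s"                    # the default for most nouns


if __name__ == "__main__":
    for word in ["dog", "monkey", "curry", "wish", "spokesman", "datum", "helix"]:
        print(word, "->", pluralize(word))
```

The list-driven checks come first so that a word such as "index" is caught before the general sibilant rule would turn it into "indexes"; swapping the order would give the newer "add es after the x" behavior instead.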
