I before E, ESPECIALLY after C

by Dave Mozealous on December 24, 2009

Every American elementary schooler is taught the mnemonic device, “I before E, except after C.” I am dumbfounded by how bad this rule is. I misspell words all the time. Words like policies, species, and caffeine I can never spell correctly because they violate the rule. The rule is littered with exceptions. Recently, I wondered how bad the rule actually is: of the words in the English language that contain either (either is an exception to the rule) ie or ei, how many are exceptions to the rule? Well, I set out to solve this problem. The data proves that the I before E rule is broken.

Why?

So what’s the point? Why have I spent so much time worrying about such a pointless rule? Well, I became interested in the data when I kept misspelling words based on my understanding of the rule. I then googled for the data, figuring someone must have analyzed the data. Much to my dismay, this information is not available.  As far as I can tell, this study is the first of its kind.

This is what I wanted to find out:

  • How many words contain the combination ie or ei?
  • How many exceptions are there to the rule?
  • What is the percentage of words that are exceptions to the rule?

How I gathered the data

I used a dictionary webservice to gather the data. Obviously thumbing through a Merriam-Webster dictionary to gather the data would be a monumental waste of time. Using a webservice allows me to do this research programmatically, and to expose the data to anyone who wants it.

The webservice I used is DictService, which allows you to use several different English dictionaries. For this article I used The Collaborative International Dictionary of English.
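As a rough illustration of the kind of counting involved, here is a minimal sketch in Python. It is not the code used for this article: it assumes the dictionary words are already available as a plain list of strings (how they are fetched from DictService is not shown), and the function name `count_patterns` and the sample words are made up for the example.

```python
from collections import Counter

# Letter patterns the article counts: the two orderings, with and without a leading c.
PATTERNS = ("ie", "ei", "cie", "cei")


def count_patterns(words):
    """Count how many words contain each pattern (case-insensitive).

    A word that contains several patterns is counted once per pattern.
    """
    counts = Counter()
    for word in words:
        lowered = word.lower()
        for pattern in PATTERNS:
            if pattern in lowered:
                counts[pattern] += 1
    return counts


# Hypothetical six-word sample standing in for a full dictionary word list.
sample = ["believe", "receive", "species", "either", "their", "cat"]
counts = count_patterns(sample)
print(counts["ie"], counts["ei"], counts["cie"], counts["cei"])
```

Note that a word like "oneiromancies" would be counted under both "ei" and "ie" with this approach, which is one reason totals from a sketch like this may not match a per-word tally exactly.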

History and facts about the I before E rule

Nobody knows who created the rule.  The earliest written use of it was found in a footnote in James Stuart Laurie’s Manual of English Spelling which was written in 1866.  It is safe to say that the rule has been used in English classrooms across the world for at least the past 140 years. There are a couple different variations to the rule, such as:

i before e except after c or when sounded like a as in neighbor and weigh

This addition makes the rule a bit better, but I am not going to focus on that variation, as there are still plenty of exceptions to that rule as well (species, either, etc.).

The rule is so bad, and has so many exceptions that in June of 2009 the British government advised against teaching the rule. Some funny info about the debates that this caused in England can be found here. The best quote came from Michael Grove in arguing against abandoning the rule:

Having systematically lowered school standards for a decade, it is sadly no surprise that the Government is now actively telling teachers not to bother trying to teach children how to spell properly.

The word that violates the rule the worst is the word “oneiromancies,” which is the plural form of oneiromancy (meaning divination by means of dreams). The word breaks the rule twice, in both ways.

Of the 100 most commonly used words in the English language, only one contains ei or ie: “their,” which violates the rule.

Much of this history comes from the Wikipedia topic on the subject.

Analyzing the data

Now let’s take a look at the data. There are a total of 5223 words in the English language that contain either the letters ei or ie. Let’s first examine the “except after C” part of the rule.

Except after C

There are 107 words in the English language that contain the letter c followed directly by the letters ei. Full list of the words here.

However, there are 212 words in the English language that contain the letter c followed directly by the letters ie, which violates the rule. Full list of the words here.

So in practice, if you did not know how to spell a word, you would be correct a higher percentage of the time (66%) by spelling the word cie instead of cei as the rule suggests you should.

One could argue that a better rule would be “I before E, especially after C.”
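The 66% figure comes straight from the two counts above. A quick sketch of the arithmetic (the variable names are only for illustration):

```python
# Counts quoted in the article for words with c followed by ei or ie.
cei_words = 107  # follow "except after C"
cie_words = 212  # break "except after C"

# Share of "c + ei/ie" words that are spelled cie.
cie_share = cie_words / (cei_words + cie_words)
print(f"{cie_share:.1%}")  # 66.5%
```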

I before E

Ok, now let’s look at the first part of the rule. I should come before E, except after C.

There are 3885 words in the English language that contain the letters ie. Full list of the words here.

There are 1338 words in the English language that contain the letters ei. Full list of the words here.

So that gives us a total of 5223 words in the English language that contain some combination of ei or ie. Of those 5223 words, 1338 contain the letters ei. As we know, ei preceded by the letter c doesn’t violate the rule, and as I mentioned earlier, 107 words contain the letters cei. So if we take our list of 1338 words and remove those 107, we are left with 1231 words where the letter e comes before the letter i but is NOT preceded by c, all of which violate the rule. That means roughly a quarter (1231 of 5223, or about 24%) of all words that contain ei or ie are in violation of the rule.
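The arithmetic behind this percentage can be checked with the counts quoted in this article. The sketch below also shows, as an extra assumption that goes beyond the calculation in the paragraph above, what happens if the 212 cie words are counted as violations too.

```python
# Counts quoted in the article.
ie_words = 3885
ei_words = 1338
cei_words = 107
cie_words = 212

total = ie_words + ei_words            # all words with ie or ei
ei_violations = ei_words - cei_words   # ei NOT preceded by c
ei_rate = ei_violations / total

# Assumption for illustration only: also treat cie words as violations.
all_violations = ei_violations + cie_words
all_rate = all_violations / total

print(total, ei_violations, f"{ei_rate:.1%}")
print(all_violations, f"{all_rate:.1%}")
```

Either way, the share of rule-breaking words lands at around a quarter of all ie/ei words.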

Conclusions

Out of all the words in the English language that contain an ei or ie combination, roughly a quarter violate the rule.

Words that contain the letter c immediately followed by ei or ie are especially bad: 66% of them violate the rule.

So with numbers so strongly against the rule, why do we still continue to teach this? If the rule “look both ways before you cross the street” was wrong 25% of the time, would we teach it?

10 comments

Matthew Bibby December 24, 2009 at 7:21 pm

Dave, you are my new hero …

Adam Schwartz December 25, 2009 at 2:18 pm

Wow Dave! That’s some analysis.

Since you’re clearly obsessed with the correct order of things, you’d probably appreciate to know that the percent sign comes *after* the digits.

mozealou December 25, 2009 at 7:25 pm

Ha, thanks Matthew!

@Adam, good point, someone should create a crafty mnemonic device to help me remember. Also, while I am investigating these grammatical rules I should also figure out when it is and isn’t appropriate to switch between past/present/future tense since I changed tenses about 15 times in this post.

Andrew Scivally January 4, 2010 at 4:46 pm

Love it! Now I need to somehow stop saying that rule in my head every time I try to spell those words.

Bruce Graham February 2, 2010 at 10:30 am

In the UK, there used to be a fuller version of this that ended with “…whenever the sound is “eee””.
Having read the article, I am unsure whether this helps or hinders, because my brain is now bleeding from trying to comprehend the analysis – however, I thought I’d just add it into the mix anyhow.

Bruce Graham

bobby April 22, 2010 at 1:21 pm

if you notice the list of words that violate the rule after a “c”, most of those words are plural versions of words, such as “aristocracies.” I think these words do follow the “rules” just not the one you’re so worked up about :]… but yeah english is a pretty messed up language.

Suzanne Karl-Brigman May 10, 2010 at 3:35 pm

English spelling would make a lot more sense if people also studied the history of the language. Look up spelling and pronunciation changes from the 1400s to 1600s.

Ben Riddell December 1, 2010 at 12:16 pm

Hi Dave,

You’re missing 1/2 of the rule. “I before E, except after C, and except when said ‘ai’ as in ‘neighbor’ or ‘weigh’.” So inveigh, feint, skein, freight, and others don’t violate the rule as fully expressed.

I still have found over a dozen violations (such as either, eider, eidetic, gneiss, deity, deist, epideictic, sleight, and a few others). English is one of the most complicated languages, orthographically, because it is so eclectic. Infuriating, if you let it be so, but it can also be entertaining.

SnugglePuma April 25, 2011 at 7:13 pm

bobby is right, and you are right too, they should change the rule to “i before e, except after c, and if you are adding an s to a word ending in y.” Now go through and crunch numbers again and let’s see if there really are more exceptions than the rule…

Jay May 24, 2011 at 8:05 pm

After removing the plurals of words ending in “cy,” if you add the additional caveat of words with the suffix “cient (cience)” you knock out the vast majority of the remaining words. And all of those words count as a single exception. Similarly, all words using the roots scient and science are from sciens, the present participle stem of the Latin verb scire (“to know”) and therefore ALL constitute a single exception. Most of the words remaining have foreign (non-English and therefore exempt from the rule) origins. So with merely 3 additional broad exceptions to add to the rule, and especially if somewhere in the body of the rule we add a phrase like “except for weird words like weigh,” we can still have a workable rule.
