
Minimal pairs for English RP

by John Higgins

Updated June 2006. Note that to see the IPA phonetic characters correctly you must have the font Lucida Sans Unicode installed on your system. This is a standard Windows font and is provided on most modern systems.


Introduction

Minimal pairs are pairs of words whose pronunciation differs at only one segment (vowel or consonant sound), such as sheep and ship, or lice and rice. They are often used in listening tests and pronunciation exercises. Theoretically it is the existence of minimal pairs which enables linguists to build up the phoneme inventory for a language or dialect, though the process is not without difficulty.

Each cell in the tables below is a link to a list of minimal pairs derived from a dictionary. Use the tables of vowels and consonants below to retrieve the relevant lists, all of which are plain ASCII text files. Some are still in a rather raw state, while others have been edited and commented on. The first versions of the lists included only one pair for each pronunciation, such as heal/hole. Newly revised versions are being added which include all the pairs which arise when one or both members of the pair have a homophone, so giving a better indication of how much confusion a given pair may cause. In the case of heal/hole, for instance, the new version of the list would include all of the following:

Please note that, as you move the mouse over a link, the name of the relevant document should appear at the bottom of the browser window and this gives a further indication of which sound contrast is in the list.

How minimal is minimal?

Although the normal definition of a minimal pair specifies that the words differ in one segment, it allows that segment to be widely different in terms of articulation. Another tighter definition of a minimal pair might be words which differ by only one feature. In that case the ideal minimal pair might be cheer versus jeer which differ only in voicing. These two words also belong to the same part of speech and so have the same inflections. Moreover they belong in the same domain of discourse, and are therefore highly confusable. If you were to overhear a fragment of conversation which included:

You should have heard them ??eering at the end of the game.
you would have to perceive the voicing in order to know exactly what was meant. Most minimal pairs are considerably more distinct than that one, and in many cases would cause no difficulty to any speaker. However there is a kind of delight in recognising some of the pairs, which I feel may be related to the enjoyment we feel when we come across an outrageous rhyme in a song or piece of verse.
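
The two definitions can be contrasted in a short Python sketch. It assumes that transcriptions are held as tuples of phoneme symbols; the function names and the VOICING_PAIRS table are illustrative only and are not taken from the lists or from the program that built them. Contrasts against null (words of different length) are deliberately left out.

```python
# Voiceless/voiced counterparts among the English consonants.
# Illustrative table: only voicing is modelled, no other features.
VOICING_PAIRS = {
    ("p", "b"), ("t", "d"), ("k", "g"), ("f", "v"),
    ("θ", "ð"), ("s", "z"), ("ʃ", "ʒ"), ("ʧ", "ʤ"),
}


def differing_segments(p1, p2):
    """Return the (x, y) segment pairs at which two transcriptions differ.

    Returns None when the transcriptions differ in length, because this
    sketch does not handle contrasts against null.
    """
    if len(p1) != len(p2):
        return None
    return [(x, y) for x, y in zip(p1, p2) if x != y]


def is_minimal_pair(p1, p2):
    """Normal definition: exactly one segment differs."""
    diffs = differing_segments(p1, p2)
    return diffs is not None and len(diffs) == 1


def differs_only_in_voicing(p1, p2):
    """Tighter definition: the one differing segment differs only in voicing."""
    if not is_minimal_pair(p1, p2):
        return False
    x, y = differing_segments(p1, p2)[0]
    return (x, y) in VOICING_PAIRS or (y, x) in VOICING_PAIRS


cheer, jeer = ("ʧ", "ɪə"), ("ʤ", "ɪə")
sheep, ship = ("ʃ", "i", "p"), ("ʃ", "ɪ", "p")

print(is_minimal_pair(cheer, jeer), differs_only_in_voicing(cheer, jeer))
print(is_minimal_pair(sheep, ship), differs_only_in_voicing(sheep, ship))
```

Both pairs satisfy the normal definition, but only cheer/jeer satisfies the tighter one.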

They can also be the source of genuine confusions and disputes. A story which appeared in newspapers in April 1998 suggested that the urn known as The Ashes and presented to the winning team in the England versus Australia cricket series contains not the remains of a bail, as the traditional account stated, but of a veil. Another story, involving not strictly a minimal pair but a highly confusable pair of words, appeared in January 1997. It told how a Japanese tourist with a ticket for Turkey had gone to Paddington station in London and asked for directions. She was put on the train to Torquay (a seaside town in South West England).

There are all sorts of confusable sentences which can easily lead to 'slips of the ear' among English speakers, such as "the Dutch are suspicious" being misheard as "the Duchess is vicious". The only significant difference in the sound of those two sentences is /p/ versus /v/, and this is one which is notoriously difficult for foreign learners and can lead to unexpected problems. On a recent trip to Spain I heard a Spanish guide leading a party of British tourists asking them to rendezvous at what they thought was "St Martin's Village" when he meant to say "St Martin's Bridge". The /b/ versus /v/ contrast is not made in Spanish, and his strong articulation of the /r/ made it easily confused with an /l/.

Another slip of the ear I encountered recently was postcard for coastguard; although the initial /p/ versus /k/ distinction is a fairly strong one, the /k/ versus /g/ distinction in the middle of the word is neutralised by the presence of the /s/. (A contributory factor is that coastguards are often located in picturesque seaside towns, from which it would be reasonable to send a postcard.) A similar misunderstanding arose in a recent conversation between raingear and reindeer, and a Times column by Matthew Parris on 8 June 2006 made this comment after a much-criticised police raid:

At first, on a crackly radio, my brother Roger thought he heard that in East London 250 officers had swooped on a terrorist house.
Second time round he heard it right: terraced, not "terrorist". Could police "intelligence" have made the same mistake?
Homophones engender many spelling mistakes; probably the commonest of all is to write there instead of their or vice versa. Sometimes a set of near homophones leads to a spelling error, as in a notice seen recently: RUGBY, STIRLING VERSES LIVERPOOL. In the same way minimal pairs engender many spelling errors in the writing of foreign learners. Among those I have seen recently are "a reach man" (rich man) and "a brought road" (broad road). An Arabic-speaking student once wrote an essay for me about a visit to London during which he had seen "the Pig Pen Watch" (Big Ben). Even national newspapers are not immune. The Times of Tuesday, September 5th, 2000, printed the following:
Apology
Readers will have been surprised yesterday to see the famous Cold War phrase "mutually assured destruction" (MAD) rendered as "neutrally assured destruction" (NAD). What began as a copytaking error somehow survived into this column. To anyone who was confused as well as to those who were not, we offer our apologies.

Source of the lists

Hal Gleason (An Introduction to Descriptive Linguistics, Holt Rinehart Winston, 1955, p. 19), writing about minimal pairs before the era of widespread computing, said "Presumably by diligent search through the total vocabulary, minimal pairs might be found for all English consonant phonemes. But there is no guarantee that all will be found, and in any case it is hardly a feasible procedure."

I have not tried to search the total vocabulary, but I have tried to search a vocabulary which includes most of the words available in non-specialist contexts to everyday users of English. In putting together these lists I have used Roger Mitton's machine-readable version of the 1974 edition of the Oxford Advanced Learner's Dictionary, incorporating Mitton's 1990 additions to the word list. The minimal pair lists below have been prepared from the dictionary by means of a program which sorts the pronunciation field, identifies identical pairs (homophones), substitutes dummy characters for the symbols of the minimal pair, and then flags all the additional homophone pairs created by the process. This generates (fairly) complete lists of minimal pairs, though a certain amount of rather tedious post-editing is needed.
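
The substitution step described above can be sketched in Python. This is an illustration under stated assumptions, not the program actually used: the lexicon is assumed to map each spelling to a tuple of phoneme symbols, and find_minimal_pairs, DUMMY and the toy lexicon are hypothetical names. Grouping words by their masked transcription stands in for the sort, and a final check discards words which differ at more than one position.

```python
from collections import defaultdict
from itertools import combinations

DUMMY = "*"  # placeholder symbol, assumed not to occur in any transcription


def find_minimal_pairs(lexicon, a, b):
    """Return sorted (word_with_a, word_with_b) pairs for the contrast a/b.

    lexicon maps each spelling to a tuple of phoneme symbols (an assumed
    data layout, not the format of the dictionary file).
    """
    # Mask both phonemes of the contrast so that members of a minimal
    # pair become artificial "homophones" and land in the same group.
    groups = defaultdict(list)
    for word, phones in lexicon.items():
        if a in phones or b in phones:
            masked = tuple(DUMMY if p in (a, b) else p for p in phones)
            groups[masked].append(word)

    pairs = []
    for words in groups.values():
        for w1, w2 in combinations(words, 2):
            diffs = [(x, y) for x, y in zip(lexicon[w1], lexicon[w2]) if x != y]
            # Zero differences: true homophones. Two or more: not minimal.
            if len(diffs) == 1:
                pairs.append((w1, w2) if diffs[0][0] == a else (w2, w1))
    return sorted(pairs)


# Toy lexicon for illustration only
lexicon = {
    "heal": ("h", "i", "l"),
    "heel": ("h", "i", "l"),
    "hole": ("h", "əʊ", "l"),
    "whole": ("h", "əʊ", "l"),
    "sheep": ("ʃ", "i", "p"),
    "ship": ("ʃ", "ɪ", "p"),
}

pairs = find_minimal_pairs(lexicon, "i", "əʊ")
for first, second in pairs:
    print(f"{first}/{second}")
```

In this toy lexicon heal and heel share a pronunciation, as do hole and whole, so the single contrast yields four pairs rather than one.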

The dictionary lists just over 70,000 words, corresponding to about 40,000 headwords. This may seem rather short, leaving out words which may enter minimal pairs, making the lists incomplete. That is not necessarily a disadvantage. Sometimes, as with a spelling checker, one does not want obscure words included. There exists, for instance, an English word flong. It is the name of a rubberised cardboard which used to form an intermediate stage of the printing of newspapers on a rotary press. I doubt whether one native-speaker in ten thousand knows the word; the only reason I do is that our next-door neighbour in my childhood worked as a printer for a national newspaper and used to give us discarded sheets of the stuff to insulate our hen-house, and I enjoyed trying to read the news stories on it in mirror-writing. But if the dictionary included the word flong, there would be new pairs fling/flong, flung/flong, flop/flong, flog/flong and flock/flong, which would be of little relevance to teachers or learners.

Semantic loading

When this project (collecting and editing minimal pair lists for all the 510 theoretically possible contrasts) is complete, I hope to be in a position to measure the functional load of a pronunciation error, i.e. how much potential for confusion is created by a particular vowel or consonant error and therefore how important it is. Naturally this is not just a matter of counting the number of pairs, but also depends on the part of speech of the words and therefore their potential for appearing in the same contexts. Two verbs, such as cheer and hear, are much more confusable than an adjective and a preposition, such as mere and near. For this reason the edited lists draw a distinction between the number of pairs and the number of semantic contrasts realised by the pairs, and calculate a "semantic loading" figure. Thus if there were 100 pairs but they belonged to only 70 different pairs of headwords, the semantic loading would be 0.7. As a rough rule of thumb, the lower the semantic loading, the more confusable pairs exist for that contrast, since a large number of inflected forms signals a large number of words in the open classes: noun, verb or adjective.
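
The semantic loading figure can be illustrated with a small Python sketch. It assumes that each word in a list can be mapped to its headword; the function name, the headword table and the sample pairs are hypothetical and are not taken from the edited lists.

```python
def semantic_loading(pairs, headword):
    """Distinct headword pairs divided by the total number of pairs.

    pairs    -- list of (word1, word2) minimal pairs for one contrast
    headword -- dict mapping each word to its headword (assumed to exist)
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no pairs for this contrast")
    # frozenset makes the headword pair unordered, so cheer/jeer and
    # jeer/cheer count as the same semantic contrast.
    contrasts = {frozenset((headword[w1], headword[w2])) for w1, w2 in pairs}
    return len(contrasts) / len(pairs)


# Toy data for illustration only
headword = {
    "cheer": "cheer", "cheers": "cheer", "cheered": "cheer", "cheering": "cheer",
    "jeer": "jeer", "jeers": "jeer", "jeered": "jeer", "jeering": "jeer",
    "chin": "chin", "gin": "gin",
}
pairs = [
    ("cheer", "jeer"), ("cheers", "jeers"),
    ("cheered", "jeered"), ("cheering", "jeering"),
    ("chin", "gin"),
]

print(semantic_loading(pairs, headword))  # 2 headword pairs / 5 pairs = 0.4
```

Here four of the five pairs are inflected forms of the same two verbs, so the loading is low, in line with the rule of thumb above.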

There are a number of problems waiting to be resolved:

You will find two related lists derived from the same dictionary source at the following links:

One extension of the project is to see how the distribution of minimal pairs relates to the overall phoneme frequencies in the same dictionary. For each contrast I am in the process of calculating a figure to indicate the density of minimal pairs in the vocabulary as a whole, i.e. what the proportion of actual minimal pairs is to the number there might have been if every possible word in the dictionary were matched. This figure is normally no greater than 5%, and is often less than 2%.

Why did I start?

My personal interest in this topic may be due partly to the fact that I once lived in a flat in the village of Etiler near Istanbul. From the living room one had a view across a green meadow down towards the steep sides of the Bosphorus, where one constantly saw passing freighters, small cruise liners and even submarines. It was one of the few places in the world where one might have said "Look, there's a sheep!" and expected to be misunderstood.


Vowels and diphthongs

       ɪ    e    æ    ɑ    ɒ    ɔ    ʊ    u    ʌ    ɜ    ə   eɪ   aɪ   ɔɪ   əʊ   aʊ   ɪə   eə   ʊə  null, cons
i    471  338  394  316  362  489   82  381  301  309   66  561  532   98  527  157  133  144   38  170, 64
ɪ         446  635  228  438  271   61  222  456  178  362  334  257   59  358   88   48   28    7  1348, 978
e              302  142  227  212   43  130  233  147   36  250  222   57  213  118   32   30   11
æ                   179  409  179   56  159  425  160   11  256  237   29  240  103   23   31    9
ɑ                        172  156   34   75  172  127   11  184  125   37  169   51   46   48   22  61
ɒ                             157   73  141  300  153    1  218  172   22  203   96   26   19    8  46
ɔ                                   56  142  168  180   21  251  207   71  243  106   82   92   23  88
ʊ                                        18   19   41    1   61   52    3   28   15    6    8    3
u                                            119   74    9  234  200   45  208   97   26   33   11
ʌ                                                 126    4  211  148   29  181   85   18   20    7
ɜ                                                        8  182  141   33  149   63   35   41   14
ə                                                            82   20    3   48    3 none    7 none
eɪ                                                               353   90  336  154   41   47   15
aɪ                                                                     56  269  166   43   33   13
ɔɪ                                                                          75   33   14    7    6
əʊ                                                                              115   42   44   13
aʊ                                                                                    22   18    6
ɪə                                                                                         67   22
eə                                                                                              19


Consonants

       b    t    d    k    g    f    v    θ    ð    s    z    ʃ    ʒ    h    m    n    ŋ    l    r    j    w    ʧ    ʤ  null, vowel
p    612  882  524 1009  401  570  227  129   66  613  222  216    3  377  620  561   84  683  374   87  433  296  197  916, 139
b         431  400  458  350  411  129   63   34  342   79  186    2  228  385  270   37  346  289   64  196  225  179  995
t              682  731  319  405  232  117   57 1258  379  247    8  231  453  517  109  575  318   46  216  238  248
d                   466  250  332  285  126   58  481 2660  242    7  185  414  484 1619  507  440   39  142  206  208
k                        341  464  176  112   42  472  214  213    4  272  413  460   87  470  229   50  193  211  155
g                             196   79   52   18  201   54  145    1  125  239  240   61  207  155   26  109   97  108
f                                  130   50   35  371   73  137    2  185  312  236   22  272  218   49  178  156  171
v                                        25   30  204  148   49    2   66  187  222   83  233  112    7   52   63   93
θ                                              9   91   59   41    2   36   60   67   10   65   37   10   42   42   36
ð                                                  28   34   18    2   15   63   53    7   45   18    3   19   22   16
s                                                      232  220    9  217  361  384   51  467  299   42  169  182  184
z                                                            65   11   24  159  317 1135  253   50    8   17  102   94
ʃ                                                                  5  129  179  148   83  180  155   34  105  115  103
ʒ                                                                    none    9    6 none    6 none none    1    3    1
h                                                                          226  139 none  216  225   70  191   95  101
m                                                                               359   59  513  259   52  150  172  175
n                                                                                     78  681  239   35  142  151  147
ŋ                                                                                          58    2 none none   21   76
l                                                                                              589   68  204  182  202
r                                                                                                    58  213  120  151
j                                                                                                         48   28   45
w                                                                                                              61   93
ʧ                                                                                                                   92



Comments and corrections should be sent to marlodge1--at--tiscali.co.uk (replace --at-- with @ when mailing).

