Vowel Systems
This resource is for anyone who wants to make a realistic vowel system. There are a couple of overviews elsewhere on the interwobz- I think WeepingElf has one?- but as far as I've seen them they're relics of an internet past. (I'm thinking in particular of a page with no Unicode that had to write ɨ as "i-" and had a garish pink background- ah, others have informed me it's run by whatever Bricka goes by now.) I'll run through it by numbers of vowels, like those other pages, and point out interesting deviations.
Edit: in the interests of attribution, much of this was from this page by an unknown author, Wikipedia, or Bricka's page on vowel systems.
I am stealing a classification system I found on one of those pages. It is as follows:
- T indicates a triangular vowel system, by far the most common.
- S indicates a square vowel system.
- V indicates a vertical vowel system.
- C indicates a "cubic" vowel system.
This is followed by the number of vowels, and:
- C indicates that the system mostly has central vowels as its "extras".
- R indicates that the system mostly has front rounded vowels as its "extras".
- U indicates that the system mostly has back unrounded vowels as its "extras".
- F indicates that the system mostly has front unrounded vowels as its "extras".
- B indicates that the system mostly has back rounded vowels as its "extras".
- L indicates that the system mostly has lax vowels as its "extras".
This is by no means hard and fast, but it is useful.
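If you want to keep track of these labels in a script, here is a rough Python sketch of how they could be taken apart. The names `parse_label` and `VowelSystemLabel` are made up for illustration, and reading a trailing lowercase letter (as in T3b or T6Cb) as a "variant" marker is my assumption about how the labels below are used, not part of the original scheme.

```python
import re
from typing import NamedTuple, Optional

# Shape and "extras" letters as described in the list above.
SHAPES = {"T": "triangular", "S": "square", "V": "vertical", "C": "cubic"}
EXTRAS = {
    "C": "central",
    "R": "front rounded",
    "U": "back unrounded",
    "F": "front unrounded",
    "B": "back rounded",
    "L": "lax",
}


class VowelSystemLabel(NamedTuple):
    shape: str
    count: int
    extras: Optional[str]   # None when the label has no "extras" letter
    variant: Optional[str]  # assumed meaning of a trailing lowercase letter


# shape letter, number of vowels, optional extras letter, optional variant letter
LABEL_RE = re.compile(r"^([TSVC])(\d+)([CRUFBL]?)([a-z]?)$")


def parse_label(label: str) -> VowelSystemLabel:
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"not a recognised label: {label!r}")
    shape, count, extras, variant = match.groups()
    return VowelSystemLabel(
        SHAPES[shape], int(count), EXTRAS.get(extras), variant or None
    )


print(parse_label("T7Rc"))
print(parse_label("V2"))
```

The one-vowel labels "1" and "1a" have no shape letter, so this sketch rejects them with a `ValueError` rather than guessing.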
For the sake of easy analysis, I'll not analyze pure length, tone or nasalization, just vowel quality. I will also not include diphthongs.
One Vowel
Few if any modern languages have just one vowel; the possible candidates are Nuxálk (see below) and some Chadic languages. One or two reconstructions of Proto-Indo-European have just /e/, but this is somewhat untenable, given that all of its daughters have at least three vowels (I think), many of its daughters have at least four or five, and you'd be hard-pressed to explain ablaut that way.
In a conlang with just one vowel, you could write the single phoneme as just about anything you want, although its most common realization would probably be a centralized vowel like [ə].
1
ə
1a
The Salish language Nuxálk has the system T3 (see the section on three vowels below), but can be analyzed as having just /a/, with /i u/ analyzed as syllabic /j w/, since /n m/ also have syllabic forms. This would be 1a:
a
Two Vowels
This too is rare.
S2
Some other reconstructions of PIE have given it two vowels, /e o/:
e o
V2
Ndu, Chadic, Arrandic
Several Northwest Caucasian languages have two central vowels, one low, one mid. An example is Ubykh, which also holds the world record for most consonants outside of a click language:
ə
a
The claim that Ubykh has only these two vowels is however a bit dubious, as a much wider range of vowels appears phonetically, influenced by the surrounding consonants. There are also several analyses of Mandarin Chinese that analyze it as having a V2 system, with the extra vowel phones coming from sequences of vowel and approximant. The Australian language Arrernte also has a V2 system, and the Ndu languages of New Guinea are rumored to have a V2 system as well. (The Ndu claim is particularly suspect; although they can all theoretically be analyzed as having a V2 system, at least one linguist has analyzed the Ndu language Iatmül as having twelve phonemic vowels. This is par for the course with strange little vowel systems.)
Because the only thing differentiating the two phonemes in a V2 system is height, they tend to have very wide ranges of allophones. In Arrernte, for example, /ə/ has the range [ɪ ~ e ~ ə ~ ʊ], with little regard for context.
Three Vowels
This is where we really get started; almost all languages have at least three vowels.
T3
Usually, this is the T3 system, as seen in Quechua, Inuktitut, Classical Arabic, most Australian languages, and Aleut:
i u
a
T3b
A few languages have T3b, the same as above, but with no high vowels. Examples are the /a e o/ systems of Yanesha' (also known as Amuesha) and Cheyenne:
e o
a
T3c
There is a final variation T3c, too, where /u/ is lowered. It's found in Pirahã and the short vowel system of Ojibwe:
i
o
a
Presumably something out there is best analyzed as having /ə i u/ (a theoretical T3d), with no low vowels, but I don't know of one.
V3
Beyond this, there is the system V3, which is found in the Sepik family (of which Ndu is a subfamily) of New Guinea:
ɨ
ə
a
Like V2 systems, allophony will be rampant, and you're likely to have many more phonetic vowels than just these three. Some researchers analyze Irish as basically having a V3 system in its short vowels:
ɪ~ʊ
ɛ~ɔ
a
V3F
A variant of V3, V3F, is found in the Caddoan language Wichita:
i
ɛ
a
In other words, you have a high front vowel, a mid front vowel, and a front to back low vowel. Again, allophony is rampant.
Four Vowels
Here, we start to see that a language can have a triangular system, or a square one. The triangular-vertical-square analysis starts to lose its accuracy after about six vowels, but it's still useful for classificatory purposes.
T4
Most languages with four vowels have some variant of T4, of course, since most languages have triangular systems. Usually this is T3 with some sort of addition.
Central Alaskan Yup'ik (in the short vowels only) and the Taiwanese language Rukai have T4C, which we'll refer to as just T4:
i u
ə
a
/i ɨ u a/ (*T4C) can be found in Yimas, and there are apparently some analyses of Rukai as having /ɪ ɨ u a/, which is similar. /i u ə a/ fills out the space pretty well, which is important- vowels, like gases, tend to spread out to fill their container (the mouth space) so that they're maximally distinct.
S4
There is also the vowel system /i e a u/, which can be analyzed as being the square:
i u
e a
Or as a triangular variation T4F:
i u
e
a
T4Fb
In T4Fb, /u/ is lowered to /o/. It's common in North America: Nahuatl, Navajo, Proto-Algonquian and a slew of its descendants all have T4Fb:
i
e o
a
Regardless, S4 is found in Akkadian, Hittite (...maybe), Malagasy, and Proto-Slavic. I wouldn't be surprised if there were a language with a true S4 system /i u æ ɑ/, but I don't know of one. I also believe there is at least one language with /i u e o/, i.e. S4 with no low vowels at all, but I can't remember what it is.
V4
Finally, there is the V4 system, found only in the bizarre Marshallese language of Micronesia:
ɨ
ɘ
ɜ
a
The problem, as I said, is that Marshallese is batshit, and no fewer than twelve phonemic vowels will pop out of the woodwork if you only look for minimal pairs. The proof of the pudding is that, under such an analysis, Marshallese's glides /j w/ will have a very uneven distribution, and that eight of these vowels are best analyzed as a central vowel with a glide tacked on.
V4 is basically the largest vertical vowel system that you'd ever see in nature. Past this point, the square-vertical-triangular split begins to become less and less useful.
Five Vowels
T5
Here we find the most common vowel system, T5, found in Classical Latin, Modern Greek, Spanish, Hebrew, Japanese, Swahili, the Polynesian languages, and Basque:
i u
e o
a
T5C
There are a few variants on this. Lokono Arawak, spoken in Suriname, is rumored to have T5C:
i ɨ u
e
a
T5B
The mirror image of T5C, with /o/ instead of /e/, was the vowel system of Proto-Uto-Aztecan, and is still found in some of its daughters. We'll call it T5B:
i ɨ u
o
a
S5
And S5 is the vowel system of the Vanuatuan language Big Nambas:
i u
ə
e a
Six Vowels
Most of the systems I could find were just T5 with an extra vowel.
T6C
T6C, with /ə/ added, is found in Nepali and Armenian, as well as Southern Welsh (in a pinch):
i u
e ə o
a
T6Cb
T6Cb adds /ɨ/ instead, and is found in many Slavic languages, as well as Guaraní and Comanche:
i ɨ u
e o
a
S6
T6F adds /æ/ or /ɛ/, and is found in Chamorro, Menominee, Persian, Forest Nenets and pre-umlaut Old English. You could refer to it as S6 if you wanted, even, and that's what we'll do.
i u
e o
æ a
In S6, /a/ is not uncommonly /ɑ/. I don't know of any language which adds /ɔ/ or /ɑ/ instead of /ɛ/ or /æ/, and there probably isn't one, since having significantly more back vowels than front vowels like that is almost unheard of.
T6R
Common among conlangers, I think, is T6R:
i y u
e o
a
S6R
I don't know of any languages where this appears in nature; however, the rather strange S6R appears in the Uto-Aztecan language Hopi, making it the only North American language to my knowledge to natively possess a front rounded vowel of any sort:
i ɨ
ø o
ɛ a
T6Rc
Also bizarre is the system T6Rc, which is the system in the Chapacuran language Wari', spoken along the Bolivian-Brazilian border:
i y
e ø o
a
Seven Vowels
Almost all systems past this point will be using T5 as a base.
T7L
In plain T7L, usually, /ɛ ɔ/ is added. This is the vowel system of Vulgar Latin, Italian, Bengali, Brazilian Portuguese, and Yoruba:
i u
e o
ɛ ɔ
a
T7R
T7R, found in Hungarian, adds front rounded vowels instead:
i y u
e ø o
a
T7C
In T7C, found in Northern Welsh, Kashmiri, and Romanian, the additions are central vowels:
i ɨ u
e ə o
a
With /y/ instead of /ɨ/, we get T7Cb, the vowel system of Albanian.
T7Rc
T7Rc was found in Occitan, after a chain shift of VL ɔ -> o -> u -> y:
i y u
e o
ɛ
a
It was also, with /æ/ instead of /ɛ/, the vowel system of West Saxon Old English (S7R).
T7Cc
Finally, there is T7Cc, found in Amharic:
i u
e o
ɛ ə
a
Eight Vowels
Many of these are extended versions of T7L.
T8C
In T8C, found in Javanese, Catalan, São Tomean Creole, Lo-Toga (with fronting of /u/ to /ʉ/) and Slovene, /ə/ is added:
i u
e o
ɛ ə ɔ
a
Presumably there is a language which adds /ɨ/ instead of /ə/, but I don't know of one.
T8F
Here is T8F, found in Finnish:
i y u
e ø o
æ ɑ
Note that Finnish's system is a product of vowel harmony; a word can either have /y ø æ/, or it can have /u o ɑ/, but (except in compounds) it can't have both. Without the vowel harmony, this was also the system of some dialects of Old English.
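The harmony rule just described is easy to state as a check. This is only a minimal Python sketch: it works on broad IPA strings, treats /i e/ as neutral (they belong to neither set above), and ignores the compound exception entirely. The function name `obeys_harmony` is mine.

```python
# The two harmony sets named above, one character per vowel.
FRONT_SET = set("yøæ")
BACK_SET = set("uoɑ")


def obeys_harmony(word: str) -> bool:
    """True if the word does not mix vowels from both harmony sets."""
    has_front = any(ch in FRONT_SET for ch in word)
    has_back = any(ch in BACK_SET for ch in word)
    return not (has_front and has_back)


print(obeys_harmony("tɑlo"))   # talo 'house': back vowels only -> True
print(obeys_harmony("kylæ"))   # kylä 'village': front vowels only -> True
print(obeys_harmony("tɑlø"))   # made-up form mixing both sets -> False
```

Anything that is not in either set, consonants included, is simply skipped, so the check can be run on whole transcribed words.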
T8B
A variant is T8B, Legion's dialect of French (disregarding nasalization) (thanks, Legion):
i y u
e ø o
ɔ
a
T8R
T8R is the usual analysis- er, or a more usual analysis- of Mandarin:
i y u
ɪ
e ə o
a
C8
Here, in C8, found in Igbo, we get our first "cubic" vowel system, which is just a square vowel system with an additional variable, like laxness or roundedness.
i u
ɪ ʊ
e o
a ɔ
This is the result of vowel harmony: /ɪ ʊ a ɔ/ are one set of vowels, /i u e o/ another.
C8R
C8R is Turkish's system:
i y ɯ u
e ø a o
If it looks like I've shoehorned it into a cube when it shouldn't be there, that's not true; Turkish morphology fits the vowels neatly like this.
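One way to see why the cube is a fair description: give each vowel three yes/no features and every one of the eight combinations is used exactly once. The sketch below assumes the usual textbook feature assignment for Turkish (high, back, round); the helper names `is_full_cube` and `flip` are made up for illustration.

```python
from itertools import product

FEATURE_NAMES = ("high", "back", "round")

# (high, back, round) for each of the eight vowels in the chart above.
TURKISH = {
    "i": (True, False, False),
    "y": (True, False, True),
    "ɯ": (True, True, False),
    "u": (True, True, True),
    "e": (False, False, False),
    "ø": (False, False, True),
    "a": (False, True, False),
    "o": (False, True, True),
}


def is_full_cube(system):
    """True if the vowels occupy all eight corners of the feature cube."""
    corners = set(product((False, True), repeat=3))
    return len(system) == 8 and set(system.values()) == corners


def flip(vowel, feature, system=TURKISH):
    """Return the vowel that differs from `vowel` only in `feature`."""
    index = FEATURE_NAMES.index(feature)
    values = list(system[vowel])
    values[index] = not values[index]
    target = tuple(values)
    for other, feats in system.items():
        if feats == target:
            return other
    return None


print(is_full_cube(TURKISH))  # True
print(flip("i", "round"))     # y
print(flip("a", "back"))      # e
```

Moving along one edge of the cube is what `flip` does, which is roughly the kind of alternation a harmonizing suffix vowel goes through.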
S8L
My own dialect of American English would probably be S8L:
i u
ɪ ʊ
ɛ ʌ
æ ɑ
Nine Vowels
By now we're well past the usual number of vowels for natural languages. The systems will start getting increasingly baroque, but also much less common.
T9L
T9L is found in Maasai; it's T7L with lax variants of /i u/:
i u
ɪ ʊ
e o
ɛ ɔ
a
S9C
S9C adds /ɨ ə/ to T7L instead. It's found in European Portuguese and Thai:
i ɨ u
e ə o
ɛ ɔ
a
Some analyses of European Portuguese have /ə ɐ/ instead of /ɨ ə/, however, which would be T9C instead. With /ɯ ɤ/ instead of /ɨ ə/ [S9U] we have Lao.
S9R
S9R is- in a pinch- standard French (kudos to Legion):
i y u
e ø o
ɛ ɔ
a
S9Rb
A variety of S9R is found in Southern Sami:
i y ɨ ʉ u
e o
ɛ ɑ
S9L
S9L is standard American English:
i u
ɪ ʊ
ɛ ʌ ɔ
æ ɑ
T9F
T9F is the vowel system of Estonian and Meadow Mari; it takes Finnish and adds /ɤ/. (Estonian has lost vowel length, but /ɤ/ is a retention from Proto-Finnic, I believe.)
i y u
e ø ɤ o
æ ɑ
T9Fb
T9Fb adds /ʉ/ instead of /ɤ/, and is Swedish without vowel length or reduction to schwa factored in. You could put in /ɨ/ instead, however.
i y ʉ u
e ø o
ɛ ɑ
T9Fc
T9Fc is found in Korean:
i ɯ u
e ø o
ɛ ʌ
a
Past this point, really, your imagination's the limit; many large vowel systems will have weird outliers in an otherwise symmetrical or ordered system, like baby gamma in Estonian. But we'll keep going...
Ten Vowels
T10L
T10L is found in Hindi and Panjabi, and I believe several African languages with vowel harmony, where /ɪ ʊ ɛ ɔ a/ alternate with /i u e o ə/:
i u
ɪ ʊ
e o
ɛ ə ɔ
a
T10R
T10R is found in Breton, adding front rounded vowels to T7L:
i y u
e ø o
ɛ œ ɔ
a
S10C
S10C, found in Khmer, adds central vowels to a variant of C8:
i ɨ u
e o
ɛ ə ɔ
a ɑ
S10R
S10R is found in Skolt Sami:
i u
e ɘ o
ɛ ɐ ɔ
a ɑ
Eleven Vowels
Past this point, almost everything you'll see is from Northwest Europe.
T11R
T11R takes T10R and adds a back variant to /a/. It's found in a language of Vanuatu called Sakao:
i y u
e ø o
ɛ œ ɔ
a ɑ
T11Rb
Switch out the low vowels and you get T11Rb, standard Danish (sort of; Danish phonology is a complete cluster****):
i y u
e ø o
ɛ œ ə ɔ
a
T11C
T11C is found in Vietnamese:
i u
ɪ ʊ
e ɘ o
ɛ ɐ ɔ
a
Twelve Vowels
Nothing major here, except for Selkup. At this point categorization starts to become an exercise in extremely iffy pedantry, so I'll stop:
Selkup:
i y ɨ u
ɪ
e ø ɘ o
ɛ ɔ
æ a
Received Pronunciation:
i u
ɪ ʊ
ə
ɛ ɜ ɔ
æ ʌ ɑ ɒ
Thirteen or More Vowels
Everything on this list is a Germanic language.
Dutch:
i y u
ɪ ʏ o
e ø ə ɔ
ɛ
a ɑ
Danish again, according to Routledge:
i y u
e ø ə o
ɛ œ ɐ ɔ
a ɑ
German:
i y u
ɪ ʏ ʊ
e ø ə o
ɛ œ ɐ ɔ
a
Swedish is probably the all-time record-holder for number of phonemic qualities, although Danish probably has it beat on phonetics. By this analysis we have 16 vowel qualities:
i y ʉ u
ɪ ʏ ʊ
e ø ɵ o
ɛ œ ɔ
a ɑ
Well that's great and all, but how do I make a realistic vowel system?
Don't necessarily just copy a vowel system from this list.
You've probably noticed by now that certain patterns tend to recur. In particular, there are a lot of "base" vowel systems (/i a u/ (T3), /i e a o u/ (T5), and /i e ɛ a ɔ o u/ (T7L) in particular) to which languages add one or two "outliers". Hungarian and French, for example, add /y ø/ to T5 and T7L respectively.
Often you'll want to take a vowel system and add an extra dimension to it, like throwing in some central vowels, or a lax set of vowels.
For small vowel systems, most of the possible bases are covered in the overview- there simply aren't very many options. As I noted, vowels are kind of like a gas; they usually spread out to fill the vowel space very well. As a result, having a handful of vowels that are relatively close together is really only an option once you've already filled the space; a vowel system like /a ɑ ə/ is basically impossible, as is /i ɨ u/. (However, if your vowel system is very small, the vowels often will centralize a bit. Modern Quechua, for example, has /i a u/, but before the Spanish arrived it was more like /ɪ æ ʊ/.) Throwing in a random vowel often is justified when you're just one more than a "standard" system- I can't find a language that has /i y e a o u/, for example, but it wouldn't surprise me in the least if there was one, and it's certainly fair game for your conlangs.
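To put a rough number on the gas analogy, you can place vowels on a toy grid and measure how close the closest pair is. Everything about the grid below is an assumption made for illustration (three backness columns, three heights, equal spacing); it is not acoustic data, and the closest-pair distance is only a crude stand-in for "how well the space is filled".

```python
from itertools import combinations
from math import dist

# Toy coordinates: (backness, height), front = 0 ... back = 2, low = 0 ... high = 2.
# These positions are an illustrative assumption, not measurements.
POSITION = {
    "i": (0, 2), "ɨ": (1, 2), "u": (2, 2),
    "ə": (1, 1),
    "a": (1, 0), "ɑ": (2, 0),
}


def min_distance(vowels):
    """Distance between the two closest vowels in the system."""
    return min(
        dist(POSITION[a], POSITION[b]) for a, b in combinations(vowels, 2)
    )


print(min_distance(["i", "a", "u"]))  # 2.0
print(min_distance(["i", "ɨ", "u"]))  # 1.0
print(min_distance(["a", "ɑ", "ə"]))  # 1.0
```

On this grid /i a u/ keeps its closest pair twice as far apart as either of the cramped systems, which is the pattern the paragraph above describes.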
"Filling the available space" is generally a good strategy in larger systems as well. E.g. I'd question the realism of a vowel system:
i ɨ ʉ u
e ə ɵ o
a
because distinctions between rounded and unrounded vowels simply aren't as audible towards the center of the board. (But note Southern Sami under the section about languages with nine vowels.) I'd expect this to shift very quickly to:
i y ɨ u
e ø ə o
a
which does fill out the available space pretty well.
As your vowel system gets larger, you'll start to run out of places to put your vowels, and will generally want to play with things like roundedness. Germanic languages are so large in part because they have roundedness distinctions in a lot of vowels.
As it gets larger, too, it will get easier to throw in random vowels- there's nothing very symmetrical-looking about English, for example. It will also get a lot more unstable. Diphthongization is a classic way to deal with this; indeed it's one of the reasons English sounds so distinctive- it cleared off several vowels by making them diphthongs. Even so, there are universals that will pretty much be followed at any size: you're not going to have very many more back than front vowels (so /i a u o/ is- probably- a no-no; but then it's just Proto-Uto-Aztecan without /ɨ/, so that's what you get for following universals too heavily); vertical vowel systems aside, vowels generally like to spread out to the margins (so a system like /i ɨ u ə ɐ a/ is very unlikely); a system tends to be higher than it is wide (so a system /i ɨ u e ə o/ is also probably not possible). There are other universals about vowel systems too, I'm sure, but the relevant PDF seems to have 404ed in the mists of time.
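Two of those tendencies are simple enough to turn into a quick sanity check for a draft inventory. This is a sketch under loud assumptions: the feature table only covers seven vowels, "more back than front" is read in its strictest form (any surplus at all), and "wider than it is tall" is approximated as "the fullest height row holds more vowels than there are height levels". The function name `check_tendencies` is made up, and a warning is a nudge, not a verdict, since attested systems like Proto-Uto-Aztecan trip the first one.

```python
from collections import Counter

# vowel: (backness, height); a deliberately small, illustrative table.
FEATURES = {
    "i": ("front", "high"), "ɨ": ("central", "high"), "u": ("back", "high"),
    "e": ("front", "mid"), "ə": ("central", "mid"), "o": ("back", "mid"),
    "a": ("central", "low"),
}


def check_tendencies(vowels):
    """Return a list of warnings for a vowel inventory."""
    columns = Counter(FEATURES[v][0] for v in vowels)
    rows = Counter(FEATURES[v][1] for v in vowels)
    warnings = []
    if columns["back"] > columns["front"]:
        warnings.append("more back than front vowels")
    if max(rows.values()) > len(rows):
        warnings.append("wider than it is tall")
    return warnings


print(check_tendencies(["i", "a", "u", "o"]))
# ['more back than front vowels']
print(check_tendencies(["i", "ɨ", "u", "e", "ə", "o"]))
# ['wider than it is tall']
print(check_tendencies(["i", "e", "a", "o", "u"]))
# []
```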
You can often create a distinctive-looking vowel system by taking a more boring one and putting in some sound changes. Occitan, for example, underwent a chain shift in the Vulgar Latin back vowels. The English Great Vowel Shift is another example. (Or take a distinctive vowel system and make it boring...Modern Greek's /i e a o u/ is the descendant of /i y e ɛ a ɔ o/.)
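As a small worked example, here is the Occitan chain shift from the seven-vowel section (ɔ -> o -> u -> y) applied to the T7L inventory in Python. The one thing to get right is that every step of the chain happens at once; the helper name `apply_shift` is mine, and this is a sketch of the idea rather than a real sound-change applier.

```python
# The chain shift as a mapping from old vowel to new vowel.
CHAIN_SHIFT = {"ɔ": "o", "o": "u", "u": "y"}

T7L = ["i", "e", "ɛ", "a", "ɔ", "o", "u"]


def apply_shift(segments, shift):
    """Apply every change simultaneously: each segment is looked up once."""
    return [shift.get(segment, segment) for segment in segments]


print(apply_shift(T7L, CHAIN_SHIFT))
# ['i', 'e', 'ɛ', 'a', 'o', 'u', 'y']
```

The result has the same seven slots as the T7Rc chart: /i y u e o ɛ a/. Applying the three changes one after another in the listed order would instead push old /ɔ/ all the way to /y/ and merge the back vowels, which is why the lookup is done in a single pass.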