The field of articulatory phonetics is a subfield of phonetics that studies articulation and ways that humans produce speech. Articulatory phoneticians explain how humans produce speech sounds via the interaction of different physiological structures. Generally, articulatory phonetics is concerned with the transformation of aerodynamic energy into acoustic energy. Aerodynamic energy refers to the airflow through the vocal tract. Its potential form is air pressure; its kinetic form is the actual dynamic airflow. Acoustic energy is variation in the air pressure that can be represented as sound waves, which are then perceived by the human auditory system as sound.
Sound is produced simply by expelling air from the lungs. However, to vary the sound quality in a way useful for speaking, two speech organs normally move towards each other and make contact, creating an obstruction that shapes the airflow in a particular fashion. The point of maximum obstruction is called the place of articulation, and the way the obstruction forms and releases is the manner of articulation. For example, when making a p sound, the lips come together tightly, blocking the air momentarily and causing a buildup of air pressure. The lips then release suddenly, causing a burst of sound. The place of articulation of this sound is therefore called bilabial, and the manner is called stop (also known as a plosive).
The vocal tract can be viewed through an aerodynamic-biomechanic model that includes three main components:
- air cavities
- pistons
- air valves
Air cavities are containers of air molecules of specific volumes and masses. The main air cavities present in the articulatory system are the supraglottal cavity and the subglottal cavity. They are so named because the glottis, the openable space between the vocal folds internal to the larynx, separates the two cavities. The supraglottal cavity or the orinasal cavity is divided into an oral subcavity (the cavity from the glottis to the lips excluding the nasal cavity) and a nasal subcavity (the cavity from the velopharyngeal port, which can be closed by raising the velum, to the nostrils). The subglottal cavity consists of the trachea and the lungs. The atmosphere external to the articulatory system may also be considered an air cavity whose potential connecting points with respect to the body are the nostrils and the lips.
Pistons are initiators. The term initiator refers to the fact that they are used to initiate a change in the volumes of air cavities, and, by Boyle's Law, the corresponding air pressure of the cavity. The term initiation refers to the change. Since changes in air pressures between connected cavities lead to airflow between the cavities, initiation is also referred to as an airstream mechanism. The three pistons present in the articulatory system are the larynx, the tongue body, and the physiological structures used to manipulate lung volume (in particular, the floor and the walls of the chest). The lung pistons are used to initiate a pulmonic airstream (found in all human languages). The larynx is used to initiate the glottalic airstream mechanism by changing the volume of the supraglottal and subglottal cavities via vertical movement of the larynx (with a closed glottis). Ejectives and implosives are made with this airstream mechanism. The tongue body creates a velaric airstream by changing the pressure within the oral cavity: the tongue body changes the mouth subcavity. Click consonants use the velaric airstream mechanism. Pistons are controlled by various muscles.
Valves regulate airflow between cavities. Airflow occurs when an air valve is open and there is a pressure difference between the connecting cavities. When an air valve is closed, there is no airflow. The air valves are the vocal folds (the glottis), which regulate between the supraglottal and subglottal cavities, the velopharyngeal port, which regulates between the oral and nasal cavities, the tongue, which regulates between the oral cavity and the atmosphere, and the lips, which also regulate between the oral cavity and the atmosphere. Like the pistons, the air valves are also controlled by various muscles.
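The valve behaviour described above can be illustrated with a minimal Python sketch. The class names `Cavity` and `Valve`, the method `flow_direction`, and the pressure values are hypothetical and chosen only for illustration; the sketch encodes nothing more than the rule that air flows through an open valve when the pressures on its two sides differ.

```python
from dataclasses import dataclass


@dataclass
class Cavity:
    """An air cavity with a pressure in arbitrary, illustrative units."""
    name: str
    pressure: float


@dataclass
class Valve:
    """A valve connecting two cavities; it is either open or closed."""
    name: str
    side_a: Cavity
    side_b: Cavity
    is_open: bool = False

    def flow_direction(self):
        """Return (source, destination) names if air flows, else None."""
        if not self.is_open:
            return None  # a closed valve blocks airflow
        if self.side_a.pressure == self.side_b.pressure:
            return None  # no pressure difference, so no airflow
        if self.side_a.pressure > self.side_b.pressure:
            return (self.side_a.name, self.side_b.name)
        return (self.side_b.name, self.side_a.name)


# Assumed values: subglottal pressure slightly above supraglottal pressure.
subglottal = Cavity("subglottal", pressure=1.05)
supraglottal = Cavity("supraglottal", pressure=1.00)
glottis = Valve("glottis", subglottal, supraglottal, is_open=False)

closed_result = glottis.flow_direction()  # glottis closed
glottis.is_open = True
open_result = glottis.flow_direction()    # glottis open
```

With the glottis closed the sketch reports no airflow; once it is opened, air flows from the subglottal to the supraglottal cavity, the higher-pressure side to the lower-pressure side.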
To produce any kind of sound, there must be movement of air. To produce sounds that people can interpret as spoken words, the moving air must pass through the vocal cords, up through the throat, and into the mouth or nose before leaving the body. Different sounds are formed by different positions of the mouth, or, as linguists call it, "the oral cavity" (to distinguish it from the nasal cavity).
Consonants are speech sounds that are articulated with a complete or partial closure of the vocal tract. They are generally produced by the modification of an airstream exhaled from the lungs. The respiratory organs used to create and modify airflow are divided into three regions: the vocal tract (supralaryngeal), the larynx, and the subglottal system. The airstream can be either egressive (out of the vocal tract) or ingressive (into the vocal tract). In pulmonic sounds, the airstream is produced by the lungs in the subglottal system and passes through the larynx and vocal tract. Glottalic sounds use an airstream created by movements of the larynx without airflow from the lungs. Click consonants are articulated through the rarefaction of air using the tongue, followed by releasing the forward closure of the tongue.
Place of articulation
Consonants are pronounced in the vocal tract, usually in the mouth. In order to describe the place of articulation, the active and passive articulator need to be known. In most cases, the active articulators are the lips and tongue. The passive articulator is the surface on which the constriction is created. Constrictions made by the lips are called labials. Constrictions can be made in several parts of the vocal tract, broadly classified into coronal, dorsal and radical places of articulation. Coronal articulations are made with the front of the tongue, dorsal articulations are made with the back of the tongue, and radical articulations are made in the pharynx. These divisions are not sufficient for distinguishing and describing all speech sounds. For example, in English the sounds [s] and [ʃ] are both coronal, but they are produced in different places of the mouth. To account for this, more detailed places of articulation are needed based upon the area of the mouth in which the constriction occurs.
Articulations involving the lips can be made in three different ways: with both lips (bilabial), with one lip and the teeth (labiodental), and with the tongue and the upper lip (linguolabial). Depending on the definition used, some or all of these kinds of articulations may be categorized into the class of labial articulations. Ladefoged and Maddieson (1996) propose that linguolabial articulations be considered coronals rather than labials, but make clear this grouping, like all groupings of articulations, is equivocal and not cleanly divided. Linguolabials are included in this section as labials given their use of the lips as a place of articulation.
Bilabial consonants are made with both lips. In producing these sounds the lower lip moves farthest to meet the upper lip, which also moves down slightly, though in some cases the force from air moving through the aperture (opening between the lips) may cause the lips to separate faster than they can come together. Unlike most other articulations, both articulators are made from soft tissue, and so bilabial stops are more likely to be produced with incomplete closures than articulations involving hard surfaces like the teeth or palate. Bilabial stops are also unusual in that an articulator in the upper section of the vocal tract actively moves downwards, as the upper lip shows some active downward movement.
Labiodental consonants are made by the lower lip rising to the upper teeth. They are most often fricatives, though labiodental nasals are also typologically common. There is debate as to whether true labiodental plosives occur in any natural language, though a number of languages are reported to have them, including Zulu, Tonga, and Shubi. Labiodental affricates are reported in Tsonga, which would require the stop portion of the affricate to be a labiodental stop, though Ladefoged and Maddieson (1996) raise the possibility that labiodental affricates involve a bilabial closure like "pf" in German.
Linguolabial consonants are made with the blade of the tongue approaching or contacting the upper lip. As in bilabial articulations, the upper lip moves slightly towards the more active articulator. Articulations in this group do not have their own symbols in the International Phonetic Alphabet; rather, they are formed by combining an apical symbol with a diacritic, implicitly placing them in the coronal category. They exist in a number of languages indigenous to Vanuatu such as Tangoa, though early descriptions referred to them as apical-labial consonants. The name "linguolabial" was suggested by Floyd Lounsbury given that they are produced with the blade rather than the tip of the tongue.
Coronal consonants are made with the tip or blade of the tongue and, because of the agility of the front of the tongue, represent a variety not only in place but in the posture of the tongue. The coronal places of articulation represent the areas of the mouth where the tongue contacts or makes a constriction, and include dental, alveolar, and post-alveolar locations. Tongue postures using the tip of the tongue can be apical if using the top of the tongue tip, laminal if made with the blade of the tongue, or sub-apical if the tongue tip is curled back and the bottom of the tongue is used. Coronals are unique as a group in that every manner of articulation is attested. Australian languages are well known for the large number of coronal contrasts exhibited within and across languages in the region.
Dental consonants are made with the tip or blade of the tongue and the upper teeth. They are divided into two groups based upon the part of the tongue used to produce them: apical dental consonants are produced with the tongue tip touching the teeth; interdental consonants are produced with the blade of the tongue as the tip of the tongue sticks out in front of the teeth. No language is known to use both contrastively though they may exist allophonically.
Alveolar consonants are made with the tip or blade of the tongue at the alveolar ridge just behind the teeth and can similarly be apical or laminal.
Crosslinguistically, dental consonants and alveolar consonants are frequently contrasted, leading to a number of generalizations about crosslinguistic patterns. The different places of articulation also tend to be contrasted in the part of the tongue used to produce them: most languages with dental stops have laminal dentals, while languages with alveolar stops usually have apical stops. Languages rarely have two consonants in the same place with a contrast in laminality, though Taa (ǃXóõ) is a counterexample to this pattern. If a language has only one of a dental stop or an alveolar stop, it will usually be laminal if it is a dental stop and apical if it is an alveolar stop, though Temne and Bulgarian, for example, do not follow this pattern. If a language has both an apical and a laminal stop, then the laminal stop is more likely to be affricated, as in Isoko, though Dahalo shows the opposite pattern, with alveolar stops being more affricated.
Retroflex consonants have several different definitions depending on whether the position of the tongue or the position on the roof of the mouth is given prominence. In general, they represent a group of articulations in which the tip of the tongue is curled upwards to some degree. In this way, retroflex articulations can occur in several different locations on the roof of the mouth including alveolar, post-alveolar, and palatal regions. If the underside of the tongue tip makes contact with the roof of the mouth, it is sub-apical though apical post-alveolar sounds are also described as retroflex. Typical examples of sub-apical retroflex stops are commonly found in Dravidian languages, and in some languages indigenous to the southwest United States the contrastive difference between dental and alveolar stops is a slight retroflexion of the alveolar stop. Acoustically, retroflexion tends to affect the higher formants.
Articulations taking place just behind the alveolar ridge, known as post-alveolar consonants, have been referred to using a number of different terms. Apical post-alveolar consonants are often called retroflex, while laminal articulations are sometimes called palato-alveolar; in the Australianist literature, these laminal stops are often described as 'palatal' though they are produced further forward than the palate region typically described as palatal. Because of individual anatomical variation, the precise articulation of palato-alveolar stops (and coronals in general) can vary widely within a speech community.
Dorsal consonants are those consonants made using the tongue body rather than the tip or blade.
Palatal consonants are made using the tongue body against the hard palate on the roof of the mouth. They are frequently contrasted with velar or uvular consonants, though it is rare for a language to contrast all three simultaneously, with Jaqaru as a possible example of a three-way contrast.
Velar consonants are made using the tongue body against the velum. They are very common cross-linguistically; almost all languages have a velar stop. Because both velars and vowels are made using the tongue body, velars are highly affected by coarticulation with vowels and can be produced as far forward as the hard palate or as far back as the uvula. These variations are typically divided into front, central, and back velars in parallel with the vowel space. They can be hard to distinguish phonetically from palatal consonants, though they are produced slightly behind the area of prototypical palatal consonants.
Uvular consonants are made by the tongue body contacting or approaching the uvula. They are rare, occurring in an estimated 19 percent of languages, and large regions of the Americas and Africa have no languages with uvular consonants. In languages with uvular consonants, stops are most frequent followed by continuants (including nasals).
Radical consonants either use the root of the tongue or the epiglottis during production.
Pharyngeal consonants are made by retracting the root of the tongue far enough to touch the wall of the pharynx. Due to production difficulties, only fricatives and approximants can be produced this way.
Epiglottal consonants are made with the epiglottis and the back wall of the pharynx. Epiglottal stops have been recorded in Dahalo. Voiced epiglottal consonants are not deemed possible due to the cavity between the glottis and epiglottis being too small to permit voicing.
Glottal consonants are those produced using the vocal folds in the larynx. Because the vocal folds are the source of phonation and below the oro-nasal vocal tract, a number of glottal consonants are impossible such as a voiced glottal stop. Three glottal consonants are possible, a voiceless glottal stop and two glottal fricatives, and all are attested in natural languages.
Glottal stops, produced by closing the vocal folds, are notably common in the world's languages. While many languages use them to demarcate phrase boundaries, some languages like Huautla Mazatec have them as contrastive phonemes. Additionally, glottal stops can be realized as laryngealization of the following vowel in this language. Glottal stops, especially between vowels, usually do not form a complete closure. True glottal stops normally occur only when they are geminated.
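The grouping of detailed places of articulation into broad classes, as laid out in this section, can be written as a small lookup table. The following Python sketch is illustrative only: the names `BROAD_CLASS`, `broad_class`, and `same_broad_class` are hypothetical, linguolabials are placed with the labials as in this section (they may also be treated as coronals), and treating glottal as its own class is an assumption made for the example. Retroflex is left out because it can occur at several places.

```python
# Detailed place of articulation -> broad class, following this section.
BROAD_CLASS = {
    "bilabial": "labial",
    "labiodental": "labial",
    "linguolabial": "labial",  # grouped with labials here; arguably coronal
    "dental": "coronal",
    "alveolar": "coronal",
    "post-alveolar": "coronal",
    "palatal": "dorsal",
    "velar": "dorsal",
    "uvular": "dorsal",
    "pharyngeal": "radical",
    "epiglottal": "radical",
    "glottal": "glottal",      # assumption: kept as a class of its own
}


def broad_class(place):
    """Return the broad class for a detailed place of articulation."""
    try:
        return BROAD_CLASS[place.lower()]
    except KeyError:
        raise ValueError(f"unknown place of articulation: {place!r}") from None


def same_broad_class(place_a, place_b):
    """True if two detailed places fall into the same broad class."""
    return broad_class(place_a) == broad_class(place_b)


# English [s] is alveolar and [ʃ] is post-alveolar: both are coronal,
# so the broad classes alone cannot tell them apart.
s_and_sh_share_class = same_broad_class("alveolar", "post-alveolar")
```

The last line shows why the broad classes are not sufficient on their own: two sounds made at different detailed places can share a class.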
Manner of articulation
Knowing the place of articulation is not enough to fully describe a consonant; the way in which the stricture happens is equally important. Manners of articulation describe how exactly the active articulator modifies, narrows or closes off the vocal tract.
Stops (also referred to as plosives) are consonants where the airstream is completely obstructed. Pressure builds up in the mouth during the stricture, which is then released as a small burst of sound when the articulators move apart. The velum is raised so that air cannot flow through the nasal cavity. If the velum is lowered and allows air to flow through the nose, the result is a nasal stop. However, phoneticians almost always refer to nasal stops as just "nasals". Affricates are a sequence of a stop followed by a fricative in the same place.
Fricatives are consonants where the airstream is made turbulent by partially, but not completely, obstructing part of the vocal tract. Sibilants are a special type of fricative where the turbulent airstream is directed towards the teeth, creating a high-pitched hissing sound.
Laterals are consonants in which the airstream is obstructed along the center of the vocal tract, allowing the airstream to flow freely on one or both sides. Laterals have also been defined as consonants in which the tongue is contracted in such a way that the airstream is greater around the sides than over the center of the tongue. The first definition does not allow for air to flow over the tongue.
Trills are consonants in which the tongue or lips are set in motion by the airstream. The stricture is formed in such a way that the airstream causes a repeating pattern of opening and closing of the soft articulator(s). Apical trills typically consist of two or three periods of vibration.
Taps and flaps are single, rapid, usually apical gestures where the tongue is thrown against the roof of the mouth, comparable to a very rapid stop. These terms are sometimes used interchangeably, but some phoneticians make a distinction. In a tap, the tongue contacts the roof in a single motion whereas in a flap the tongue moves tangentially to the roof of the mouth, striking it in passing.
During a glottalic airstream mechanism, the glottis is closed, trapping a body of air. This allows the remaining air in the vocal tract to be moved separately. An upward movement of the closed glottis will move this air out, resulting in an ejective consonant. Alternatively, the glottis can lower, sucking more air into the mouth, which results in an implosive consonant.
Clicks are stops in which tongue movement causes air to be sucked into the mouth; this is referred to as a velaric airstream. During the click, the air becomes rarefied between two articulatory closures, producing a loud 'click' sound when the anterior closure is released. The release of the anterior closure is referred to as the click influx. The release of the posterior closure, which can be velar or uvular, is the click efflux. Clicks are used in several African language families, such as the Khoisan and Bantu languages.
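The airstream mechanisms discussed in this article can be summarized by pairing an initiator with a direction of airflow. The Python sketch below is a hedged illustration: the enum and function names are hypothetical, and only the three special consonant classes named in the text (ejectives, implosives, clicks) are listed.

```python
from enum import Enum


class Initiator(Enum):
    LUNGS = "pulmonic"
    LARYNX = "glottalic"
    TONGUE_BODY = "velaric"


class Direction(Enum):
    EGRESSIVE = "egressive"    # air moves out of the vocal tract
    INGRESSIVE = "ingressive"  # air moves into the vocal tract


# Consonant classes the text associates with a mechanism and direction.
CONSONANT_CLASS = {
    (Initiator.LARYNX, Direction.EGRESSIVE): "ejective",
    (Initiator.LARYNX, Direction.INGRESSIVE): "implosive",
    (Initiator.TONGUE_BODY, Direction.INGRESSIVE): "click",
}


def describe(initiator, direction):
    """Name an airstream and, where the text gives one, its consonant class."""
    label = f"{initiator.value} {direction.value}"
    special = CONSONANT_CLASS.get((initiator, direction))
    return f"{label} ({special})" if special else label


ejective_label = describe(Initiator.LARYNX, Direction.EGRESSIVE)
ordinary_label = describe(Initiator.LUNGS, Direction.EGRESSIVE)
```

Combinations without an entry in the table are simply labelled by mechanism and direction; the sketch makes no claim about whether they occur in languages.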
Vowels are produced by the passage of air through the larynx and the vocal tract. Most vowels are voiced (i.e. the vocal folds are vibrating). Except in some marginal cases, the vocal tract is open, so that the airstream is able to escape without generating fricative noise.
Variation in vowel quality is produced by means of the following articulatory structures:
The glottis is the opening between the vocal folds located in the larynx. Its position creates different vibration patterns to distinguish voiced and voiceless sounds. In addition, the pitch of the vowel is changed by altering the frequency of vibration of the vocal folds. In some languages there are contrasts among vowels with different phonation types.
The pharynx is the region of the vocal tract below the velum and above the larynx. Vowels may be made pharyngealized (also epiglottalized, sphincteric or strident) by means of a retraction of the tongue root. Vowels may also be articulated with advanced tongue root. There is discussion of whether this vowel feature (ATR) is different from the Tense/Lax distinction in vowels.
The velum, or soft palate, controls airflow through the nasal cavity. Nasals and nasalized sounds are produced by lowering the velum and allowing air to escape through the nose. Vowels are normally produced with the soft palate raised so that no air escapes through the nose. However, vowels may be nasalized as a result of lowering the soft palate. Many languages use nasalization contrastively.
The tongue is a highly flexible organ that is capable of being moved in many different ways. For vowel articulation the principal variations are vowel height and the dimension of backness and frontness. A less common variation in vowel quality can be produced by a change in the shape of the front of the tongue, resulting in a rhotic or rhotacized vowel.
The lips play a major role in vowel articulation. It is generally believed that two major variables are in effect: lip-rounding (or labialization) and lip protrusion.
Boyle's Law relates the pressure and volume of a cavity at two points in time:

$$P_1 V_1 = P_2 V_2$$

$$\frac{V_1}{V_1 + \Delta V} = \frac{P_1 + \Delta P}{P_1}$$

What these equations express is that, given an initial pressure P1 and volume V1 at time 1, the product of these two values will be equal to the product of the pressure P2 and volume V2 at a later time 2. This means that if there is an increase in the volume of a cavity, there will be a corresponding decrease in the pressure of that same cavity, and vice versa. In other words, volume and pressure are inversely proportional (or negatively correlated) to each other. As applied to a description of the subglottal cavity, when the lung pistons contract the lungs, the volume of the subglottal cavity decreases while the subglottal air pressure increases. Conversely, if the lungs are expanded, the pressure decreases.
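The inverse relationship between pressure and volume can be checked numerically. The following Python sketch assumes constant temperature and a sealed cavity; the function name `pressure_after_volume_change` and the figures used (3.0 litres at 101.3 kPa) are hypothetical values chosen for illustration, not measurements.

```python
def pressure_after_volume_change(p1, v1, v2):
    """Boyle's law at constant temperature: p1 * v1 == p2 * v2.

    Returns p2 in the same unit as p1; v1 and v2 share any one unit.
    """
    if v1 <= 0 or v2 <= 0:
        raise ValueError("volumes must be positive")
    return p1 * v1 / v2


# Assumed example: the lungs are compressed from 3.0 to 2.9 litres while
# the glottis is closed, starting from 101.3 kPa.
p2 = pressure_after_volume_change(101.3, 3.0, 2.9)
pressure_rise = p2 - 101.3  # positive: a smaller volume raises the pressure
```

Reversing the two volumes gives a pressure below the starting value, which corresponds to expanding the lungs.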
A situation can be considered where (1) the vocal fold valve is closed, separating the supraglottal cavity from the subglottal cavity, (2) the mouth is open and, therefore, supraglottal air pressure is equal to atmospheric pressure, and (3) the lungs are contracted, resulting in a subglottal pressure that is greater than atmospheric pressure. If the vocal fold valve is subsequently opened, the two previously separate cavities become one unified cavity, although the cavities will still be aerodynamically isolated because the glottic valve between them is relatively small and constrictive. Pascal's Law states that the pressure within a system must be equal throughout the system. When the subglottal pressure is greater than supraglottal pressure, there is a pressure inequality in the unified cavity. Since pressure is by definition a force applied to a surface area, and a force is the product of mass and acceleration according to Newton's Second Law of Motion, the pressure inequality will be resolved by having part of the mass of air molecules found in the subglottal cavity move to the supraglottal cavity. This movement of mass is airflow. The airflow will continue until a pressure equilibrium is reached.

Similarly, in an ejective consonant with a glottalic airstream mechanism, the lips or the tongue (i.e., the buccal or lingual valve) are initially closed and the closed glottis (the laryngeal piston) is raised, decreasing the oral cavity volume behind the valve closure and increasing the pressure compared to the volume and pressure at a resting state. When the closed valve is opened, airflow will result from the cavity behind the initial closure outward until intraoral pressure is equal to atmospheric pressure. That is, air will flow from a cavity of higher pressure to a cavity of lower pressure until the equilibrium point; the pressure as potential energy is thus converted into airflow as kinetic energy.
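The flow of air from a higher-pressure cavity to a lower-pressure one until equilibrium can be sketched as a simple step-by-step simulation. This Python example is a deliberate simplification and rests on several assumptions: both cavities are treated as sealed containers of fixed volume at the same temperature, the amount of air in each is tracked as pressure times volume, and the flow in each step is taken to be proportional to the pressure difference. The function names and the numbers are hypothetical.

```python
def equilibrium_pressure(p_a, v_a, p_b, v_b):
    """Common pressure once two cavities at the same temperature are joined."""
    return (p_a * v_a + p_b * v_b) / (v_a + v_b)


def simulate_equalization(p_a, v_a, p_b, v_b, rate=0.5, tolerance=1e-6):
    """Move air between two cavities in steps until their pressures match.

    `rate` is the share of the remaining pressure difference removed per
    step (0 < rate <= 1). Returns (pressure_a, pressure_b, steps).
    """
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    air_a, air_b = p_a * v_a, p_b * v_b  # proportional to amount of air
    steps = 0
    while abs(air_a / v_a - air_b / v_b) > tolerance:
        difference = air_a / v_a - air_b / v_b
        # Positive transfer moves air from cavity a to cavity b.
        transfer = rate * difference / (1 / v_a + 1 / v_b)
        air_a -= transfer
        air_b += transfer
        steps += 1
    return air_a / v_a, air_b / v_b, steps


# Assumed values: a 3.0-litre subglottal cavity at 102.0 kPa opening onto
# a 0.5-litre supraglottal cavity at 101.0 kPa.
sub_p, supra_p, steps = simulate_equalization(102.0, 3.0, 101.0, 0.5)
target = equilibrium_pressure(102.0, 3.0, 101.0, 0.5)
```

Both pressures converge on the same value, which lies between the two starting pressures. In real speech the open mouth connects the supraglottal cavity to the atmosphere, so this closed two-cavity model illustrates only the principle.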
Sound sources refer to the conversion of aerodynamic energy into acoustic energy. There are two main types of sound sources in the articulatory system: periodic (or more precisely semi-periodic) and aperiodic. A periodic sound source is vocal fold vibration produced at the glottis, found in vowels and voiced consonants. A less common periodic sound source is the vibration of an oral articulator like the tongue, found in alveolar trills. Aperiodic sound sources are the turbulent noise of fricative consonants and the short noise burst of plosive releases produced in the oral cavity.
Voicing is a common periodic sound source in spoken language and is related to how closely the vocal cords are placed together. In English there are only two possibilities, voiced and unvoiced. Voicing is caused by the vocal cords being held close together, so that air passing through them makes them vibrate. All normally spoken vowels are voiced, as are all other sonorants except h, as well as some of the remaining sounds (b, d, g, v, z, zh, j, and the th sound in this). All the rest are voiceless sounds, with the vocal cords held far enough apart that there is no vibration; however, there is still a certain amount of audible friction, as in the sound h. Voiceless sounds are not very prominent unless there is some turbulence, as in the stops, fricatives, and affricates; this is why sonorants in general only occur voiced. The exception is during whispering, when all sounds pronounced are voiceless.
- Non-vocal fold vibration: 20–40 hertz (cycles per second)
- Vocal fold vibration
- Lower limit: 70–80 Hz modal (bass), 30–40 Hz creaky
- Upper limit: 1170 Hz (soprano)
Vocal fold vibration
- cricoid cartilage
- thyroid cartilage
- arytenoid cartilage
- interarytenoid muscles (fold adduction)
- posterior cricoarytenoid muscle (fold abduction)
- lateral cricoarytenoid muscle (fold shortening/stiffening)
- thyroarytenoid muscle (medial compression/fold stiffening, internal to folds)
- cricothyroid muscle (fold lengthening)
- hyoid bone
- sternothyroid muscle (lowers thyroid)
- sternohyoid muscle (lowers hyoid)
- stylohyoid muscle (raises hyoid)
- digastric muscle (raises hyoid)
- Magnetic resonance imaging (MRI) / Real-time MRI
- Medical ultrasonography
- Electromagnetic articulography
In order to understand how sounds are made, experimental procedures are often adopted. Palatography is one of the oldest instrumental phonetic techniques used to record data regarding articulators. In traditional, static palatography, a speaker's palate is coated with a dark powder. The speaker then produces a word, usually with a single consonant. The tongue wipes away some of the powder at the place of articulation. The experimenter can then use a mirror to photograph the entire upper surface of the speaker's mouth. This photograph, in which the place of articulation can be seen as the area where the powder has been removed, is called a palatogram.
Technology has since made possible electropalatography (or EPG). In order to collect EPG data, the speaker is fitted with a special prosthetic palate, which contains a number of electrodes. The way in which the electrodes are "contacted" by the tongue during speech provides phoneticians with important information, such as how much of the palate is contacted in different speech sounds, or which regions of the palate are contacted, or what the duration of the contact is.
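The kinds of measures mentioned above (how much of the palate is contacted, which regions, and for how long) can be computed from a sequence of contact frames. The Python sketch below is only an illustration under stated assumptions: each frame is a grid of 0/1 values with row 0 at the front of the palate, the 4 by 4 grid and the 10 ms frame interval are hypothetical, and the function names are invented for this example rather than taken from any EPG software.

```python
def contact_fraction(frame):
    """Share of electrodes contacted in one frame (rows of 0/1 values)."""
    cells = [cell for row in frame for cell in row]
    return sum(cells) / len(cells)


def contacted_rows(frame):
    """Indices of rows (0 = front of the palate) with at least one contact."""
    return [index for index, row in enumerate(frame) if any(row)]


def contact_duration(frames, frame_interval_ms):
    """Total time in milliseconds covered by frames showing any contact."""
    return sum(frame_interval_ms for frame in frames if contact_fraction(frame) > 0)


# Hypothetical frames: no contact, then a front closure held for two frames.
no_contact = [[0, 0, 0, 0] for _ in range(4)]
front_closure = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
frames = [no_contact, front_closure, front_closure, no_contact]

fraction = contact_fraction(front_closure)   # 6 of 16 electrodes
rows = contacted_rows(front_closure)         # front two rows
duration_ms = contact_duration(frames, 10)   # two frames of 10 ms each
```

Here the closure frame contacts 6 of the 16 electrodes, all in the front two rows, and contact lasts for two frames.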
- Note that although sound is just air pressure variations, the variations must be at a high enough rate to be perceived as sound. If the variation is too slow, it will be inaudible.
- Ladefoged 2001, p. 5.
- Ladefoged & Maddieson 1996, p. 9.
- Ladefoged & Maddieson 1996, p. 16.
- Ladefoged & Maddieson 1996, p. 43.
- Maddieson 1993.
- Fujimura 1961.
- Ladefoged & Maddieson 1996, pp. 16–17.
- Ladefoged & Maddieson 1996, pp. 17–18.
- Ladefoged & Maddieson 1996, p. 17.
- Doke 1926.
- Guthrie 1948, p. 61.
- Baumbach 1987.
- International Phonetic Association 2015.
- Ladefoged & Maddieson 1996, p. 18.
- Ladefoged & Maddieson 1996, pp. 19–31.
- Ladefoged & Maddieson 1996, p. 28.
- Ladefoged & Maddieson 1996, pp. 19–25.
- Ladefoged & Maddieson 1996, pp. 20, 40–1.
- Scatton 1984, p. 60.
- Ladefoged & Maddieson 1996, p. 23.
- Ladefoged & Maddieson 1996, pp. 23–5.
- Ladefoged & Maddieson 1996, pp. 25, 27–8.
- Ladefoged & Maddieson 1996, p. 27.
- Ladefoged & Maddieson 1996, pp. 27–8.
- Ladefoged & Maddieson 1996, p. 32.
- Ladefoged & Maddieson 1996, p. 35.
- Ladefoged & Maddieson 1996, pp. 33–34.
- Keating & Lahiri 1993, p. 89.
- Maddieson 2013.
- Ladefoged et al. 1996, p. 11.
- Lodge 2009, p. 33.
- Ladefoged & Maddieson 1996, p. 37.
- Ladefoged & Maddieson 1996, p. 37.
- Ladefoged & Maddieson 1996, p. 38.
- Ladefoged & Maddieson 1996, p. 74.
- Ladefoged & Maddieson 1996, p. 75.
- Ladefoged & Johnson 2011, p. 14.
- Ladefoged & Johnson 2011, p. 67.
- Ladefoged & Maddieson 1996, p. 145.
- Ladefoged & Johnson 2011, p. 15.
- Ladefoged & Maddieson 1996, p. 102.
- Ladefoged & Maddieson 1996, p. 182.
- Ladefoged & Johnson 2011, p. 175.
- Ladefoged & Maddieson 1996, p. 217.
- Ladefoged & Maddieson 1996, p. 218.
- Ladefoged & Maddieson 1996, pp. 230–231.
- Ladefoged & Johnson 2011, p. 137.
- Ladefoged & Maddieson 1996, p. 78.
- Ladefoged & Maddieson 1996, pp. 246–247.
- Laver, John. Principles of Phonetics, 1994, Cambridge University Press.
- Peter Ladefoged and Ian Maddieson. The Sounds of the World's Languages, 1996, Blackwell; ISBN 0-631-19815-6.
- Stated in a less abbreviatory fashion: pressure1 × volume1 = pressure2 × volume2
- volume1 divided by sum of volume1 and change in volume = sum of pressure1 and the change in pressure divided by pressure1
- Niebergall, A; Zhang, S; Kunay, E; Keydana, G; Job, M; et al. (2010). "Real-time MRI of Speaking at a Resolution of 33 ms: Undersampled Radial FLASH with Nonlinear Inverse Reconstruction". Magn. Reson. Med. 69 (2): 477–485. doi:10.1002/mrm.24276. PMID 22498911. S2CID 21057863.
- Ladefoged, Peter (1993). A Course In Phonetics (3rd ed.). Harcourt Brace College Publishers. p. 60.
- Baumbach, E. J. M (1987). Analytical Tsonga Grammar. Pretoria: University of South Africa.
- Doke, Clement M (1926). The Phonetics of the Zulu Language. Bantu Studies. Johannesburg: Witwatersrand University Press.
- Fujimura, Osamu (1961). "Bilabial stop and nasal consonants: A motion picture study and its acoustical implications". Journal of Speech and Hearing Research. 4 (3): 233–47. doi:10.1044/jshr.0403.233. PMID 13702471.
- Guthrie, Malcolm (1948). The classification of the Bantu languages. London: Oxford University Press.
- International Phonetic Association (1999). Handbook of the International Phonetic Association. Cambridge University Press.
- International Phonetic Association (2015). International Phonetic Alphabet. International Phonetic Association.
- Keating, Patricia; Lahiri, Aditi (1993). "Fronted Velars, Palatalized Velars, and Palatals". Phonetica. 50 (2): 73–101. doi:10.1159/000261928. PMID 8316582.
- Ladefoged, Peter (1960). "The Value of Phonetic Statements". Language. 36 (3): 387–96. doi:10.2307/410966. JSTOR 410966.
- Ladefoged, Peter (2001). A Course in Phonetics (4th ed.). Boston: Thomson/Wadsworth. ISBN 978-1-413-00688-9.
- Ladefoged, Peter (2005). A Course in Phonetics (5th ed.). Boston: Thomson/Wadsworth. ISBN 978-1-413-00688-9.
- Ladefoged, Peter; Johnson, Keith (2011). A Course in Phonetics (6th ed.). Wadsworth. ISBN 978-1-42823126-9.
- Ladefoged, Peter; Maddieson, Ian (1996). The Sounds of the World's Languages. Oxford: Blackwell. ISBN 978-0-631-19815-4.
- Lodge, Ken (2009). A Critical Introduction to Phonetics. New York: Continuum International Publishing Group. ISBN 978-0-8264-8873-2.
- Maddieson, Ian (1993). "Investigating Ewe articulations with electromagnetic articulography". Forschungsberichte des Instituts für Phonetik und Sprachliche Kommunikation der Universität München. 31: 181–214.
- Maddieson, Ian (2013). "Uvular Consonants". In Dryer, Matthew S.; Haspelmath, Martin (eds.). The World Atlas of Language Structures Online. Leipzig: Max Planck Institute for Evolutionary Anthropology.
- Scatton, Ernest (1984). A reference grammar of modern Bulgarian. Slavica. ISBN 978-0893571238.