Phoneme
From Wikipedia, the free encyclopedia
In human language, a phoneme is the theoretical representation of a sound. It is a sound of a language as represented (or imagined) without reference to its position in a word or phrase. A phoneme, therefore, is the conception of a sound in the most neutral form possible and distinguishes between different words or morphemes — changing an element of a word from one phoneme to another produces either a different word or obvious nonsense.
Phonemes are not the physical segments themselves, but mental abstractions of them. A phoneme could be thought of as a family of related phones, called allophones, that the speakers of a language think of, and hear or see, as being categorically the same and differing only in the phonetic environment in which they occur.
In sign languages, the basic movements were formerly called cheremes (or cheiremes), but usage changed to phoneme when it was recognized that the mental abstractions involved are essentially the same as in oral languages.
A phonemically "perfect" alphabet is one that has a single symbol for each phoneme. See Phonemic orthography.
Although the concept has been fundamental to the development of phonological analysis of language beneath the level of the syllable, some linguists reject the theoretical validity of the phoneme. Some think that phonemes are more a product of literacy (i.e., the need to categorize the phonetics of a language in order to write it down systematically with a minimum number of letters). Other critics charge that the mind processes sub-phonemic elements of speech (e.g., features) in meaningful ways.
A common test to determine whether two phones are allophones or separate phonemes relies on finding so-called minimal pairs: words that differ only in the phones in question.
Background and related ideas
In ancient India, the Sanskrit grammarian Pāṇini (c. 520–460 BC), in his text of Sanskrit grammar, the Shiva Sutras, originated the concepts of the phoneme, the morpheme and the root. The Shiva Sutras describe a phonemic notational system in the fourteen initial lines of the Aṣṭādhyāyī. The notational system introduces different clusters of phonemes that serve special roles in the morphology of Sanskrit, and are referred to throughout the text. Pāṇini's grammar of Sanskrit may have had a significant influence on Ferdinand de Saussure, the father of modern structuralism, who was a professor of Sanskrit.
Around the 4th century BC–3rd century BC, the definitions of phoneme (oliyam) and alphabet (ezuththu) were written in the Tolkāppiyam, for the Tamil language. These definitions still survive as part of Tamil grammar.
The term phonème was reportedly first used by Dufriche-Desgenettes in 1873, but it referred only to a speech sound. The term phoneme as an abstraction was developed by the Polish linguist Jan Niecisław Baudouin de Courtenay and his student Mikołaj Kruszewski during 1875–1895. The term used by these two was fonema, the basic unit of what they called psychophonetics. The concept of the phoneme was elaborated in the works of Nikolai Trubetzkoy and others of the Prague School (during the years 1926–1935), as well as in those of structuralists like Ferdinand de Saussure, Edward Sapir, and Leonard Bloomfield. Later, it was also used in generative linguistics, most famously by Noam Chomsky and Morris Halle, and it remains central to accounts of the development of virtually all modern schools of phonology.
The phoneme can be defined as "the smallest meaningful psychological unit of sound." The phoneme has mental, physiological, and physical substance: our brains process the sounds; the sounds are produced by the human speech organs; and the sounds are physical entities that can be recorded and measured.
For an example of phonemes, consider the English words pat and sat, which differ only in their initial consonants. This difference, known as contrastiveness or opposition, is sufficient to distinguish these words, and therefore the sounds /p/ and /s/ are said to be different phonemes in English. A pair of words that are identical except for one such sound is known as a minimal pair; this is the most frequent demonstration that two sounds are separate phonemes.
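The minimal-pair test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy lexicon and its broad transcriptions are hypothetical, each word is stored as a tuple of phoneme symbols, and the helper names `is_minimal_pair` and `find_minimal_pairs` are invented for this example.

```python
from itertools import combinations


def is_minimal_pair(a, b):
    """True if two phoneme sequences have the same length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


def find_minimal_pairs(lexicon):
    """Return (word1, word2, (phoneme1, phoneme2)) for every minimal pair in the lexicon.

    `lexicon` maps a spelling to a tuple of phoneme symbols.
    """
    pairs = []
    for (w1, t1), (w2, t2) in combinations(sorted(lexicon.items()), 2):
        if is_minimal_pair(t1, t2):
            # The single position where the two transcriptions disagree.
            contrast = next((x, y) for x, y in zip(t1, t2) if x != y)
            pairs.append((w1, w2, contrast))
    return pairs


# Hypothetical toy lexicon with broad (phonemic) transcriptions.
LEXICON = {
    "pat": ("p", "æ", "t"),
    "sat": ("s", "æ", "t"),
    "sum": ("s", "ʌ", "m"),
    "sun": ("s", "ʌ", "n"),
    "spin": ("s", "p", "ɪ", "n"),
}

for w1, w2, (p1, p2) in find_minimal_pairs(LEXICON):
    print(f"{w1} ~ {w2}: /{p1}/ contrasts with /{p2}/")
```

On this toy data the sketch reports pat ~ sat and sum ~ sun. A real analysis would also have to decide how to segment each word into phonemes, which this example takes as given.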
If no minimal pair can be found to demonstrate that two sounds are distinct, it may be that they are allophones. Allophones are variant phones (i.e., sounds) that are not recognized as distinct by a speaker, and are not meaningfully different in the language, and so are perceived as "the same". This is especially likely if they consistently occur in different environments. For example, the "dark" L sound at the end of the English word "wool" is quite different from the "light" L sound at the beginning of the word "leaf", but this difference is meaningless in English, and is determined by whether the sound is at the beginning or end of a word. A native English speaker might have a hard time hearing the difference at first, but in Turkish the difference between "light" and "dark" L is sufficient to distinguish words. That is, they are two separate phonemes in Turkish, but allophones of a single phoneme in English.
The phonemic relationship of two sounds may not be obvious to a non-native speaker, which is why minimal pairs and an understanding of phonetic environments are important. For example, in Korean, there is a phoneme /r/ that is a flapped r between vowels, and is an l-sound in other phonetic contexts. These sounds are very different to an English speaker, who is attuned to hearing them because the differences are meaningful in English. However, the native Korean speaker has learned from an early age to consider the two sounds the same. Thus, Korean speakers do not differentiate the two words "ram" and "lamb", despite the fact that both R and L sounds occur in the language.
Across languages, the same IPA symbol may be used to represent a phoneme even though its actual pronunciation is merely similar rather than identical. For example, the Finnish word maat ("countries") sounds different from the British English (Received Pronunciation) word mart even though both are phonemically transcribed as IPA /mɑːt/[1]. Such distinctions can be made in a phonetic transcription.
The exact number of phonemes in English depends on the speaker and the method of determining phoneme vs. allophone, but estimates typically range from 40 to 45, which is above average across all languages. Pirahã has only 10, while !Xóõ has 141.
Depending on the language and the alphabet used, a phoneme may be written consistently with one letter; however there are many exceptions to this rule — see Writing systems below.
Some languages make use of pitch for phonemic distinction. In this case, the tones used are called tonemes. Some languages distinguish words made up of the same phonemes (and tonemes) by using different durations of some elements, which are called chronemes. However, not all scholars working on languages with distinctive duration use this term.
Usually, long vowels and consonants are represented either by a length indicator or doubling of the symbol in question.
In sign languages, phonemes may be classified as Tab (elements of location, from Latin tabula), Dez (the hand shape, from designator), Sig (the motion, from signation), and with some researchers, Ori (orientation). Facial expressions and mouthing are also phonemic.
Notation
A transcription that indicates only the different phonemes of a language is said to be phonemic. Such transcriptions are enclosed within virgules (slashes), / /; these show that each enclosed symbol is claimed to be phonemically meaningful. On the other hand, a transcription that indicates finer detail, including allophonic variation like the two English L's, is said to be phonetic, and is enclosed in square brackets, [ ].
The common notation used in linguistics employs virgules (slashes) (/ /) around the symbol that stands for the phoneme. For example, the phoneme for the initial consonant sound in the word "phoneme" would be written as /f/. In other words, the graphemes are <ph>, but this digraph represents one sound /f/. Allophones, more phonetically specific descriptions of how a given phoneme might be commonly instantiated, are often denoted in linguistics by the use of diacritical or other marks added to the phoneme symbols and then placed in square brackets ([ ]) to differentiate them from the phoneme in slant brackets (/ /). The conventions of orthography are then kept separate from both phonemes and allophones by the use of angle brackets < > to enclose the spelling.
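As a small illustration of the three bracketing conventions just described, the following Python sketch labels a transcription by its delimiters. The function name `classify_transcription` and the returned labels are assumptions made for this example, not part of any standard tool.

```python
# Delimiter conventions described in the text: / / phonemic, [ ] phonetic, < > orthographic.
DELIMITERS = {
    ("/", "/"): "phonemic",
    ("[", "]"): "phonetic",
    ("<", ">"): "orthographic",
}


def classify_transcription(text):
    """Return the level of representation implied by the enclosing delimiters."""
    text = text.strip()
    if len(text) > 2:
        # Look up the first and last characters as an (open, close) pair.
        return DELIMITERS.get((text[0], text[-1]), "unknown")
    return "unknown"


for item in ["/f/", "[pʰɪn]", "<ph>", "f"]:
    print(item, "->", classify_transcription(item))
```

The sketch only inspects the outermost characters; it does not check that the enclosed symbols are valid IPA.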
The symbols of the International Phonetic Alphabet (IPA) and extended sets adapted to a particular language are often used by linguists to write phonemes of oral languages, with the principle being one symbol equals one categorical sound. Due to problems displaying some symbols in the early days of the Internet, systems such as X-SAMPA and Kirshenbaum were developed to represent IPA symbols in plain text. As of 2004, any modern web browser can display IPA symbols (as long as the operating system provides the appropriate fonts), and we use this system in this article.
The only published set of phonemic symbols for a sign language is the Stokoe notation developed for American Sign Language, which has since been applied to British Sign Language by Kyle and Woll, and to Australian Aboriginal sign languages by Adam Kendon. However, there are several phonetic systems, such as SignWriting.
Examples
Examples of phonemes in the English language would include sounds from the set of English consonants, like /p/ and /b/. These two are most often written consistently with one letter for each sound. However, phonemes might not be so apparent in written English, such as when they are typically represented with combined letters, called digraphs, like <sh> (pronounced /ʃ/) or <ch> (pronounced /tʃ/).
To see a list of the phonemes in the English language, see IPA for English.
Two sounds that are allophones (sound variants belonging to the same phoneme) in one language may belong to separate phonemes in another language or dialect. In English, for example, /p/ has aspirated and non-aspirated allophones: aspirated [pʰ] as in pin, and non-aspirated [p] as in spin. However, in many languages (e.g. Chinese), aspirated /pʰ/ is a phoneme distinct from unaspirated /p/. As another example, Japanese makes no distinction between [r] and [l]: it has only one /r/ phoneme, although that phoneme has allophones that sound more like an /l/, /d/, or /r/ to English speakers. The sounds /z/ and /s/ are distinct phonemes in English, but allophones in Spanish. /n/ (as in run) and /ŋ/ (as in rung) are phonemes in English, but allophones in Italian and Spanish.
An important phonemic element is the chroneme, a phonemically relevant extension of the duration of a consonant or vowel. Some languages or dialects, such as Finnish or Japanese, allow chronemes after both consonants and vowels. Others, like Italian or Australian English, use them after only one (in the case of Italian, consonants; in the case of Australian English, vowels).
Arguments against the phoneme
Rather than a basic mental unit of language, some think that the phoneme may well be a perceptual artifact of alphabetic literacy (see the terms Phonemic awareness and Phonological awareness). If not that, it may be an epiphenomenal aspect to listening removed from face-to-face encounters, that is, text-like listening (qv phone and feature). It could be said that the unit of the phoneme is a necessary construct if we wish to set a dynamic, complex spoken language into static, written form expressed at a sub-syllabic level, though the model is a simplification and nowhere near phonologically or phonetically complete. The phoneme has the theoretical weakness from the perspective of phonology in that it uses, in part, lexical criteria to determine something that is supposed to be phonological (i.e., minimal pairs of words to point out phonological categories).
Much of phonology, while accepting the phoneme as a possible model or unit of language for description, has largely moved past the segmental phoneme as a basic unit of speech, of speech processing or of language acquisition. This is because the concept of the 'feature' is viewed as beneath the level of the phoneme while also spanning across segments. Meanwhile, attempts at capturing a phonological picture of the psychological control and structure underlying real speech flounder on the inadequacies of the phoneme for such purposes (that is, the phoneme cannot account for co-articulation or assimilation of controlled speech, among other phenomena). Such an endeavor belongs more to the field of articulatory phonology, whose rival unit of phonology is the 'articulatory gesture'. However, the term 'phoneme', though variably defined and delimited, remains a widely and uncritically accepted concept in second and foreign language teaching and in the psychology of native literacy (especially for acquisitional literacy in alphabetic languages, such as English).
Restricted phonemes
A restricted phoneme is a phoneme that can only occur in a certain environment: There are restrictions as to where it can occur. English has several restricted phonemes:
- /ŋ/, as in sing, occurs only at the end of a syllable, never at the beginning (in many other languages, such as Swahili, /ŋ/ can appear word-initially).
- /h/ occurs only before vowels and at the beginning of a syllable, never at the end (a few languages, such as Arabic, allow /h/ syllable-finally).
- In many American dialects with the cot-caught merger, /ɔ/ occurs only before /r/, /l/, and in the diphthong /ɔɪ/.
- In non-rhotic dialects, /r/ can only occur before a vowel, never at the end of a word or before a consonant.
- Under most interpretations, /w/ and /j/ occur only before a vowel, never at the end of a syllable. However, many phonologists interpret a word like boy as either /bɔɪ/ or /bɔj/.
Neutralization, archiphoneme, underspecification
Phonemes that are contrastive in certain environments may not be contrastive in all environments. In the environments where they don't contrast, the contrast is said to be neutralized.
In English there are three nasal phonemes, /m, n, ŋ/, as shown by the minimal triplet,
/sʌm/ | sum
/sʌn/ | sun
/sʌŋ/ | sung
However, with rare exceptions, these sounds are not contrastive before plosives such as /p, t, k/ within the same morpheme. Although all three phones appear before plosives, for example in limp, lint, link, only one of these may appear before each of the plosives. That is, the /m, n, ŋ/ distinction is neutralized before each of the plosives /p, t, k/:
- Only /m/ occurs before /p/,
- only /n/ before /t/, and
- only /ŋ/ before /k/.
Thus these phonemes are not contrastive in these environments, and according to some theorists, there is no evidence as to what the underlying representation might be. If we hypothesize that we are dealing with only a single underlying nasal, there is no reason to pick one of the three phonemes /m, n, ŋ/ over the other two.
(In some languages there is only one phonemic nasal anywhere, and due to obligatory assimilation, it surfaces as [m, n, ŋ] in just these environments, so this idea is not as far-fetched as it might seem at first glance.)
In certain schools of phonology, such a neutralized distinction is known as an archiphoneme (Nikolai Trubetzkoy of the Prague school is often associated with this analysis). Archiphonemes are often notated with a capital letter. Following this convention, the neutralization of /m, n, ŋ/ before /p, t, k/ could be notated as |N|, and limp, lint, link would be represented as |lɪNp, lɪNt, lɪNk|. (The |pipes| indicate underlying representation.) Other ways this archiphoneme could be notated are |m-n-ŋ|, {m, n, ŋ}, or |n*|.
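The archiphoneme analysis of limp, lint, link can be illustrated with a short Python sketch that spells out |N| according to the following plosive. It is a toy model under stated assumptions: underlying forms are plain strings with one character per segment, only the three environments named above are covered, and the function name `realize_nasal` is invented for this example.

```python
# Which nasal surfaces before each plosive, as listed in the text.
NASAL_BEFORE = {"p": "m", "t": "n", "k": "ŋ"}


def realize_nasal(underlying):
    """Replace each archiphoneme N with the nasal that matches the following plosive."""
    surface = []
    for i, segment in enumerate(underlying):
        if segment == "N":
            following = underlying[i + 1] if i + 1 < len(underlying) else None
            if following not in NASAL_BEFORE:
                # This sketch only models N before /p, t, k/.
                raise ValueError("N must be followed by p, t or k in this sketch")
            surface.append(NASAL_BEFORE[following])
        else:
            surface.append(segment)
    return "".join(surface)


for form in ["lɪNp", "lɪNt", "lɪNk"]:
    print(f"|{form}| -> [{realize_nasal(form)}]")
```

The point of the sketch is that the underlying form never has to commit to /m/, /n/ or /ŋ/; the choice is made entirely by the environment.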
Another example from American English is the neutralization of the plosives /t, d/ following a stressed syllable. Phonetically, both are realized in this position as [ɾ], a voiced alveolar flap. This can be heard by comparing writer with rider (for the sake of simplicity, Canadian raising is not taken into account).
[ɻaɪˀt] | write
[ɻaɪd] | ride

with the suffix -er:

['ɻaɪɾɚ] | writer
['ɻaɪɾɚ] | rider
Thus, one cannot say whether the underlying representation of the intervocalic consonant in either word is /t/ or /d/ without looking at the unsuffixed form. This neutralization can be represented as an archiphoneme |D|, in which case the underlying representation of writer or rider would be |'ɻaɪDɚ|.
Another way to talk about archiphonemes involves the concept of underspecification. Phonemes can be considered fully specified segments while archiphonemes are underspecified segments. In Tuvan, phonemic vowels are specified with the features of tongue height, backness, and lip rounding. The archiphoneme |U| is an underspecified high vowel where only the tongue height is specified.
phoneme/archiphoneme | height | backness | roundedness
---|---|---|---
/i/ | high | front | unrounded
/ɯ/ | high | back | unrounded
/u/ | high | back | rounded
|U| | high | - | -
Whether |U| is pronounced as front or back and whether rounded or unrounded depends on vowel harmony. If |U| occurs following a front unrounded vowel, it will be pronounced as the phoneme /i/; if following a back unrounded vowel, it will be an /ɯ/; and if following a back rounded vowel, it will be an /u/. This can be seen in the following words:
-|Um| 'my' (the vowel of this suffix is underspecified)
|idikUm| → [idikim] 'my boot' (/i/ is front & unrounded)
|xarUm| → [xarɯm] 'my snow' (/a/ is back & unrounded)
|nomUm| → [nomum] 'my book' (/o/ is back & rounded)
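The harmony rule for |U| can be written out as a short Python sketch. It covers only the three cases stated above (front unrounded, back unrounded, back rounded) and only the vowels that appear in the table and examples; the feature dictionary and the function name `realize_high_vowel` are assumptions made for this illustration, not a full account of Tuvan.

```python
# Backness and rounding of the vowels mentioned in the table and examples.
VOWEL_FEATURES = {
    "i": ("front", "unrounded"),
    "ɯ": ("back", "unrounded"),
    "u": ("back", "rounded"),
    "a": ("back", "unrounded"),
    "o": ("back", "rounded"),
}

# The fully specified high vowel for each (backness, rounding) combination covered here.
HIGH_VOWEL = {
    ("front", "unrounded"): "i",
    ("back", "unrounded"): "ɯ",
    ("back", "rounded"): "u",
}


def realize_high_vowel(underlying):
    """Fill in each underspecified U from the features of the preceding vowel."""
    surface = []
    last_features = None  # features of the most recent vowel seen
    for segment in underlying:
        if segment == "U":
            if last_features is None:
                raise ValueError("U needs a preceding vowel to harmonize with")
            segment = HIGH_VOWEL[last_features]
        if segment in VOWEL_FEATURES:
            last_features = VOWEL_FEATURES[segment]
        surface.append(segment)
    return "".join(surface)


for form in ["idikUm", "xarUm", "nomUm"]:
    print(f"|{form}| -> [{realize_high_vowel(form)}]")
```

Because |U| carries only the height feature, the lookup needs nothing from the suffix itself; backness and rounding come entirely from the stem.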
Not all phonologists accept the concept of archiphonemes. Many doubt that it reflects how people process language or control speech, and some argue that archiphonemes add unnecessary complexity.
Non-phonemes
Prothesis, epenthesis and paragoge, due to phonotactics, add sounds into words without adding meaning. Nevertheless, the sound is added, and thus the phoneme status may be ambiguous. For example, in Spanish a prothetic e- must be added before initial /s/ + consonant clusters, e.g. estrés.
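The Spanish prothesis rule can be sketched as a one-step repair. This is a rough illustration that works on spellings as a stand-in for phonemic forms; the vowel set, the function name `add_prothetic_e`, and the input form "strés" (a hypothetical pre-repair form of estrés) are assumptions made for this example.

```python
# Spanish vowel letters, used here as a stand-in for vowel phonemes.
VOWELS = set("aeiouáéíóú")


def add_prothetic_e(form):
    """Prefix e- when the form begins with s followed by a consonant."""
    if len(form) >= 2 and form[0] == "s" and form[1] not in VOWELS:
        return "e" + form
    return form


print(add_prothetic_e("strés"))  # s + consonant, so e- is added
print(add_prothetic_e("sol"))    # s + vowel, so the form is unchanged
```

The added vowel carries no meaning of its own, which is why its phonemic status can be argued either way.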
Phonological extremes
Of all the sounds that a human vocal tract can create, different languages vary considerably in the number of these sounds that are considered to be distinctive phonemes in the speech of that language. Ubykh and Arrernte have only two phonemic vowels, while at the other extreme, the Bantu language Ngwe has fourteen vowel qualities, twelve of which may occur long or short, for twenty-six oral vowels, plus six nasalized vowels, long and short, for thirty-eight vowels; !Xóõ achieves thirty-one pure vowels (not counting vowel length, which it also has) by varying the phonation. Rotokas has only six consonants, while !Xóõ has somewhere in the neighborhood of seventy-seven, and Ubykh eighty-one. French has no phonemic tone or stress, while several of the Kam-Sui languages have nine tones, and one of the Kru languages, Wobe, has been claimed to have fourteen, though this is disputed. The total phonemic inventory in languages varies from as few as eleven in Rotokas to as many as 112 in !Xóõ (including four tones). These may range from familiar sounds like [t], [s], or [m] to very unusual ones produced in extraordinary ways (see: Click consonant, phonation, airstream mechanism). The English language itself uses a rather large set of thirteen to twenty-two vowels, including diphthongs, though its twenty-two to twenty-six consonants are close to average. (There are twenty-one consonant and five vowel letters in the English alphabet, but this does not correspond to the number of consonant and vowel sounds.)
The most common vowel system consists of the five vowels /i/, /e/, /a/, /o/, /u/. The most common consonants are /p/, /t/, /k/, /m/, /n/. Very few languages lack one of these: Arabic lacks /p/, standard Hawaiian lacks /t/, Mohawk lacks /p/ and /m/, Hupa lacks both /p/ and a simple /k/, colloquial Samoan lacks /t/ and /n/, while Rotokas and Quileute lack /m/ and /n/. While most of these languages have very small inventories, Quileute and Hupa have quite complex consonant systems.
Writing systems
At least in theory, in a phonemic writing system, a given symbol represents a single phoneme, and each phoneme is represented by a single symbol. This may differ from a phonetic orthography, which only requires that the spelling be unambiguously determined by the pronunciation, and the pronunciation unambiguously indicated by the spelling. Phonemic representation of a language is often described as 'broad transcription', while a phonetic rendering is called 'narrow'. A phonetic system would have more symbols or spelling conventions, since it might, in part, attempt to capture some key sound variations (allophones) of a phoneme. Learners of a foreign or second language can benefit from a more phonetic writing system if it reveals subtleties in pronunciation that are phonemically glossed over by literate native or fluent speakers of that language (since the latter's purpose is fluent reading).
English spelling (whether British, American or Australian) is often cited as the classic example of a nonphonemic, and indeed unphonetic, spelling system. In French, rules to predict pronunciation from spelling are quite simple and have few exceptions, as long as there are some clues such as context or part of speech, but guessing spelling from pronunciation is quite difficult, especially because of the many silent letters. Both written English and French (being lexical cousins, if quite different phonologically speaking) tend to preserve word root (over sound) relationships. Italian, Spanish and especially Finnish have a very close letter-to-phoneme correspondence. Karelian has a perfectly phonemic spelling system, although it has no standard language and writers tend to write in their own dialect.
The relationship that Arabic has with its writing system is the opposite of that of French: a given pronunciation has an obvious spelling, while the reverse is not generally true, owing to the tendency to omit the diacritics that indicate short vowels (harakat) except in special cases such as religious texts and children's literature, or when a word's meaning is ambiguous and cannot be discerned from the context.
Other languages fall somewhere in between polar distinctions such as "lexical vs. phonemic and/or phonetic" and "phonemic vs. phonetic". Although English is often given as an example of an unphonetic orthography, its system is nowhere near as logographic (lexical or word-based) as Chinese writing is. English spelling conveys etymological, derivational and inflectional information, but also vast amounts of phonetic information. In a nutshell, written English displays a great deal of complexity for representing vowel sounds within a fairly stable and consistent consonant framework (though there is a shortage of letters all around, with 26 letters and phoneme counts well over 40). Spanish is often given as an example of a phonetic orthography, but it has numerous imperfections, including silent letters. It is, at least, possible to tell the correct pronunciation of any written Spanish word. Another phonemic orthography is that of Serbian. Its phonemicity was established by the Serbian "Webster", Vuk Stefanović Karadžić. He followed a strict phonemic principle, which is best told in his own words: "Write as you speak and read as it is written." Hindi, a descendant of Sanskrit, is an example of a phonetically written language represented with a non-Roman alphabet that is partly syllabic in nature. Hindi's writing system, however, probably ultimately descends from the same ancient Middle Eastern sources that gave the world the Roman, Cyrillic and Arabic scripts.
Real world distinctions between phonemic and nonphonemic orthographies are exaggerated. All languages are written with conventions that represent both meaning and pronunciation. This is true at both ends of the scale: Chinese characters are first and foremost symbols for morphemes and words, but they may have some phonetic elements to their composition as well (and these work, sometimes at least, the way spelling analogies do in written English). At the other extreme, there are a few orthographies which are complete and consistent phonemic representations of an artificial national standard. The phonemic principles by which orthographies might be standardized might also exclude representation of variations in pronunciation within the spoken dialects of a national language.
The Korean Hangul alphabet was used largely phonemically after its invention in the 15th century, but shifted to an almost perfectly morphophonemic writing system in more recent times. It is often noted as paying attention to phonetic detail and capturing the language analytically at a fine-tuned featural level, although modern orthography does not indicate vowel quantity (or pitch accent in some dialects), which until the 20th century was phonemically distinctive in the standard language. On the other hand, the orthography retains distinctions that have all but disappeared from many speakers' pronunciation – orthographies prior to 1933 did this even more than today. They, for example, distinguished between dy, jy and j before vowels on the basis of a word's etymology, even though all three were pronounced alike.
See also
- Minimal pair
- Phone
- Phonology
- Phonemic differentiation
- Emic and etic
- Tone (linguistics)
- Morphophonology
- List of phonetics topics
- Initial-stress-derived noun
- Viseme
- Free variation
External links
- What is a phoneme? (SIL)
- What is an allophone? (SIL)
- What is a phone? (SIL)
- What is a phonetically similar segment? (SIL)
- What is a minimal pair? (SIL)
- What is complementary distribution? (SIL)
- What is an environment? (SIL)
- What is an contrast in identical environments? (SIL)
- What is an contrast in analogous environments? (SIL)
- Comparison of morpheme-morph-allomorph & phoneme-phone-allophone? (SIL)
- What is phonology? (SIL)
- Phoneme (Lexicon of Linguistics)
- Allophony (Lexicon of Linguistics)
- Transcription (Lexicon of Linguistics)
- Grapheme-Phoneme Conversion (Lexicon of Linguistics)
- Phoneme Restoration (Lexicon of Linguistics)
- phonemic awareness