Complexity from Compression: A Sketch of Pre-Tangut

Marc Miyake

Tangut, a mediaeval Qiangic language (Sino-Tibetan family) has a distinction of three grades (děng 等). The traditional Sofronov-Gong reconstruction of this distinction supposes different degrees of medial yod: Grade I {-Ø-}, Grade II {-i-}, Grade III {-j-}. The yods, however, are not supported by the transcriptional evidence. Based on cognates between Tangut and Rgyalrongic languages, this study proposes the uvularization hypothesis: Tangut syllables have contrastive uvularization. Grade I/II syllables are uvularized, while Grade III syllables are plain. For phonological velars, uvularized syllables trigger a uvular allophone, while plain syllables trigger a velar allophone. Tangut uvularization is an instance of a common typological feature in Qiangic languages, that of guttural secondary vocalic articulations (GSVA), variously termed uvularization, velarization, tenseness or Retracted Tongue Root (RTR). Recognizing Tangut grades as a case of Qiangic GSVA has far-ranging potential consequences for Sino-Tibetan comparative linguistics.

Gyalrongic languages, a subgroup of the Burmo-Qiangic branch of the Sino-Tibetan family, are spoken in the Western Sichuan Province of China. They are polysynthetic languages, and present rich verbal morphology. Although they are not closely related to Chinese, they are of particular interest for Sino-Tibetan/Trans-Himalayan comparative linguistics with regards to their conservative phonology and morphology. Based on previous studies on Old Chinese phonology, combining with recent fieldwork data, this paper aims to show how Gyalrong languages could shed light on Old Chinese morphology and thus contribute to the Old Chinese reconstruction. It also proposes a list of possible cognates between Old Chinese, Gyalrong languages, indicating also Tibetan cognates when available.

There are more native speakers of Sino-Tibetan languages than of any other language family in the world. Our records of these languages are among the oldest for any human language, and the amount of active research on them, both diachronic and synchronic, has multiplied in the last decades. This volume covers the better-described languages, but with comments on the subgroups in which they occur. In addition to a number of modern languages, there are descriptions of several ancient languages.

Miyake Marc Hideo

Complexity from compression: a sketch of pre-Tangut

0. Introduction

Nearly half a century ago, E. I. Kychanov and M. V. Sofronov co-authored Issledovanija po fonetike tangutskogo jazyka (1963), the first monograph with a systematic reconstruction of Tangut phonology. Several other reconstructions have appeared since then. All have distinct values for most, if not all, of the 105 rhymes of the 文海寶韻 Precious Rhymes of the Sea of Characters, a monolingual Tangut dictionary. These reconstructed values generally contain few final consonants and no final obstruents.

G. Clauson was skeptical about such a rhyme system: “Sofronov’s (1963) list contains sixty-five open vowels <...> It does seem impossible that a Tangut phonetician,1 however acute his hearing, could have distinguished sixty-five different open vowel sounds, even if some of these were in fact diphthongs”.2 His objections could also apply to later reconstructions. Nonetheless, there is no Chinese, Tibetan, or Sanskrit transcription evidence for a more elaborate set of final consonants in Tangut, so it is safest to continue reconstructing a large number of final vowels.

How did such a large set of vocalic distinctions come into being? In this paper, I present a scenario in which pre-Tangut, a language with a relatively simple phonology, developed into Tangut, a language with a much more complicated phonology, through a process that I call ‘compression’. Due to space limitations, I cannot offer full arguments for my speculations, though I will mention parallels in other languages for various features and sound changes.

1. Pre-Tangut

Pre-Tangut is the unattested, hypothetical ancestor of Tangut reconstructed on the basis of (1) phonological alternations in Tangut and (2) comparison with related languages. It is an intermediate stage between Proto-Tibeto-Burman and Tangut.

2. Word structure of pre-Tangut

Many if not most words of pre-Tangut were sesquisyllables consisting of an unstressed presyllable followed by a stressed syllable.

*presyllable (C)(V) + syllable (C)(G)(V́)(C)(H)3

This iambic structure is similar to the structure of Old Chinese as reconstructed by Sagart (1999). It is found today in the minor-major syllable sequences of Burmese and the unrelated Mon-Khmer languages. Perhaps it can be projected back to the ancestors of pre-Tangut: Proto-Tibeto-Burman or even as far back as Proto-Sino-Tibetan.

3. Pre-Tangut presyllables

L. Sagart proposed that Old Chinese had two kinds of prefixes: fused prefixes that combined with root initials and iambic prefixes that were lost.4 I reconstruct a similar distinction in pre-Tangut between three kinds of presyllables:

1. Fused preinitials or presyllables that conditioned medial -w-, tense vowels, aspiration, and retroflexion (see 3.1)
2. Iambic presyllables that were lost before intervocalic lenition (see 3.2.1)
3. Iambic presyllables that were lost after intervocalic lenition (see 3.2.1)

The unstressed vowels of all three types of presyllables may have conditioned the warping of the vowel of the stressed syllable before fusion or presyllabic loss (see 3.2.2).

1 And, I would add, any Tangut native speaker.
2 Clauson 1964, p. 66.
3 I write asterisks before my pre-Tangut reconstructions. However, I do not write asterisks before my Tangut reconstructions because (1) all non-Tangut script representations of Tangut are reconstructions by definition and (2) the absence of asterisks helps to distinguish Tangut reconstructions from pre-Tangut reconstructions which I always write with asterisks. See appendixes 1 and 2 for lists of the initials and finals of my Tangut reconstruction. All reconstructions in this paper are mine unless explicitly stated otherwise.
4 Sagart 1999, pp. 17-18.

3.1. Preinitial consonants

Preinitial consonants could either be primary or secondary.
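The notation used for the reconstructions in this paper separates a prefix from the stressed syllable with a hyphen, and a prefix written without a vowel is a preinitial while one written with a vowel is a presyllable. As a rough illustration only, the following Python sketch splits such a written form into its parts. The function name `split_form`, the dictionary keys, and the set of symbols treated as presyllabic vowels are assumptions made for this example; they are not part of the reconstruction itself.

```python
# Presyllabic vowel symbols assumed for this sketch: the paper's *ʌ and *ɯ,
# the palatal vowels *i and *e discussed in 3.2.3, and the cover symbol V.
PRESYLLABIC_VOWELS = set("ʌɯieV")


def split_form(form: str) -> dict:
    """Split a hyphenated pre-Tangut form into prefix and stressed syllable.

    Hypothetical helper: it only inspects the written notation
    (e.g. '*kʌ-tek') and makes no linguistic judgment of its own.
    """
    body = form.lstrip("*")
    if "-" not in body:
        # No prefix at all: a bare stressed syllable such as *dzi.
        return {"prefix": None, "kind": "none", "syllable": body}
    prefix, syllable = body.split("-", 1)
    # A prefix written with a vowel is a presyllable (CV-);
    # one written without a vowel is a preinitial (C-).
    has_vowel = any(ch in PRESYLLABIC_VOWELS for ch in prefix)
    kind = "presyllable" if has_vowel else "preinitial"
    return {"prefix": prefix, "kind": kind, "syllable": syllable}


if __name__ == "__main__":
    for form in ("*dzi", "*P-dzi", "*kʌ-tek", "*Cɯ-ʃuk"):
        print(form, split_form(form))
```

The sketch does not distinguish primary from secondary preinitials, since that distinction is historical and is not visible in the notation.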
Primary preinitials were never followed by unstressed vowels. In other words, they were never onsets of presyllables. Secondary preinitials were onsets of presyllables that lost their vowels:

*presyllable CV- > *preinitial C-

Preinitial consonants fused with the initial consonants of stressed syllables, resulting in Cw- clusters (3.1.1.1), tense consonants that in turn conditioned tense vowels before being lost (3.1.1.2), aspirates (3.1.1.3), and retroflexion (3.1.2.1).

3.1.1. Preinitial obstruents

3.1.1.1. Preinitial labials

<bC> in Tibetan transcriptions of Tangut corresponds to Tangut Cw (Tai 2008). This may suggest that bC- had become Cw- in the native dialect(s) of the Tibetan transcribers of Tangut. The <bC> transcriptions could also be taken at face value as evidence for a Tangut dialect preserving an earlier preinitial labial obstruent *P-. If Tibetan <b> represented a real Tangut preinitial, then Tibetan medial <w> might have represented a Tangut ‘primary waw’ as opposed to a Tangut ‘secondary waw’ that developed from *P- in other dialect(s) such as the standard dialect codified in dictionaries.

*-w- > primary waw -w- in all (?) Tangut dialects
*P- > secondary waw -w- (except in the dialect(s) transcribed in Tibetan?)

Tangut zero ~ medial -w- alternations5 originated as *zero ~ *P- alternations: e.g.,

慙 1dzi < *dzi ‘calm’ (adjective)6
擠 1dzwi < *P-dzi ‘to calm’ (verb)

Nonalternating native Tangut medial -w- may be either primary or secondary. There is no guarantee that all *P-less cognates of *P-words survived in Tangut, so a medial -w-word without a medial -w-less counterpart may not necessarily have a primary waw: e.g., 假 2dzwio ‘person’ could be from *Cɯ-dzwoH with primary waw or from *Pɯ-dzoH whose presyllable conditioned a secondary waw.7

There are no Tangut words with labial initials followed by -w- (pw-, phw-, bw-, mw-, vw-). If pre-Tangut had *PP-sequences, they were simplified to P- in Tangut: e.g., *P-m- > *mw- > m-, etc.

5 Gong 1988, pp. 798-800.
6 English glosses of Tangut words are based on the glosses in Gong 1988 and Li 2008.
7 A presyllable with a high vowel is necessary to account for the warping of -o to -io. See 3.2.2.2.

3.1.1.2. Preinitial coronals

Gong Hwang-cherng observed alternations between Tangut lax and tense vowels.8 Gong (1999) then proposed that tense vowels (written here with subscript dots) originated from preinitial *s- on the basis of external comparisons: e.g.,

芸 1təu ‘thousand’ : Written Tibetan stong ‘id.’

Since lax-tense vowel alternations in Tangut have multiple functions,9 perhaps tense vowels originated from more than one voiceless coronal obstruent that I will symbolize as *S-. This *S- could either be part of the root or a prefix. I reconstruct it as a prefix if a tense vowel word has a lax vowel cognate within Tangut or has an *s-less external cognate: e.g.,

吟 1khwa < *khwa ‘distant’
店 1khwa < *s-khwa ‘to keep at a distance’

How could a consonant condition tension in a following but nonadjacent vowel? Modern Korean tense consonants (pp-, tt-, ss-, cc-, kk-) originated from Late Middle Korean clusters with p- and/or s-. According to S. Martin,10 “The laryngeal tension [of modern Korean tense consonants] continues on into the vowel, which can be described as ‘laryngealized’”. The development of tense consonants and vowels in Korean could be formulated as

p/sCV > CCV > CV > C3V /CMV/

with the subscript dots used by Tangutologists to represent tenseness. Note that in modern Korean, only the tenseness of consonants is phonemic, whereas the tenseness of vowels is subphonemic. However, in Tangut, the tenseness of consonants was lost, so the tenseness of vowels became phonemic:

*SCV > *CCV > *CV > *C3V > *CV /CV/

3.1.1.3. Preinitial gutturals

Gong Hwang-cherng11 found alternations between Tangut nonaspirated and aspirated initials. I derive these alternations from earlier *zero ~ *K-alternations.
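The fusion outcomes proposed in 3.1.1.1 to 3.1.1.3 can be summarized in a small Python sketch. It is a simplification under stated assumptions: the class `Syllable`, the function `fuse`, and the sets of initials are hypothetical names introduced for illustration, tenseness is stored as a flag rather than written with a subscript dot, and the treatment of *K- before s- follows a suggestion that the paper itself presents only as a possibility.

```python
from dataclasses import dataclass

# Labial initials never take medial -w- in Tangut (3.1.1.1).
LABIALS = {"p", "ph", "b", "m", "v"}
# Voiced initials devoiced (and aspirated) by *K- (3.1.1.3).
DEVOICED = {"b": "p", "d": "t", "g": "k", "dz": "ts", "dʒ": "tʃ"}
# Initials that *K- simply aspirates in this sketch.
ASPIRABLE = {"p", "t", "k", "ts", "tʃ", "l"}


@dataclass
class Syllable:
    initial: str
    medial: str
    rhyme: str
    tense: bool = False

    def __str__(self) -> str:
        mark = " [tense]" if self.tense else ""
        return f"{self.initial}{self.medial}{self.rhyme}{mark}"


def fuse(preinitial: str, initial: str, rhyme: str, medial: str = "") -> Syllable:
    """Apply the fusion outcome for a cover-symbol preinitial P, S, or K."""
    syl = Syllable(initial, medial, rhyme)
    if preinitial == "P":
        # *P- > secondary waw, except after labial initials (*P-m- > m-).
        if initial not in LABIALS:
            syl.medial = "w"
    elif preinitial == "S":
        # *S- > tense vowel; the consonantal tenseness itself is lost.
        syl.tense = True
    elif preinitial == "K":
        if initial == "s":
            # Tentative: *Ks- may have yielded s- plus a tense vowel.
            syl.tense = True
        elif initial in DEVOICED:
            syl.initial = DEVOICED[initial] + "h"
        elif initial in ASPIRABLE:
            syl.initial = initial + "h"
    return syl


if __name__ == "__main__":
    print(fuse("P", "dz", "i"))                # cf. 1dzwi < *P-dzi
    print(fuse("S", "kh", "a", medial="w"))    # cf. *s-khwa
    print(fuse("K", "g", "i"))                 # cf. 1khi < *K-gi
```

Initials not covered by any rule are returned unchanged, which reflects the limits of the sketch rather than a claim about pre-Tangut.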
*K- was a voiceless velar, uvular, or glottal obstruent that devoiced voiced consonants: e.g.,

*Kb- > ph-, *Kd- > th-, *Kg- > kh-, *Kdʒ- > tʃh-, *Kl- > lh-

Voiced consonants are preserved in nonprefixed members of voiced-aspirated cognate sets: e.g.,

蟶 1gi < *gi ‘to fall, to lose’
蕀 1khi < *K-gi ‘to let fall, to cause to lose’

8 Gong 1988, pp. 805-811. At the time, Gong was using Sofronov’s reconstruction with “minor revisions” (1988, p. 784). Sofronov’s reconstruction did not have any retroflex vowels, so some of the lax-tense cognate sets in Gong (1988) would now be reinterpreted as nonretroflex-retroflex cognate sets in reconstructions with retroflex vowels like the reconstruction in Gong (2003) or the reconstruction in this paper.
9 Gong 1988, pp. 810-811.
10 Martin 1992, p. 27.
11 Gong 1988, pp. 785-796. Note that not all such sets involved a *K-prefix. Some doublets reflect different strata of borrowing from Chinese: one before devoicing and another after devoicing: e.g.,
偸 1dza ‘mixed’ < Late Middle Chinese 雜 *dzap ‘id.’ (early loan)
汢 1tsha ‘mixed’ < Tangut Period Northwestern Chinese 雜 *tsha < Late Middle Chinese 雜 ‘id.’ (late loan; aspirated tsh- directly from Chinese rather than from pre-Tangut *K-dz-)

*K- aspirated most voiceless obstruents: e.g.,

枢 1ka < *ka ‘center’
計 1kha < *K-ka ‘in’ (postposition)

One might expect *k-k- to have merged with *S-k- and become k- followed by a tense vowel (see 3.1.1.2). If such a merger occurred, then 1kha ‘in’ must have had a non-*k- guttural preinitial (e.g., *x-ka; see below). If such a merger did not occur, then perhaps aspiration preceded tension, so *k-k- became kh- before *sk- became a new *kk- that was ultimately reduced to k- before a tense vowel:

Early pre-Tangut | *k-k- | *Sk-
Aspiration | *kh- | *Sk-
Gemination | *kh- | *kk-
Tangut | kh- | k- + tense vowel

The relative chronology of the rules in this paper has yet to be worked out.

One also might expect *Ks- to become an aspirated sh- like modern Burmese ဆ.
However, there is no evidence for such an initial in Tangut. *K- may have conditioned tense vowels after s-: e.g., 教 1so ‘three’ may be from *so < *so < *sso < *xso < *Kso (cf. the g- of Written Tibetan gsum ‘three’).

In Korean, *hVC- as well as *kC- developed into Late Middle Korean aspirates.12 I assume Tangut also underwent similar sound changes and therefore cannot rule out the possibility of velar, uvular, and/or glottal fricative sources of aspiration: e.g., *xC- > Ch-. Modern Mawo Qiang, a distant relative of Tangut, has xC- and χC-clusters.13

3.1.2. Preinitial sonorants

3.1.2.1. Preinitial *r-

Pre-Tangut preinitial *r- was one source of retroflexion in Tangut vowels: e.g., 忙 1lɨəəʳ < *rɯ-ləə ‘four’. For the other source of retroflexion, see 4.4.2.2. Retroflex vowels are very common in Tangut. Perhaps some were conditioned by preinitial *l- and even preinitial dental stops that merged with preinitial *r-: e.g., *TV- > *T- > *r-. Nonretroflex-retroflex cognate sets can be reconstructed with *Ø- ~ *r-: e.g.,

飼 1za < *za ‘red face’
苣 1zaʳ < *r-za ‘red-faced ancestor’

I reconstruct *r- as a prefix even in retroflex vowel words like 忙 1lɨəəʳ ‘four’ which lack nonretroflex vowel cognates within Tangut if they have *r-less exterior cognates: e.g., Written Tibetan bzhi < *b-lyi ‘four’ and Old Chinese 四 *s-li-s ‘four’.

12 Vovin 2010, p. 11; Lee and Ramsey 2011, p. 89.
13 Sun Hongkai 1981, p. 27.

3.1.2.2. Preinitial nasals?

I do not know of any voiceless ~ voiced obstruent alternations that suggest *zero ~ *preinitial nasal alternations in pre-Tangut: e.g., *p- ~ *b- < *p- ~ *Np-, etc. However, perhaps some Tangut voiced obstruent initials are from pre-Tangut *preinitial nasal + obstruent initial sequences: e.g., b- < *Nb-, etc.

3.2. Presyllabic vowels

The vowels of pre-Tangut presyllables have left two kinds of traces in Tangut.

3.2.1. Intervocalic lenition

Pre-Tangut presyllables that were lost at a very late date conditioned the lenition of main syllable initials in intervocalic position:

Stage | Early presyllable loss | Late presyllable loss | Fusion
Early pre-Tangut | *CV-CV́ | *CV-CV́ | *CV-CV́
Loss of presyllabic vowel; presyllable becomes preinitial | *CV-CV́ | *CV-CV́ | [*C-CV́]
Early loss of presyllable | [*CV́] | *CV-CV́ | *C-CV́
Lenition | *CV́ | [*CV-ClenitedV́] | *C-CV́
Late loss of presyllable | *CV́ | [*ClenitedV́] | *C-CV́

Forms subject to sound changes are in square brackets.

All obstruents at the same point of articulation merged into a single lenited initial. The reflexes of Tangut lenition are similar to those of intervocalic lenition in Vietnamese and Korean.

*Labials > v- (phonetically [β]?; cf. Middle Vietnamese [β] < *-p-, *-b- and Middle Korean [β] < *-p-)
*Dentals > l- (cf. Middle Korean [r] < *-t-)
*Alveolars > z- (cf. Middle Korean [z] < *-s-, *-ts-)
*Alveopalatals > ʒ- (cf. Middle Vietnamese [ɟʑ] < *-c-, *-ɟ-)
*Velars > ɣ- (cf. Middle Vietnamese [ɣ] < *-k-, *-g- and Middle Korean [ɣ] < *-k-)

Lenition obscures etymological relationships: e.g., the Tangut cognate of Written Tibetan gcig ‘one’ and Old Chinese 隻 *tek ‘single’ is 岐 1lew < *kʌ-tek or *kʌ-tik. (I assume the pre-Tangut prefix had an initial *k- corresponding to Written Tibetan g-, though other initials are possible. See 3.2.2 for the reasoning behind reconstructing *ʌ as the vowel of the presyllable. See 4.4.1.1 for the *-k > -w shift.)

3.2.2. Stressed vowel warping

In 2008, I proposed that the Old Chinese type A/B distinction was conditioned by presyllabic vowels.14 The following adaptation of that theory and A. Schuessler’s (2007, 2009) theory of vowel warping in Chinese can account for much of the large rhyme inventory of Tangut.

I reconstruct at least two different vowels in Tangut presyllables:

– a lower vowel symbolized15 as *ʌ (cf. the Middle Korean ‘minimal vowel’ ㆍ [ʌ])
– a higher vowel symbolized as *ɯ (cf. 
the Middle Korean ‘minimal vowel’ ㅡ [ɯ])

These vowels may have resulted from the merger of a larger number of even earlier unstressed vowels. Pre-Tangut main syllable vowels also belonged to lower and higher classes:

Higher | *i *u
Lower | *e *ə16 *o *a

Pre-Tangut had partial vowel harmony (under Chinese influence?). If the height class of an unstressed presyllabic vowel matched the height class of a stressed vowel, the latter did not change either before or after presyllable loss: e.g.,

*Cɯ-Cí > *Cɯ-Cí > Ci (higher + higher)
*Cʌ-Cá > *Cʌ-Cá > Ca (lower + lower)

However, if the height class of an unstressed presyllabic vowel did not match the height class of a stressed vowel, the latter warped (partly lowered or raised) before the presyllable was lost: e.g.,

*Cʌ-Cí (lower + higher) > *Cʌ-Cəí > Cəi (lower + partly lowered)
*Cɯ-Cá (higher + lower) > *Cɯ-Ciá > Cia (higher + partly raised)

Partly lowered vowels developed into diphthongs beginning with ə: əu, əi. Partly raised vowels developed into diphthongs beginning with ɨ (after v-, l-, and alveopalatals) or *i (after all other initials): ɨa, ɨə, ɨe, ɨo ~ ia, iə, ie, io. (There are exceptions to this pattern of complementary distribution.) The ɨ that resulted from partial raising is not to be confused with the ɨ that developed before high vowels after v-, l-, and alveopalatals (see 4.3.1).

If a presyllable has lenited a following initial but has not warped a following stressed vowel, I reconstruct the presyllabic vowel with the height class of the stressed vowel: e.g.,

岐 1lew < *kʌ-tek ‘one’17 (lower + lower)
(*kɯ- with a higher vowel would have warped *e to *ɨe.)

痣 1ʒɨiw < *Cɯ-ʃuk ‘juniper tree’18 (higher + higher)
(*Cʌ- with a lower vowel would have warped *u to *əu which would then have monophthongized to e before -w. For -ɨiw < *-uk, see 4.4.1.1.)

14 Miyake 2008.
15 I use the term ‘symbolized’ to indicate that *ʌ and *ɯ may not have been the precise phonetic values of the Tangut presyllabic vowels. 
They could have been central *ɨ and *ɐ, etc. What matters is their heights relative to each other.
16 It is also possible that *ə belonged to the higher vowel class of *i and *u, but then its behavior would be anomalous, as it would be the only higher class vowel that bent upward and never bent downward.
17 The pre-Tangut form could also have been *kʌ-tik. The lower vowel of the presyllable would have conditioned the warping of *i: *ik > *əik > ew.
18 Jacques (2004, p. 160; 2006) compared this Tangut word to Japhug rGyalrong ɕɤɣ ‘juniper tree’ and Written Tibetan shug-pa ‘juniper tree’.

Medial -i- alternations19 may reflect earlier prefixes: e.g.,

雖 1tshəu < *Cʌ-tshu ‘shovel’ (prefix conditioned vowel warping)
褝 1tshiu < *tshu ‘shovel’ (no prefix; *u became iu after *tsh-; see 4.3.2)

However, “no semantic difference can be observed” between alternating forms.20 Furthermore, these alternations occur mostly in words with u. These cognate sets may reflect interdialectal and/or dialect-internal variation in the pronunciation of /u/ rather than morphology.

3.2.3. Stressed vowel brightening

Perhaps there were more than two kinds of presyllabic vowels. ‘Brightening’ (raising of *a to i) in Tangut21 may have been conditioned by high front vowels in presyllables: e.g.,

*CiCá > Ci (= Cji in Gong’s reconstruction used by Matisoff)22

The height of a palatal presyllabic vowel may have determined the degree of brightening: e.g., *Ce-Cá > Cie (= Cjij in Gong’s reconstruction used by Matisoff) with a partly high diphthong rather than Ci with a high monophthong.

There are also sporadic cases in which pre-Tangut *a was raised to ə: e.g.,

菱 1ŋwə < *PV-ŋa ‘five’ : Written Tibetan lnga, Old Chinese 五 ŋˁaʔ ‘id.’

I hesitate to reconstruct yet another presyllabic vowel to account for only a few instances.

4. Pre-Tangut stressed syllables

4.1. Pre-Tangut stressed syllable initials

I tentatively project the Tangut initial inventory (see Appendix 1) back into pre-Tangut.
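Since the initials of stressed syllables interact with both lenition (3.2.1) and vowel warping (3.2.2), it may help to see the two rules side by side. The Python sketch below is a simplified model under stated assumptions: the names `lenite`, `warp`, `LENITION`, and `ANTIPALATAL` are hypothetical, initials are grouped by the classes of Appendix 1, vowel length, codas, and tone are ignored, and the exceptions to the ɨ ~ i distribution mentioned in 3.2.2 are not modeled.

```python
HIGHER = {"i", "u"}
LOWER = {"e", "ə", "o", "a"}
# Initials followed by -ɨ- rather than -i- (v-, l-, and the alveopalatals).
ANTIPALATAL = {"v", "l", "tʃ", "tʃh", "dʒ", "ʃ", "ʒ"}

# Lenited reflex -> obstruent initials that merge into it (3.2.1).
LENITION = {
    "v": {"p", "ph", "b"},
    "l": {"t", "th", "d"},
    "z": {"ts", "tsh", "dz", "s"},
    "ʒ": {"tʃ", "tʃh", "dʒ", "ʃ"},
    "ɣ": {"k", "kh", "g"},
}


def lenite(initial: str) -> str:
    """Return the intervocalic reflex of an obstruent initial."""
    for reflex, sources in LENITION.items():
        if initial in sources:
            return reflex
    return initial  # sonorants and other initials are left unchanged here


def warp(presyllabic_vowel: str, stressed_vowel: str, initial: str) -> str:
    """Return the stressed vowel after warping by a presyllabic *ʌ or *ɯ."""
    if presyllabic_vowel not in ("ʌ", "ɯ"):
        raise ValueError("presyllabic vowel must be ʌ (lower) or ɯ (higher)")
    if stressed_vowel not in HIGHER | LOWER:
        raise ValueError("unknown stressed vowel")
    pre_is_higher = presyllabic_vowel == "ɯ"
    stressed_is_higher = stressed_vowel in HIGHER
    if pre_is_higher == stressed_is_higher:
        return stressed_vowel            # matching height classes: no change
    if stressed_is_higher:
        return "ə" + stressed_vowel      # lower + higher: partly lowered
    glide = "ɨ" if initial in ANTIPALATAL else "i"
    return glide + stressed_vowel        # higher + lower: partly raised


if __name__ == "__main__":
    # *kʌ-tek 'one': the initial lenites, the vowel does not warp.
    initial = lenite("t")
    print(initial + warp("ʌ", "e", initial))
    # *Cʌ-tshu 'shovel': a lower presyllable partly lowers *u.
    print("tsh" + warp("ʌ", "u", "tsh"))
```

In this sketch the glide is chosen from the initial as it stands when `warp` is called, so a lenited initial should be passed in when lenition is assumed to have applied.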
A few Tangut initials may be secondary in origin: e.g., an initial may always be the result of lenition like Vietnamese g- [ɣ] which is only from *CV-K-. I presume that pre-Tangut had more stressed syllable initials than presyllabic initials: e.g., *k-, *kh-, *g- were possible stressed syllable velar stop initials, but *k- may have been the only possible presyllabic velar stop initial. All vowels after pre-Tangut syllable-initial *r- became retroflex: *rV > rVʳ. Note that medial *-r- did not condition retroflex vowels. See 4.2.4.

19 Gong 1988, pp. 796-798.
20 Gong 1988, p. 798.
21 Matisoff 2004.
22 The negative particle 閥 1mi, cognate to Old Chinese 無 *ma ‘not have’, may pose a problem for this derivation, as it would have to come from a sesquisyllabic *Ci-ma. Would such a high-frequency particle really be so phonologically complex? On the other hand, it is hard to believe that *ma would brighten to 1mi without any conditioning factor. Not all Tangut *a brightened, so one cannot attribute the raising to a regular vowel shift.

A couple of external correspondences suggest that uvulars may have conditioned Tangut Grade II vowels ʊ and ɪ:

哦 1ɣʊ < *ɢu? ‘head’ : Baxter and Sagart’s (2012) Old Chinese 后 *ɢˤ(r)oʔ ‘sovereign’ (< ‘head of a state’), Written Tibetan mgo ‘head’
瑩 1khɪ < *Ci-qha? ‘bitter’ : Mawo and Taoping Qiang qhɑ,23 Zhongu Tibetan qhɐnde ‘to be bitter’,24 Written Tibetan kha ‘bitter’, Baxter and Sagart’s (2012) Old Chinese *khˤaʔ (not *qhˤaʔ!).

(See 4.2.4 for more on Grade II.)

However, the reconstruction of uvulars in Old Chinese is still unsettled. A. Schuessler (2007; 2009) does not reconstruct them in Old Chinese. Moreover, note that Baxter and Sagart reconstruct a velar in *khˤaʔ ‘bitter’ instead of a uvular corresponding to a uvular in Qiang and Zhongu. N. 
Hill25 regarded Zhongu uvulars as being “due to the influence of a Qiangic substrate.” Perhaps the uvular in Old Chinese ‘head’ is primary whereas the uvular in Qiang and Zhongu ‘bitter’ is secondary.26 Did Tangut inherit a secondary uvular in ‘bitter’ from Proto-Qiangic? In any case, there is no strong evidence for a medial *-r- in either ‘head’ or ‘bitter’ that would normally condition Grade II (see 4.2.4), so the vocalism of those words needs another explanation.

4.2. Pre-Tangut medial glides

4.2.1. Pre-Tangut medial *-w-

This medial is preserved in Tangut. It is primary waw, whereas secondary waw reflects an earlier *P- (see 3.1.1.1).

4.2.2. Pre-Tangut medial *-j-

A palatal glide may be the source of some -ɨ- and -i- in Tangut: e.g.,

昴 *sjeH > 2sie ‘knowledge’ : Written Tibetan shes-pa, Proto-Tibeto-Burman *syey-s ‘id.’27

It is also possible to derive 2sie from a yodless *Cɯ-seH with partial raising of *e.

4.2.3. Pre-Tangut medial *-rj-

The pre-Tangut cluster *ʔrj- became Tangut ʔi-ʳ: e.g.,

肛 *ʔrjat > 1ʔiaʳ ‘eight’ : Written Tibetan brgyad, Old Chinese *pˁret ‘id.’

4.2.4. Pre-Tangut medial *-r-

According to G. Jacques (2009), Gong (1993) derived his Grade II -i- from an earlier *-r-. Gong’s Grade II iV-diphthongs correspond to my Grade II lowered vowels:

Pre-Tangut | *ru | *ri | *ra | *rə | *re | *ro
Gong’s Grade II28 | (none) | ie | ia | iə | iej | io
Grade II in this paper | ʊ | ɪ | æ | ʌ | ɛ | ɔ

23 Sun Hongkai 1981, p. 216.
24 Sun Jackson 2003, p. 772.
25 Hill 2010, p. 120.
26 I think it may be possible to reconstruct a uvular in Old Chinese ‘bitter’ on entirely internal grounds, enabling me to reconstruct a uvular at the Proto-Sino-Tibetan level for that word.
27 Matisoff 2003, p. 614.
28 Gong’s pre-Tangut forms might not necessarily correspond to mine.
This vowel shift pattern is similar to what Schuessler (2007, 2009) reconstructed in Chinese:

Old Chinese *râ > Later Han Chinese a (a low front vowel close to [æ] and distinct from back [ɑ])
Old Chinese *rə̂, *rê > Later Han Chinese *ɛ
Old Chinese *rô > Later Han Chinese *ɔ

In Chinese, this shift only occurred in type A syllables (indicated with circumflexes over vowels in Schuessler’s notation). Perhaps the Tangut shift only occurred in syllables with low vowels or partly lowered vowels:

Pre-Tangut after vowel lowering | *rəu < *ru | *rəi < *ri | *ra | *rə | *re | *ro
Grade II | ʊ | ɪ | æ | ʌ | ɛ | ɔ

*-r- may have vanished before high vowels:

Pre-Tangut after vowel raising | *ru > *riu | *ri | *ria | *riə | *rie | *rio
Grade III | ɨu | ɨi | ɨa | ɨə | ɨe | ɨo
Grade IV | iu | i | ia | iə | ie | io

See 4.3.1 and 4.3.2 for the -ɨ- and -i- that developed before *u and *i.

The correspondence of 恍 1tʃhɨiw ‘six’ to Written Tibetan drug ‘id.’ suggests that some Tangut alveopalatal affricates may be from *Tr-clusters. Perhaps ‘six’ was once *k-truk with a preinitial *k- that conditioned aspiration (see 3.1.1.3). (See 4.4.1.1 for the development of -ɨiw from *-uk.)

4.2.5. Pre-Tangut medial *-l-?

There are several instances of Tangut lh- corresponding to Japhug rGyalrong k-presyllables followed by l, ɬ, or j < *lj- in Jacques (2006): e.g.,

湘 1lhew < *-k ‘to graze’ : Japhug rGyalrong kɤ lɤɣ ‘id.’

These correspondences suggest that some Tangut lh- may be from *kl-. Pre-Tangut *-l- in other environments might have merged with another medial or disappeared without a trace.

4.3. Pre-Tangut stressed vowels

I project the six basic vowel types of Tangut (u, i, a, ə, e, o; see Appendix 2) back into Proto-Tangut with only a few changes:

– *-a is restored in ‘brightened’ syllables (see 3.2.3).
– -ɨiw in ‘six’ and ‘juniper tree’ (see 3.2.2, 4.2.4) and perhaps other words is derived from *-uk (see 4.4.1.1). -iw may also sometimes be from *-uk.
– -o is partly from *-aŋ29 (cf. 
Japhug rGyalrong -o < *-aŋ,30 and Tangut period Northwestern Chinese -o < *-aŋ).

It is not clear whether the long vowels of Tangut are primary or secondary (see 4.4.1.1 and 4.4.1.2). So pre-Tangut may have had either six or twelve vowels (six short and six long). Nasalization, tensing, retroflexion, and diphthongization occurred later.

Old Chinese as reconstructed by W. Baxter and L. Sagart (2012) also had the same basic six vowels as Tangut, though one should not expect simple one-to-one correspondences between the two vowel systems: e.g., Baxter and Sagart’s Old Chinese 馬 *mˁraʔ ‘horse’ may correspond to pre-Tangut *Cɯ-re (> Tangut 字 1rieʳ) ‘id.’, not *mraH.

4.3.1. Grade III -ɨ-

The high vowels *i and *u became ɨi and ɨu after Grade III initials (v-, l-, and alveopalatals). *-ɨuk became -ɨiw (see 4.4.1.1).

4.3.2. Grade IV -i-

The high vowel *u became iu after Grade IV initials (initials other than v-, l-, and alveopalatals) whereas *i remained unchanged.

Tangut had no simple rhyme -u (see Appendix 2). This situation may have arisen under the influence of Late Middle Chinese whose *-u had similarly shifted to *-ɨu or *-iu,31 leaving a gap to be filled later by *-o after raising.

4.4. Pre-Tangut codas

Although Tangut had no final obstruents and few final consonants, pre-Tangut once had a richer set of codas like its relatives Japhug rGyalrong, Classical Tibetan, Old Burmese, and Old Chinese.

4.4.1. Pre-Tangut obstruent codas

4.4.1.1. Pre-Tangut *-k

*-k became -w after front vowels but disappeared elsewhere. See Gong (1995) for examples. Although *-ɨuk had a back vowel, this rule applied to this rhyme after *u dissimilated to a front vowel *i before a velar coda:

*-ɨuk > *-ɨuɣ > *-ɨuɰ > *-ɨiɰ > *-ɨiw

See ‘six’ (4.2.4) and ‘juniper tree’ (3.2.2).

It is tempting to regard the long -aa of 氣懿 2miə-2nɨaa ‘Tangut’ (cf. Written Tibetan mi-nyag ‘id.’) as an instance of compensatory lengthening after the loss of *-k.
However, other *-k words like 竃 1do < *dok ‘poison’; borrowed from Middle Chinese 毒 *dowk ‘id.’ have short vowels. Could the -aa of ‘Tangut’ be from *-aakH with an original long vowel? (The final *-H is the source of the second tone. See 4.5.)

29 Gong 1995.
30 Jacques 2004, p. 232.
31 Compare Kan-on 九 kiu ‘nine’ (borrowed from northwestern Late Middle Chinese) with Go-on ku ‘id.’ (borrowed from southern Early Middle Chinese prior to *u-breaking).

4.4.1.2. Other pre-Tangut stop codas

The final *-p and *-t that one would expect from comparison with Old Chinese, Written Tibetan, and Old Burmese have vanished without a trace: e.g.,

蠖 *Cʌ-kap > 1ɣa ‘needle’ : Japhug rGyalrong ta-qaβ, Written Tibetan khab ‘id.’
肛 *ʔrjat > 1ʔiaʳ ‘eight’ : Written Tibetan brgyad, Old Chinese *pˁret ‘id.’

There are a few instances of long vowels in probable *-t words: e.g.,

螺 *Cɯ-maat > 1miaa ‘fruit’ : Japhug rGyalrong sɯ-mat ‘id.’

but these vowels may be primary long vowels rather than remnants of lost stops.

4.4.1.3. Pre-Tangut fricative codas

See 4.5.

4.4.2. Pre-Tangut sonorant codas

4.4.2.1. Pre-Tangut nasal codas

Nasals disappeared after all vowels, leaving behind nasalization in some cases with at least two major exceptions:

– There are no native nasalized u-syllables. All nasalized u-syllables are Chinese borrowings.
– *-aŋ became -o (see 4.3).

4.4.2.2. Pre-Tangut liquid codas

Final *-r is another source of vowel retroflexion: e.g.,

敢 1kaaʳ < *kaar ‘to measure’ : Japhug rGyalrong kɤ-skɤr ‘to weigh’

Since a final -Nr or -rN cluster is absent from languages of the region, I assume that the nasalized retroflex vowels of Tangut rhymes 65, 76, 97, and 98 originated from preinitial *r- + final *-N sequences: *r-CVN > CVʳ.

4.5. Pre-Tangut tonogenetic codas

Tangut had two basic tones, a ‘level tone’ and a ‘rising tone’.32 The terms were obviously adopted from the Chinese phonological tradition and may not be meant to be taken at face value as descriptions of tonal contours.
They may have meant nothing more than ‘first category’ and ‘second category’. They could even have referred to phonations rather than tones, but I will continue to use the traditional term ‘tone’.

Given that the Tangut level tone was much more common than the Tangut rising tone and that the rising and departing tones of Middle Chinese originated from Old Chinese final glottals, I derive the Tangut rising tone from a lost final glottal *-H. This *-H in turn may be from an even earlier *-s (cf. Old Chinese *-s and Written Tibetan -s) and/or *-ʔ (cf. Old Chinese *-ʔ).

Tonal alternations33 arose from zero ~ *-H alternations. An *-H suffix could be added after other codas: e.g., the rising tone word 程 2lew < *Cʌ-tek-H or *Cʌ-tik-H ‘same’34 is a suffixed cognate of the level tone word 岐 1lew < *kʌ-tek or *kʌ-tik ‘one’. Old Chinese *-s could also follow any coda. Written Tibetan -s has a more restricted distribution; homorganic -Cs sequences are not possible.

If a Tangut rising tone word has no known level tone cognates, its *-H can be tentatively regarded as part of its root unless external comparison reveals that the *-H is a suffix.

32 I will not deal with the ‘entering tone’ in the Precious Rhymes of the Sea of Characters and other tonal oddities here.
33 Gong 1988, pp. 821-832.
34 I am not sure whether ‘same’ had the same numerical *kʌ-prefix as ‘one’. The unwarped nonhigh e of 2lew necessitates the reconstruction of a nonhigh *ʌ in the presyllable.

Conclusion

The pre-Tangut phonological system that I have reconstructed in this paper brings Tangut typologically closer to Old Chinese while also accounting for Tangut-internal morphological alternations. It is far from a finished product, as it is based only on a small number of examples. Application of my hypotheses to the Tangut lexicon as a whole will undoubtedly result in the reformulation or even rejection of some of my proposals. Nonetheless, I remain confident that Tangut phonological history will eventually be integrated into the larger saga of monosyllabic compression across the Sinosphere.

Appendix 1. Tangut initials

This system is nearly identical to Gong (2003). I write his w tś tśh dź ś ź · as v tʃ tʃh dʒ ʃ ʒ ʔ. Roman numerals refer to the initial classes of the Tangut 同音 Homophones dictionary. Unlike Nishida (1964) or Arakawa (1999), neither Gong nor I reconstruct distinct initials for class IV. Alternative phonetic interpretations are in the right-hand column.

Class | Initials | Alternative phonetic interpretation
I | p- ph- b- m- |
II | v- | [w]?
III | t- th- d- n- |
V | k- kh- g- ŋ- |
VI | ts- tsh- dz- s- |
VII | tʃ- tʃh- dʒ- ʃ- | retroflex [tʂ tʂh dʐ ʂ]?
VIII | ʔ- x- ɣ- | glottal [ʔ h ɦ]?
IX | l- lh- z- ʒ- r- | [ɫ ɬ ɮ ʐ r]?

v-, l-, and the alveopalatals were usually followed by Grade III rhymes with -ɨ- rather than Grade IV rhymes with -i-. There was something antipalatal about those consonants, so I suspect l- may have been velarized [ɫ] and the alveopalatals were really retroflexes. The correspondence of tʃh- to Written Tibetan dr- in ‘six’ (4.2.4) suggests that the alveopalatals might have been retroflexes. v- may have been [w] like Polish ł or Belarusian ў from earlier nonpalatalized l. However, Tibetan transcriptions of v- as <b(w)>, <ḥbh> and even <ww>35 suggest that Tangut v- had more friction than w-.

Appendix 2. Tangut rhymes

This system is a revision of Gong (2003). Although the phonetic values are somewhat different, the rhyme groups are nearly identical to his. Grade III and IV rhyme numbers marked with a and b are in complementary distribution. Rhymes unique to Chinese loanwords have no pre-Tangut sources and hence are in parentheses. Variants of rhymes with medial -w- are not listed.

35 Nishida 1964, pp. 82-83; Tai 2008, pp. 177-178.

Pre-Tangut basic vowel Grade I Grade II Grade III Grade IV *u 1. -əu 4. -ʊ36 2. -ɨu 3. -iu 5. -əəu 6. -ʊʊ37 7a. -ɨuu 7b. -iuu (104. -əu) 61. -əu 62a. -ɨu 62b. -iu 80. 
-əuʳ 81. -iuʳ *i 8. -əi 9. -ɪ 10. -ɨi 11. -i 12. -əəi 13. -ɪɪ 14a. -ɨii 14b. -ii 15. -əi 16a. -ɨi 16b. -i 68. -əi 69. -ɪ 3 70a. -ɨi 70b. -i 82. -əiʳ 83. -ɪʳ 84a. -ɨiʳ 84b. -iʳ 99. -əəiʳ 101a. -ɨiiʳ 101b. -iiʳ *a 17. -a 18. -æ 19. -ɨa 20. -ia 22. -aa 23. -ææ 21. -ɨaa 24. -iaa 25. -a 26. -æs 27a. -ɨa 27b. -ia 66. -a 67a. -ɨa 67b. -ia 85. -aʳ 86. -æʳ 87a. -ɨaʳ 87b. -iaʳ 88. -aaʳ 89a. -ɨaaʳ 89b. -iaaʳ (105. -ya) *ə 28. -ə 29. -ʌ 30. -ɨə 31. -iə 32. -əə 33a. -ɨəə 33b. -iəə 71. -ə3 72a. -ɨə3 72b. -iə3 90. -əʳ 91. -ʌʳ 92a. -ɨəʳ 92b. -iəʳ 100a. -ɨəəʳ 100b. -iəəʳ *e 34. -e 35. -ɛ 36. -ɨe 37. -ie 38. -ee 39. -ɛɛ 40a. -ɨee 40b. -iee 41. -e 42. -ɛs 43a. -ɨe 43b. -ie 76. -ɛs 3 65a. -ɨes 3 65b. -ies 3 63. -ɛ 3 64a. -ɨe 64b. -ie 77. -eʳ 78. -ɛʳ 79a. -ɨeʳ 79b. -ieʳ *ik/ek/uk 44. -ew 45. -ɛw 46a. -ɨew 46b. -iew < *-ik/-ek? < *-ek only? < *-ek only < *-ek only 36 Gong Hwang-cherng classified rhyme 4 as Grade I and reconstructed it as homophonous with Grade I rhyme 1. However, there are minimal pairs distinguishing rhymes 1 and 4, so the two rhymes must have been distinct. Since rhymes 2 and 3 were Grades III and IV, rhyme 4 might have been Grade II. Unfortunately, there are no diagnostic Grade II initials (v-, l-, alveopalatals) in rhyme 4 syllables. However, the order of Tangut rhymes seems to be based on a Chinese model, and the first four Tangut rhymes (Grade I 1, Grade III 2, Grade IV 3, and Grade II 4) apparently correspond to the first three Middle Chinese rhymes (Grade I 東/冬, Grade III/IV 鐘, and Grade II 江). Moreover, there are no alveolar initials unique to Grades I and IV in rhyme 4 syllables. Rhyme 4 can only be Grade II or Grade III (as in Arakawa 1999). 37 Gong classified the extremely rare rhyme 6 as Grade III. There are only two different rhyme 6 syllables, khʊʊ and ʒʊʊ. kh- and ʒ- can only coexist in Grade II, so I classify rhyme 6 as Grade II. 47a. -ɨiw 47b. -iw < *-ik, *-uk < *-ik, (*-uk?) 93. -eʳw 94. -i(e)ʳw < *-ik/-ek? < *-ik/-ek? *o 51. -o 52. -ɔ 53a. -ɨo 53b. 
-io 50. -wɨo 54. -oo 55a. -ɔɔ 55b. -ɨoo 55c. -ioo 56. -o 57. -ɔs 58a. -ɨo 58b. -io 59. -ɔsɔs 60a. -ɨoo 60b. -ioo 73. -o 74. -ɔ3 75a. -ɨo 75b. -io 95. -oʳ 96a. -ɔʳ 96b. -ɨoʳ 96c. -ioʳ 102. -ooʳ 103. -iooʳ 97. -oʳ 98. -ioʳ The *o-rhymes had some unusual characteristics (i.e., a separate rhyme 50 -wɨo distinct from 53a -ɨo which could also be preceded by -w-; a three-way split of rhymes 55 and 96) that deserve investigation. 50 -wɨo could only have the level tone, whereas 53a -wɨo with -w- could only have the rising tone. Perhaps /oo/ was [ɔɔ] after the high vowels /ɨ i/, so 55a [ɔɔ] could rhyme with 55b [ɨɔɔ] and 55c [iɔɔ] and 96a [ɔɔʳ] could rhyme with 96b [ɨɔɔʳ] and 96c [iɔɔʳ]. Bibliography Arakawa 1999 – Arakawa Shintarō 荒川慎太郎 . “Ka-zō taion shiryō kara mita Seikago no seichō” [A Study on Tangut Tones from Tibetan Transcriptions] 夏藏対音資料からみた西夏語の声調. Gengogaku Kenkyū 言語学研究 [Linguistic Research] 17-18 (1999), pp. 27-44. Baxter and Sagart 2012 - Baxter William and Sagart Laurent. “Baxter-Sagart Old Chinese reconstruction (Version 1.00).” http://crlao.ehess.fr/docannexe.php?id=1202 Accessed January 27, 2012. Clauson 1964 – Clauson Gerard. “The Future of Tangut (Hsi Hsia) Studies.” Asia Major 11.1 (1964), pp. 54-77. Gong 1988 – Gong Hwang-cherng 龔煌城. “Phonological Alternations in Tangut.” Zhongyang yanjiu lishi yuyuan yanjiusuo jikan 中央研究院歷史語言研究所集刊 [Bulletin of the Institute of History and Philology] 59.3 (1988), pp. 783-834. Gong 1993 – Gong Hwang-cherng 龔煌城. “Xixiayu yu qiangyuzhi yuyan tongyuanci de lishi cengci” [Cognate Historical Strata of Tangut and the Qiangic Languages] 西夏語與羌語支語言同源的歷史層次. Unpublished manuscript, 1993. Gong 1995 – Gong Hwang-cherng 龔煌城. “The System of Finals in Proto-Sino-Tibetan.” In: The Ancestry of the Chinese Language, edited by William S. Y. Wang. Berkeley: Project on Linguistic Analysis, University of California, 1995, pp. 41–92. Gong 1999 – Gong Hwang-cherng 龔煌城. 
“Xi-Xia yu de jin muyin ji qi qiyuan” [Tangut Tense Vowels and Their Origin] 西夏語的緊母音及其起源. Zhongyang yanjiu lishi yuyuan yanjiusuo jikan [Bulletin of the Institute of History and Philology] 中央研究院歷史語言研究所集刊 70.2 (1999), pp. 531-558. Gong 2003 - Gong Hwang-cherng 龔煌城. “Tangut.” In: The Sino-Tibetan Languages, edited by Graham Thurgood and Randy J. LaPolla. London: Routledge, 2003, pp. 602-620. Hill 2010 – Hill Nathan W. “An Overview of Old Tibetan Synchronic Phonology.” Transactions of the Philological Society 108.2 (2010), pp. 110–125. Jacques 2004 – Jacques Guillaume. “Phonologie et morphologie du japhug (rGyalrong).” PhD Diss. Université Paris Diderot, 2004. Jacques 2006 – Jacques Guillaume. “Essai de comparaison des rimes du tangoute et du rgyalrong.” In Medieval Tibeto-Burman Languages II – 10th Seminar of the International Association for Tibetan Studies, ed. Christopher Beckwith. Leiden: Brill, 2006. Jacques 2009 – Jacques Guillaume. “The Origin of Vowel Alternations in the Tangut Verb.” Language and Linguistics 10.1 (2009), pp. 17-27. Kychanov and Sofronov 1963 – Кычанов Е.И, Софронов М.В. Исследования по фонетике тангутского языка (предварительные результаты). М.: Издательство восточной литературы, 1963. Lee and Ramsey 2011 – Lee Ki-Moon 李基文 and Ramsey S. Robert. A History of the Korean Language. Cambridge: Cambridge University Press, 2011. Li 2008 – Li Fanwen 李範文 . Xia-han zidian [Tangut-Chinese Dictionary] 夏漢字典 . 2nd edition. Beijing: Chinese Academy of Social Sciences, 2008. Martin 1992 – Martin Samuel E. A Reference Grammar of Korean. Rutland: Charles E. Tuttle, 1992. Matisoff 2004 – Matisoff James A. “‘Brightening’ and the place of Xi-Xia (Tangut) in the Qiangic Subgroup of Tibeto-Burman.” In: Studies on Sino-Tibetan Languages: Papers in Honor of Professor Hwang-cherng Gong on His Seventieth Birthday, Ed. by Ho Dah-an. Taipei: Institute of Linguistics, Academia Sinica, 2004, pp. 327-352. Miyake 2008 – Miyake Marc Hideo. 
“Avoiding ABBA: Old Chinese Syllabic Harmony.” In: Evidence and Counter-Evidence: Essays in Honour of Frederik Kortlandt. Vol. 2. General Linguistics. Ed. by Alexander Lubovsky, Jos Schaeken and Jeroen Wiedenhof. Amsterdam: Rodopi, 2008, pp. 283- 301. Nishida 1964 – Nishida Tatsuo 西田龍雄 . Seikago no kenkyū [A Study of the Hsi-Hsia Language] 西夏語の研究. Tokyo: Zauhō kankōkai 東京：座右寶刊行會 1964. Sagart 1999 – Sagart Laurent. The Roots of Old Chinese. Amsterdam: John Benjamins, 1999. Schuessler 2007 – Schuessler Axel. ABC Etymological Dictionary of Old Chinese. Honolulu: University of Hawai‘i Press, 2007. Schuessler 2009 – Schuessler Axel. Minimal Old Chinese and Later Han Chinese: A Companion to Grammata Serica Recensa. Honolulu: University of Hawai‘i Press, 2009. Sun Hongkai 1981 – Sun Hongkai 孙宏开. Qiangyu jianzhi 羌语简志 [A Brief Description of the Qiang Language]. Beijing: Minzu chubanshe 北京：民族出版社, 1981. Sun Jackson 2003 – Sun Jackson. “Phonological Profile of Zhongu: A New Tibetan Dialect of Northern Sichuan.” Language and Linguistics 4.4 (2003), pp. 769-836. Tai 2008 – Tai Chung Pui 戴忠沛. “Xi-Xia wen fojing canpian de zangwen duiyin yanjiu” [A Study of Tibetan Phonological Transcription in Tangut Buddhism Fragments] 西夏文佛經殘片的藏文對音研究. PhD Diss., Graduate School of Chinese Academy of Social Sciences, 2008. Vovin 2010 – Vovin Alexander. Koreo-Japonica: A Re-Evaluation of a Common Genetic Origin. Honolulu: University of Hawai‘i Press, 2010.
