Wiktionary talk:About Chinese/Cantonese/Taishanese

Deng Jun's Taishanese romanisation system

Hi, I have a copy of a Taishanese dictionary (台山方音字典) edited by Deng Jun (邓钧) (→ISBN, photos here). Their romanisation system is rather interesting, with

  • unreleased final consonants being romanised as d, g, b instead of t, k, p; and
  • an /i/ after initial /s/ being transcribed as xi (e.g. 十 xib`, 晨 xin*) instead of si, which I can only assume represents a shift of /s/ to [ɕ] before /i/.

Is it okay with you if I add this romanisation system as a column in the transcription tables? Since this is the only published Taishanese dictionary available (to the best of my knowledge), I think it's worth covering here. Chagneling (talk) 07:41, 19 June 2017 (UTC)
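For illustration, the two conventions listed in this post can be sketched in Lua, the language of the correspondence table further down this page. This is only a rough sketch under stated assumptions: the function name to_deng_jun is hypothetical, the input is assumed to be a toneless syllable written with s and with p/t/k codas, and tone marks and any other differences in Deng Jun's scheme are not handled.

```lua
-- Hypothetical helper, not part of any existing module.
-- Rewrites a toneless syllable using only the two conventions described above.
local coda_map = { p = 'b', t = 'd', k = 'g' }

local function to_deng_jun(syllable)
	-- s before i is written x (si... -> xi...)
	local result = syllable:gsub('^si', 'xi')
	-- unreleased final stops are written with b, d, g instead of p, t, k
	result = result:gsub('[ptk]$', coda_map)
	return result
end

print(to_deng_jun('sip')) --> xib  (cf. 十 xib` above, tone mark omitted)
print(to_deng_jun('sin')) --> xin  (cf. 晨 xin* above, tone mark omitted)
```

Only the last letter of the syllable is treated as a coda, so initials such as t or k are left alone.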

It sounds fairly typical of Mainland romanizations (they try to imitate pinyin; personally I think w:Guangdong_Romanization#Cantonese is an abomination!)
Normally I would say "go ahead" but currently this page is generated by code (for various reasons); describe the necessary changes and I'll add them for you (and maybe then decouple this page from the code as has been requested recently elsewhere...). —suzukaze (tc) 07:58, 19 June 2017 (UTC)
It looks like it's too much to describe in one post, so I'm recreating the tables in Google Docs and I'll post the link once I'm done. Might take a while. Chagneling (talk) 09:21, 19 June 2017 (UTC)
While working on it, I've noticed that the examples for i and ei (皮 and 耳 respectively) don't work—my dictionary states that they are pei* and ngei- (ngi- is listed as an alternate pronunciation).
Stephen Li states that they are pi22 and ŋi55, and Gene Chin's dictionary agrees that they have the same rhyme as well (pĩ and ngī).
That said, I'm not sure what to put as examples instead. The Duanfen dialect I speak doesn't distinguish the ei and i sounds.
And I don't know if a distinction between ei and i needs to be made at all. I'm not sure if the Taicheng dialect has the ei and i sounds in complementary distribution or not; my dictionary sticks to one or the other depending on the preceding consonant (e.g. gei, hei, ji, ngei, yi) save for the three exceptions 幾 gi- (however, 幾何學、幾乎、茶几 are all gei-), 嘻 hi-, and 耳 ngei-/ngi-. Chagneling (talk) 11:57, 19 June 2017 (UTC)
I've completed the tables here. I'm not sure if you got a Wiktionary notification for my earlier reply; hopefully this one comes through. Chagneling (talk) 09:27, 20 June 2017 (UTC)
The /ɵt̚/, /ut̚/ distinction is in this database of dialectal pronunciations (現代>粵語>四邑片>台山(台城)). Strangely, it seems like /l/ is the only possible initial for /øt/, but /lut/ can be found in the database as well.
Actually, most of this page is based on it since it was the only resource I had access to that clearly marked itself as describing the Taicheng dialect and seemed reliable (admittedly it is not an ideal source, even though the url might include .edu; see also Template_talk:zh-pron#Parameter_for_Taishanese_needed). If you know it is wrong, please feel free to point it out so that our coverage of Taishanese can be better. It's also where /i/ vs. /ei/ came from, as well as the idea of combining /ein/ and /ian/. (IIRC we've also had anonymous users change -ei to -i a few times...) —suzukaze (tc) 22:51, 20 June 2017 (UTC)
@Suzukaze-c, Chagneling The -ei/-i distinction is discussed here by Stephen Li. — justin(r)leung (t...) | c=› } 02:49, 21 June 2017 (UTC)
Also, I don't understand why Stephen Li distinguishes between /ein/ and /ian/. His syllable chart doesn't even include /ian/. — justin(r)leung (t...) | c=› } 02:55, 21 June 2017 (UTC)
IRT -ei/-i distinction: Thanks for the link! I didn't realise it was documented and I'm a bit guilty for not doing my research before mentioning it here. What Stephen Li states matches up precisely with my dictionary's romanisations, which is quite convenient.
IRT ein and ian: So they might either be a dialectal thing, or romanisation inconsistencies on his part then.
@Suzukaze-c Ah, I see. It's getting quite late here (and I'm too tired to attempt to translate the interface), so I'll check out the site you linked sometime tomorrow. Chagneling (talk) 13:31, 21 June 2017 (UTC)
@Suzukaze-c I've finally gotten around to checking out the Hanyu Gujinyin Ziliaoku site you linked (sorry for taking so long!) and I'm honestly amazed at how much information the site offers. Definitely something for the bookmarks list. :P
Regarding /l/ being the only initial possible for /øt̚/ (or is it /ɵt̚/ as per your IPA?): there's ultimately only two characters transcribed by the site as /løt̚/ (栗 and 律, since the other two are just character variants), and one as /lut̚/ (劣, since the other character is a variant). I think it's possible that with 劣 there could actually just be a typo of /løt/. If this is true (and with the tiny sample size I can't be sure), /ɵt̚/ would just be how /ut̚/ is pronounced before /l/ (emphasising that this is just speculation on my part).
Also, some suggestions I forgot to mention:
  • The "i" in Deng Jun can now be definitively put down as being the same as Wiktionary "i", and the "ei" as Wiktionary "ei".
  • I rechecked the entry for 吾 in my dictionary (which I was confused about since it was listed under both "m" and "ng" in the lookup tables). It turns out that in the dictionary entry it's stated to be "ng (m)", which is how all characters with syllabic nasals are romanised (e.g. 五、午) except for 唔 "m". So their putting 吾 in two sections in the lookup tables was likely a mistake on their part, which makes "m" in Wiktionary's romanisation almost exclusively romanised as "ng (m)" in Deng Jun.
    • Perhaps not coincidentally, this matches up with how m and ng are distinguished in Guangzhou Cantonese, and how these two syllabic nasals are merged in Hong Kong. Looks like Deng Jun is noting how this merger is happening (or has already happened? I'm not sure) in Taishanese as well.
  • I think "high falling" should be renamed to "mid falling" - my dictionary describes the tone as "中降调(32)", and Gene Chin's hoisanva site describes it as a "middle falling tone". Chagneling (talk) 09:32, 24 June 2017 (UTC)
@Chagneling I found a paper that discusses Taicheng phonology (and Siyi generally). It does distinguish /øt̚/ and /ut̚/, just like Xiaoxuetang does. Also, it does mention palatalization, which you did mention above (good job!). This paper is more recent and might be more accurate than Wang Li's paper, but I'm not sure. @Suzukaze-c, Wyang, what do you think? — justin(r)leung (t...) | c=› } 17:14, 24 June 2017 (UTC)
The paper also says that [m̩] and [ŋ̍] are basically in free variation, so they have probably merged in Taicheng. And I do agree on changing high falling to mid falling (which I've done). — justin(r)leung (t...) | c=› } 17:21, 24 June 2017 (UTC)

RFC discussion: March–June 2017

The following discussion has been moved from Wiktionary:Requests for cleanup (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This is currently invoking a user sandbox module. It should be changed to a regular module. —CodeCat 21:00, 24 March 2017 (UTC)

@Suzukaze-c, Wyang Should this be put in MOD:yue-pron? — justin(r)leung (t...) | c=› } 03:55, 26 March 2017 (UTC)
Support +1. Wyang (talk) 12:17, 26 March 2017 (UTC)
Are we sure the other sources record the Taicheng dialect?—suzukaze (tc) 08:00, 30 April 2017 (UTC)

Also, I just noticed its only categories are tracking/cleanup categories. It should probably be put in a "real" category. —CodeCat 14:01, 26 March 2017 (UTC)

The module invocation is gone. Now it still needs attention to its categories. —suzukaze (tc) 05:17, 22 June 2017 (UTC)


Correspondences

local corresp = {
	['initial'] = {
		-- Column order for each row below:
		-- ['wiktionary'] = { 'xiaoxuetang + ? (ipa)', 'stephen li', 'gene chin', 'dli' }

		['']   = { '',    '',   '',   ''    },

		['b']  = { 'p',   'b',  'b',  'p'   },
		['p']  = { 'pʰ',  'p',  'p',  'p’'  },
		['m']  = { 'ᵐb',  'm',  'm',  'm'   },
		['f']  = { 'f',   'f',  'f',  'f'   },
		['v']  = { 'v',   'v',  'v',  'w'   },

		['d']  = { 't',   'd',  'd',  't'   },
		['t']  = { 'tʰ',  't',  't',  't’'  },
		['n']  = { 'ⁿd',  'n',  'n',  'n'   },
		['l']  = { 'l',   'l',  'l',  'l'   },
		['lh'] = { 'ɬ',   'ɬ',  'x',  'lh'  },

		['g']  = { 'k',   'g',  'g',  'k'   },
		['k']  = { 'kʰ',  'k',  'k',  'k’'  },
		['ng'] = { 'ᵑɡ',  'ŋ',  'ng', 'ng'  },

		['z']  = { 't͡s',  'dz', 'j',  'ch'  },
		['c']  = { 't͡sʰ', 'ts', 'ch', 'ch’' },
		['y']  = { 'ʒ',   'y',  'y',  'y'   },

		['s']  = { 's',   's',  's',  's'   },
		['h']  = { 'h',   'h',  'h',  'h'   },
	},

	['final'] = {
		-- 符 appears to stand for the position of the tone diacritic (cf. tone_gch and tone_dli below)
		['']     = { '',    '',    '',      ''       },

		['a']    = { 'a',   'a',   'a符',   'a符'    },
		['ai']   = { 'ai',  'ai',  'a符i',  'aai符'  },
		['au']   = { 'au',  'ɔu',  'a符o',  'aau符'  },
		['am']   = { 'am',  'am',  'a符m',  'aa符m'  },
		['an']   = { 'an',  'an',  'a符n',  'aa符n'  },
		['ang']  = { 'aŋ',  'aŋ',  'a符ng', 'aa符ng' },
		['ap']   = { 'ap̚',  'ap',  'a符p',  'aa符p'  },
		['at']   = { 'at̚',  'at',  'a符t',  'aa符t'  },
		['ak']   = { 'ak̚',  'ak',  'a符k',  'aa符k'  },

		['i']    = { 'i',   'i',   'i符',   'i符'    },
		['iu']   = { 'iu',  'iu',  'iu符',  'iu符'   },
		['im']   = { 'im',  'im',  'i符m',  'i符m'   },
		['in']   = { 'in',  'in',  'i符n',  'i符n'   },
		['ip']   = { 'ip̚',  'ip',  'i符p',  'i符p'   },
		['it']   = { 'it̚',  'it',  'i符t',  'i符t'   },

		['ie']   = { 'iɛ',  'ia',  'e符h',  'e符'    },
		['iau']  = { 'iau', 'iau', 'e符l',  'iau符'  },
		['iam']  = { 'iam', 'iam', 'e符m',  'ie符m'  },
		['iang'] = { 'iaŋ', 'iaŋ', 'e符ng', 'ia符ng' },
		['iap']  = { 'iap̚', 'iap', 'e符p',  'ie符p'  },
		['iak']  = { 'iak̚', 'iak', 'e符k',  'ia符k'  },

		['u']    = { 'u',   'u',   'u符',   'oo符'   },
		['ui']   = { 'ui',  'ui',  'ui符',  'ooi符'  },
		['un']   = { 'un',  'un',  'u符n',  'oo符n'  },
		['ut']   = { 'ut̚',  'ut',  'u符t',  'oo符t'  },

		['ei']   = { 'ei',  'i',   'i符',   'i符'    },
		['eu']   = { 'eu',  'əu',  'e符o',  'aau符'  },
		['em']   = { 'em',  'əm',  'e符im', '?'      }, -- ? = missing (skimming a 1000 page pdf is not interesting)
		['en']   = { 'en',  'ein', 'e符in', 'ie符n'  },
		['ep']   = { 'ep̚',  'əp',  'e符ip', '?'      },
		['et']   = { 'et̚',  'ɛt',  'e符ik', '?'      },

		['uung'] = { 'ɵŋ',  'əŋ',  'u符ng', 'u符ng'  },
		['uut']  = { 'ɵt̚',  'ut',  'u符t',  '?'      },
		['uuk']  = { 'ɵk̚',  'ək',  'u符k',  'u符k'   },

		['o']    = { 'ᵘɔ',  'ɔ',   'o符',   'o符'    },
		['oi']   = { 'ᵘɔi', 'ɔi',  'o符i',  'oi符'   },
		['on']   = { 'ᵘɔn', 'ɔn',  'o符n',  'o符n'   },
		['ong']  = { 'ɔŋ',  'ɔŋ',  'o符ng', 'o符ng'  },
		['ot']   = { 'ᵘɔt̚', 'ɔt',  'o符t',  'o符t'   },
		['ok']   = { 'ɔk̚',  'ɔk',  'o符k',  'o符k'   },

		['m']    = { 'm̩',   'm',   'm符',   'm符'    },

--[==[         no correspondences in other sources(?)
		['']     = { '',    '',    '',      'ei'     },
		['']     = { '',    '',    'en',    ''       },
		['']     = { '',    '',    'et',    ''       },
]==]

--[==[         multiple correspondences
		['en']   = { '',    '',    '',      'ie符n'  }, -- cf. dli v1.p19 /kien/king/ "to see" vs stephen li /gein/ and dli /too-ing/ vs stephen li /du ein/; possibly to-do
		['en']   = { '',    '',    '',      'i符ng'  },
		['et']   = { '',    '',    '',      'i符k'   }, -- cf. 熱, 力, 節
		['et']   = { '',    '',    '',      'iet'    }, -- cf. dli v2.p59 /chiet/ vs stephen li /dzet/

		['eu']   = { '',    '',    '',      'aau符'  },
		['au']   = { '',    '',    '',      'aau符'  },

		['i']    = { 'i',   'i',   'i符',   'i符'    },
		['ei']   = { 'ei',  'i',   'i符',   'i符'    }, -- cf. 琵, 奇, 耳, 肥
]==]
	},

	['tone'] = { '33', '55', '22', '21', '32' }, -- stephen li
	['tone_ch'] = { '335', '55', '225', '215', '325' }, -- stephen li

	['tone_gch'] = { '̈', '̄', '̃', '̂', '̀' },
	['tone_dli'] = { '̀', '', '̄', '̣̄', '̂' },

--[==[
	['tone'] = { '33', '55', '22', '21', '31' }, -- xiaoxuetang
	['tone'] = { '33', '55', '11', '21', '32' }, -- wikipedia
	['tone_ch'] = { '35', '55', '15', '215', '325' }, -- wikipedia
	['tone'] = { '33', '55', '11', '21', '31' }, -- gene chin
	['tone_ch'] = { '35', '55', '15', '215', '325' }, -- gene chin
	['tone'] = { '33', '55', '11', '10', '42' }, -- dli
	['tone'] = { '44', '66', '22', '31', '52' }, -- Cheng, Teresa M. (1973), "The Phonology of Taishan"
	['tone_ch'] = { '447', '66', '226~227', '317', '527' }, -- Cheng
]==]
}

Suzukaze-c 05:16, 26 May 2020 (UTC)
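As a rough sketch of how the table above can be read, the following Lua converts a Wiktionary-style syllable into the Stephen Li and Gene Chin columns. Several things here are assumptions rather than anything stated on this page: the names split_syllable, convert, STEPHEN_LI and GENE_CHIN are hypothetical; the input is assumed to be letters plus a tone digit 1-5 that indexes the tone lists; 符 is assumed to be a placeholder for the tone diacritic; and the tone_gch marks are written as UTF-8 byte escapes on the assumption that they are U+0308, U+0304, U+0303, U+0302 and U+0300. Only a few rows are copied to keep it short.

```lua
-- A few rows copied from the table above.
-- Column order: xiaoxuetang (IPA), stephen li, gene chin, dli.
local corresp = {
	initial = {
		['']   = { '',   '',  '',   ''   },
		['p']  = { 'pʰ', 'p', 'p',  'p’' },
		['ng'] = { 'ᵑɡ', 'ŋ', 'ng', 'ng' },
	},
	final = {
		['i']  = { 'i',  'i',  'i符',  'i符'  },
		['in'] = { 'in', 'in', 'i符n', 'i符n' },
	},
	tone = { '33', '55', '22', '21', '32' }, -- stephen li
	-- tone_gch as byte escapes (assumed code points, see the note above)
	tone_gch = { '\204\136', '\204\132', '\204\131', '\204\130', '\204\128' },
}

local STEPHEN_LI, GENE_CHIN = 2, 3 -- hypothetical column constants

-- Split e.g. "ngi2" into initial, final and tone index, trying the longest initial first.
local function split_syllable(syllable)
	local body, digit = syllable:match('^(%a+)(%d)$')
	if not body then return nil end
	local tone = tonumber(digit)
	if not corresp.tone[tone] then return nil end
	for len = 2, 0, -1 do
		local ini, fin = body:sub(1, len), body:sub(len + 1)
		if #ini == len and corresp.initial[ini] and corresp.final[fin] then
			return ini, fin, tone
		end
	end
	return nil
end

-- Returns the syllable in the requested column, or nil if it cannot be converted.
local function convert(syllable, column)
	local ini, fin, tone = split_syllable(syllable)
	if not ini then return nil end
	local out = corresp.initial[ini][column] .. corresp.final[fin][column]
	if column == STEPHEN_LI then
		return out .. corresp.tone[tone]
	elseif column == GENE_CHIN then
		-- put the tone diacritic where the 符 placeholder sits
		return (out:gsub('符', corresp.tone_gch[tone]))
	end
	return nil -- other columns are not handled in this sketch
end

print(convert('ngi2', STEPHEN_LI)) --> ŋi55
print(convert('pi3', STEPHEN_LI))  --> pi22
```

The two printed forms agree with the Stephen Li readings pi22 and ŋi55 quoted in the first thread on this page. With GENE_CHIN the same inputs give ng + i + combining macron and p + i + combining tilde, which is consistent with the Gene Chin forms quoted there.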

Observations on Problems with the Romanization of Sources

I noticed that the Stephen Li source has the vowel "-iat", as in the characters "虱" (siat55), "舌" (siat32), and "蚀" (siat32), which isn't found in its pronunciation guide and isn't yet in this project page's guide either. Looking at Gene Chin's source for these specific characters, "虱" is "sēt", "舌" is "sêt", and "蚀" is "sèt", but Gene Chin's "-et" isn't in the guide either. --Mar vin kaiser (talk) 07:03, 30 December 2020 (UTC)

In the meantime, I'll transcribe these sounds as "-et" in Wiktionary romanization, since that's what makes sense. --Mar vin kaiser (talk) 07:09, 30 December 2020 (UTC)
There is also the "-en" vowel in Gene Chin's source that isn't in the guide. I transcribed it as is in Wiktionary, "-en". --Mar vin kaiser (talk) 07:59, 30 December 2020 (UTC)
There's also the "-eing" vowel in Stephen Li's source that's not in the guide. I transcribe it as "-en" in Wiktionary. --Mar vin kaiser (talk) 10:52, 30 December 2020 (UTC)
There's also the "-ou" vowel in Stephen Li's source, in some words with 仔, which look like typos for "-oi", especially if you listen to the audio. But sometimes they don't sound like "-oi", as in 斧頭仔. On second thought, it does sound a bit like "-oi". --Mar vin kaiser (talk) 01:00, 31 December 2020 (UTC)
There's also "-ian" in Stephen Li's source for 燕, which is "-en" in Gene Chin, so I'll transcribe it as "-en". --Mar vin kaiser (talk) 11:17, 1 January 2021 (UTC)
@Mar vin kaiser: Sorry for the late reply. Stephen Li seems to have some inconsistencies in the romanization in his dictionary since these rimes you mention aren't in his syllable chart. You should confirm using 台山方音字典 and/or Xiaoxuetang.
From Xiaoxuetang (supplemented with 珠江三角洲方言字音對照): 蝨 = set55 (set2), 舌 = set11 (set4, 陽上 changed tone, equivalent to the 21 tone in our IPA), 蝕 = set21 (set5, 陽入, equivalent to the 32 tone in our IPA), 燕 = zen21 (yen4, 陽上, probably a changed tone).
From 台山方音字典: 蝨 = sed- (set2), 舌 = sed> (set4, 陽上, probably changed tone), 蝕 = sed` (set5), 燕 = yen> (yen4, 陽上, probably a changed tone), yen' (yen1).
So, yes, the -iat in Stephen Li (probably an inconsistency meant to be -ɛt, since it's not in his table of syllables) and the -et in Gene Chin are our -et. Other idiosyncrasies in Stephen Li: the -eiŋ rime is likely a remnant of an older version of the romanization and should be -ein (our -en); -ou for words like 仔 is an obvious error; -ian seems to be an inconsistency meant to be -ein. -en in Gene Chin should be our -en. I think this means you're on the right track. — justin(r)leung (t...) | c=› } 02:19, 5 January 2021 (UTC)
@Justinrleung: Thanks for confirming! I just typed it here to have a log of all the inconsistencies I find. Maybe we could incorporate this into the actual project page, as a guide for future editors who will use the same sources. --Mar vin kaiser (talk) 14:05, 5 January 2021 (UTC)
@Mar vin kaiser: I've added Gene Chin's -en and -et because those seem to be consistent with his scheme. I've left Stephen Li's idiosyncrasies out because they seem to be typos and mistakes. — justin(r)leung (t...) | c=› } 17:27, 5 January 2021 (UTC)
@Justinrleung I found another quirk in Stephen Li's source. For the entry 耳後, it shows "ŋɛ55", which I think is "ngie1" in our system. --Mar vin kaiser (talk) 09:57, 30 January 2021 (UTC)
@Mar vin kaiser: I'm actually not sure what it should be. It seems to be irregular, but I haven't found something like it in Gene Chin's or Deng Jun's dictionaries. It could be ngei2, which is one of the pronunciations given in Deng Jun, but I'm not sure if it applies to this word because -ei in Deng Jun is often -i in Stephen Li's dictionary. — justin(r)leung (t...) | c=› } 16:26, 30 January 2021 (UTC)
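For editors working from the same source, the irregular Stephen Li rimes that this thread settled on can be kept as a small lookup. This is only a sketch: the table and function names are hypothetical, it is not part of the project page's code, and it covers just the three rimes with an agreed Wiktionary final. The -ou seen in some words with 仔 and the ŋɛ55 in 耳後 are left out because the thread does not settle on a Wiktionary final for them.

```lua
-- Hypothetical log of irregular Stephen Li rimes and the Wiktionary final
-- agreed on in the discussion above.
local stephen_li_irregular = {
	['iat'] = 'et', -- e.g. 虱 siat55, 舌 siat32
	['eiŋ'] = 'en', -- likely a remnant of an older version of his romanization
	['ian'] = 'en', -- e.g. 燕
}

-- Returns the agreed Wiktionary final, or nil when the rime is not one of the
-- logged irregular forms (the regular correspondence table applies then).
local function wiktionary_final_for(stephen_li_rime)
	return stephen_li_irregular[stephen_li_rime]
end

print(wiktionary_final_for('iat')) --> et
print(wiktionary_final_for('ian')) --> en
```

Returning nil for everything else keeps the lookup from silently passing a Stephen Li spelling through as if it were a Wiktionary final.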

Incorrect phonetic transcription of nasals as prenasalized voiced stops

According to Deng Jun, the stop pronunciation is a free variant of the nasal: http://bbs.cantonese.asia/thread-25178-1-1.html (This is his presentation at The Seventh International Bilingual and Bidialectal Conference 第七届双语双方言研讨会(国际).)

In the presentation he claims he had interviewed the informants for the books that gave the prenasalized stop transcription and the informants found it inaccurate. Anecdotally, Hoisanese speakers on YouTube that I've listened to do not have prenasalized stops either, using a pure nasal. A similar case holds for the neighboring dialect of Hoihenese (開平話) as well. Vampyricon (talk) 16:55, 14 November 2022 (UTC)

Mandarin examples?

The table doesn't say the examples in ( ) in the table are Mandarin. This was confusing since I'm not Chinese and don't know many of these characters, and I expected the examples to be in Taishanese. 93.140.38.69 18:07, 5 June 2023 (UTC)

Thanks for pointing this out. I've removed the Mandarin pronunciations. — justin(r)leung (t...) | c=› } 22:12, 5 June 2023 (UTC)